=======================
Test Driven Development
=======================

What is TDD?
============

Objectives
----------
At the end of this section, you will be able to:

1. Write your code using the TDD paradigm. 
#. Use doctests to test your Python code. 
#. Use unittests to test your Python code. 
#. Use the nose module to test your code. 


Test Driven Development (TDD), as the name suggests, is a style or method
of software development based on the idea of writing the tests for a piece
of code before writing the code itself. Since the tests are written for
code that doesn't exist yet, they are bound to fail at first. The actual
code is then written to pass the tests, and is refactored later.

The basic steps of TDD are roughly as follows:

1. Decide upon the feature to implement and the methodology of testing it.
#. Write the tests for the feature decided upon.
#. Write just enough code so that the tests can be run, but fail.
#. Improve the code just enough to pass the new test, while still passing
   all the previous tests.
#. Run the tests to see that all of them pass.
#. Refactor the code we've just written -- optimize the algorithm, remove
   duplication, add documentation, etc.
#. Run the tests again, to see that all the tests still pass.
#. Go back to 1.

First "Test"
============

Now that we have an overview of TDD, let's get down to writing our first
test. Let us consider a very simple program which returns the Greatest
Common Divisor (GCD) of two numbers. 

Before being able to write our test, we need to have a clear idea of the
code units our program will contain, so that we can test each of them.
Let's first define the code units that our GCD program is going to have. 

Our program is going to contain only one function called ``gcd``, which
will take in two arguments, and return a single value, which is the GCD of
the two arguments. So if we want to find the GCD of 44 and 23, we call the
function with the arguments 44 and 23.

::

    c = gcd(44, 23) 

c will contain the GCD of the two numbers.

We have defined the code units in our program. So, we can go ahead and
write tests for it. 

When adopting the TDD methodology, it is important to have a clear set of
results defined for our code units, i.e., for every set of inputs that we
wish to use as a test case, we must know beforehand the exact output that
is expected. We must not be running around looking for outputs for our
test cases after we have the code ready, or even while we are writing the
tests.

Let one of our test cases be 48 and 64 as ``a`` and ``b``, respectively.
For this test case we know that the expected output, the GCD, is 16. Let
our second test case be 44 and 19 as ``a`` and ``b``, respectively. We know
that their GCD is 1, and that is the expected output. 

But before writing a test, we should ask: what does a test look like? What
does it do? Let's answer these questions first.

Tests are just a series of assertions which are either True or False
depending on whether the actual behaviour of the code matches the expected
behaviour. Tests for our GCD function could be written as follows.

::

  tc1 = gcd(48, 64)
  if tc1 != 16:
      print "Failed for a=48, b=64. Expected 16. Obtained %d instead." % tc1
      exit(1)
  
  tc2 = gcd(44, 19)
  if tc2 != 1:
      print "Failed for a=44, b=19. Expected 1. Obtained %d instead." % tc2
      exit(1)

  print "All tests passed!"

The next step is to run the tests we have written, but if we run them now,
they won't even run. Why? The function ``gcd`` has not been defined yet,
so the tests cannot call it. What is the way out? We shall first write a
very minimal definition (or a stub) of the function ``gcd``, so that it can
be called by the tests.

A minimal definition for the ``gcd`` function would look like this:

::

    def gcd(a, b):
        pass

As you can see, the stub for the ``gcd`` function does nothing other than
define the required function, which takes the two arguments ``a`` and
``b``. Nothing is done with the arguments passed to the function yet. The
function body is empty and just contains the ``pass`` statement.

Let us put all these in a file and call this file ``gcd.py``:

::

  def gcd(a, b):
      pass

  if __name__ == '__main__':
      tc1 = gcd(48, 64)
      if tc1 != 16:
          print "Failed for a=48 and b=64. Expected 16. Obtained %d instead." % tc1
          exit(1)

      tc2 = gcd(44, 19)
      if tc2 != 1:
          print "Failed for a=44 and b=19. Expected 1. Obtained %d instead." % tc2
          exit(1)

      print "All tests passed!"

Recall that the condition ``__name__ == '__main__'`` becomes true only
when the Python script is run directly as a stand-alone script. So any code
within this ``if`` block is executed only when the script is run as a
stand-alone script, and not when it is imported as a module.

Let us run our code as a stand-alone script.

::

  $ python gcd.py
  Traceback (most recent call last):
    File "gcd.py", line 7, in <module>
      print "Failed for a=48 and b=64. Expected 16. Obtained %d instead." % tc1
  TypeError: %d format: a number is required, not NoneType
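
The ``TypeError`` appears because the stub returns ``None``, and the
``%d`` in the failure message cannot format ``None``. A minimal sketch of
one way to keep the message readable, assuming a hypothetical helper named
``check`` (not part of the original script) and using ``%r`` for the
obtained value, could look like this:

::

    def gcd(a, b):
        pass  # the stub: a function without a return statement returns None

    def check(a, b, expected):
        # Hypothetical helper: compare gcd(a, b) against the expected value
        # and build a message; %r formats any object, including None.
        result = gcd(a, b)
        if result != expected:
            return ("Failed for a=%d and b=%d. Expected %d. Obtained %r instead."
                    % (a, b, expected, result))
        return "Passed for a=%d and b=%d." % (a, b)

    print(check(48, 64, 16))

With the stub in place, this reports the failure for the first test case
instead of stopping with a traceback.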

We now have our tests, the code unit stub, and a failing test. The next
step is to write code, so that the test just passes. 

Let's use the algorithm given by a Greek mathematician about 2300 years
ago, the Euclidean algorithm, to compute the GCD of two numbers.

**Note**: If you are unfamiliar with the Euclidean algorithm for computing
the GCD of two numbers, please look it up on Wikipedia. It has a very
detailed explanation of the algorithm and its proof of validity, among
other things.

The ``gcd`` stub function can be modified to the following function, 

::

    def gcd(a, b):
        if a == 0:
            return b
        while b != 0:
            if a > b:
                a = a - b
            else:
                b = b - a
        return a

Now let us run our script which already has the tests written in it and see
what happens

::

    $ python gcd.py
    All tests passed!

Success! We managed to write code that passes all the tests. The code we
wrote was simplistic, though. If you take a closer look at it, you will
soon realize that the chain of subtraction operations can be replaced by a
modulo operation, i.e. taking the remainder of the division of one number
by the other, since repeatedly subtracting ``b`` from ``a`` until ``a`` is
smaller than ``b`` leaves exactly ``a % b``. The modulo operation is far
better, since it combines a lot of subtractions into one operation and the
reduction is much faster. Take a few examples, and convince yourself that
using the modulo operation is indeed faster than the subtraction method.

::

    def gcd1(a, b):
        while b != 0 and a != 0:
            if a > b:
                a = a%b
            else:
                b = b%a
        if a == 0:
            return b
        else:
            return a
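
To compare the two approaches on concrete inputs, the following sketch
counts loop iterations. The names ``subtract_steps`` and ``modulo_steps``
are illustrative and not part of ``gcd.py``; both assume non-negative
integer arguments.

::

    def subtract_steps(a, b):
        """Return (gcd, iterations) using repeated subtraction."""
        steps = 0
        if a == 0:
            return b, steps
        while b != 0:
            if a > b:
                a = a - b
            else:
                b = b - a
            steps += 1
        return a, steps

    def modulo_steps(a, b):
        """Return (gcd, iterations) using the modulo operation."""
        steps = 0
        while b != 0 and a != 0:
            if a > b:
                a = a % b
            else:
                b = b % a
            steps += 1
        return (b if a == 0 else a), steps

    # Print the result and iteration count of both versions side by side.
    for pair in [(48, 64), (44, 19), (1000, 3)]:
        print("%r: subtraction %r, modulo %r"
              % (pair, subtract_steps(*pair), modulo_steps(*pair)))

For inputs such as 1000 and 3, where one number is much larger than the
other, the subtraction version needs hundreds of iterations while the
modulo version needs only a couple.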

But we know that ``a%b`` is always less than ``b``. So, if the condition
``a>b`` is True in this iteration, it will definitely be False in the next
iteration. Also note that in the case when ``a < b``, ``a%b`` is equal to
``a``. So, we could, essentially, get rid of the ``if-else`` block and
combine a swapping operation along with the modulo operation.

::

    def gcd(a, b):
        while b != 0:
            a, b = b, a % b
        return a
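
To see the swap at work, the sketch below records every ``(a, b)`` pair the
loop passes through. ``gcd_trace`` is an illustrative name, not part of
``gcd.py``.

::

    def gcd_trace(a, b):
        """Return the list of (a, b) pairs visited by the loop."""
        pairs = [(a, b)]
        while b != 0:
            a, b = b, a % b
            pairs.append((a, b))
        return pairs

    print(gcd_trace(48, 64))

For 48 and 64, the first iteration only swaps the arguments, since
``48 % 64`` is 48; every later iteration reduces the pair, and the GCD is
the first element of the last pair.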

Let's run our tests again, and check that all the tests pass again.

We could make one final improvement to our ``gcd`` function, which is not
strictly necessary in terms of efficiency, but makes it more readable and
easier to understand. We could make our function a recursive function, as
shown below:

::

  def gcd(a, b):
      if b == 0:
          return a
      return gcd(b, a%b)

Much shorter and sweeter! We again run all the tests, and check that they
are passed. It indeed does pass all the tests! 
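
Beyond the two hand-picked test cases, a refactoring like this can also be
cross-checked against an independent implementation. The sketch below
assumes Python 3.5 or later, where the standard library provides
``math.gcd``; the names ``gcd_iterative`` and ``gcd_recursive`` are used
here only to keep both versions side by side.

::

    import math

    def gcd_iterative(a, b):
        while b != 0:
            a, b = b, a % b
        return a

    def gcd_recursive(a, b):
        if b == 0:
            return a
        return gcd_recursive(b, a % b)

    # Collect every pair of small non-negative integers on which the three
    # implementations disagree.
    mismatches = [(a, b)
                  for a in range(50)
                  for b in range(50)
                  if not (gcd_iterative(a, b)
                          == gcd_recursive(a, b)
                          == math.gcd(a, b))]
    print("Mismatches found: %d" % len(mismatches))

An empty list of mismatches does not prove the function correct, but it
gives more confidence than two test cases alone.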

But there is still one small problem. There is no way for the users of
this function to determine how to use it, how many arguments it takes and
what it returns, among other things. This is a handicap for the people
reading your code, as well. This is a well-written function, with no
documentation whatsoever.

Let's add a docstring as shown:

::

    def gcd(a, b):
        """Returns the Greatest Common Divisor of the two integers
        passed as arguments.
  
        Args:
          a: an integer
          b: another integer
  
        Returns: Greatest Common Divisor of a and b
        """
        if b == 0:
            return a
        return gcd(b, a%b)

Now we have refactored our code enough to make it a well-written piece of
code. We have now successfully completed writing our first test, writing
the relevant code, ensuring that the tests pass and refactoring the code
until we are happy with the way it works and looks.

Let's now build on what we have learnt so far, learn more about writing
tests, and improve and simplify our methodology of writing them.

Persistent Test Cases
---------------------

As already stated, tests should be predetermined and you should have your
test cases and their outputs ready, even before you write the first line of
code. This test data is repeatedly used in the process of writing code. So,
it makes sense to store the test data in some persistent format, like a
database, a text file, a file of a specific format like XML, etc.

Let us modify the test code for the GCD function and save our test data in
a text file. Let us decide upon the following format for the test data. 

  1. The file has multiple lines of test data.
  2. Each line in this file corresponds to a single test case.
  3. Each line consists of three comma separated values --

     i. First two columns are the integers for which the GCD has to
        be computed
     ii. Third column is the expected GCD of the preceding two
         numbers.

Now, how do we modify our tests to use this test data? We read the file
line by line and parse it to obtain the required numbers whose GCD we wish
to calculate, and the expected output. We can now call the ``gcd`` function
with the correct arguments and verify the output with the expected output. 

Let us call our data file ``gcd_testcases.dat``. The test code in
``gcd.py`` then becomes:

::

    if __name__ == '__main__':
        for line in open('gcd_testcases.dat'):
            values = line.split(', ')
            a = int(values[0])
            b = int(values[1])
            g = int(values[2])
  
            tc = gcd(a, b)
            if tc != g:
                print "Failed for a=%d and b=%d. Expected %d. Obtained %d instead." % (a, b, g, tc)
                exit(1)

        print "All tests passed!"
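
As a self-contained sketch of this format, the code below writes the two
test cases used earlier to a temporary file and parses it again. The helper
name ``load_test_cases`` and the use of a temporary file are assumptions
made for illustration; the lecture's script reads ``gcd_testcases.dat``
directly.

::

    import os
    import tempfile

    def gcd(a, b):
        while b != 0:
            a, b = b, a % b
        return a

    def load_test_cases(path):
        """Hypothetical helper: parse 'a, b, expected' lines into tuples."""
        cases = []
        with open(path) as test_file:
            for line in test_file:
                if not line.strip():
                    continue  # ignore blank lines
                # int() tolerates the surrounding spaces and the newline
                a, b, expected = [int(value) for value in line.split(',')]
                cases.append((a, b, expected))
        return cases

    # Write the two test cases from this section to a temporary data file.
    handle, path = tempfile.mkstemp(suffix='.dat')
    with os.fdopen(handle, 'w') as data_file:
        data_file.write("48, 64, 16\n44, 19, 1\n")

    cases = load_test_cases(path)
    os.remove(path)

    failures = [(a, b, g) for (a, b, g) in cases if gcd(a, b) != g]
    if failures:
        print("Failed cases: %r" % (failures,))
    else:
        print("All tests passed!")

Keeping the parsing in one function means the same data file can later be
reused by other test runners without duplicating the parsing code.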

When we execute the ``gcd.py`` script again we will notice that the code
passes all the tests.

Now, we have learnt how to write simple test cases and develop code based
on these test cases. We shall now move on to learn to use some testing
frameworks available for Python, which make some of these tasks easier.

Python Testing Frameworks
=========================

Python provides two ways to test the code we have written. One of them is
the unittest framework and the other is the doctest module. To start
with, let us discuss the doctest module.

doctest
-------

As we have already discussed, a well written piece of code must always be
accompanied by documentation. We document functions or modules using
docstrings. In addition to writing a generic description of the function or
the module, we can also add examples of using these functions in the
interactive interpreter to the docstrings. 

Using the ``doctest`` module, we can pick up all such interactive session
examples, execute them, and determine if the documented piece of code runs,
as it was documented. 

The example below shows how to write ``doctests`` for our ``gcd`` function.

::

    def gcd(a, b):
        """Returns the Greatest Common Divisor of the two integers
        passed as arguments.
  
        Args:
          a: an integer
          b: another integer
  
        Returns: Greatest Common Divisor of a and b
  
        >>> gcd(48, 64)
        16
        >>> gcd(44, 19)
        1
        """
        if b == 0:
            return a
        return gcd(b, a%b)

That is all there is to writing a ``doctest`` in Python. Now how do we use
the ``doctest`` module to execute these tests? That too is fairly
straightforward. All we need to do is tell the doctest module to execute
them, using the ``doctest.testmod`` function.

Let us place this piece of code at the same place where we placed our tests
earlier. So putting all these together we have our ``gcd.py`` module which
looks as follows:

::

    def gcd(a, b):
        """Returns the Greatest Common Divisor of the two integers
        passed as arguments.
  
        Args:
          a: an integer
          b: another integer
  
        Returns: Greatest Common Divisor of a and b
  
        >>> gcd(48, 64)
        16
        >>> gcd(44, 19)
        1
        """
        if b == 0:
            return a
        return gcd(b, a%b)
  
    if __name__ == "__main__":
        import doctest
        doctest.testmod()

The ``testmod`` function automatically picks up all the docstrings that
have sample sessions from the interactive interpreter, executes them and
compares the output with the results as specified in the sample sessions.
If all the outputs match the expected outputs that have been documented, it
doesn't say anything. It complains only if the results don't match the
documented, expected results.

When we execute this script as a stand-alone script we will get back the
prompt with no messages, which means all the tests passed:

::

    $ python gcd.py
    $ 
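
The silence on success can also be confirmed programmatically.
``doctest.testmod`` returns the number of failed and attempted examples;
the sketch below obtains the same counts using ``DocTestFinder`` and
``DocTestRunner`` from the ``doctest`` module directly, so that it does not
depend on which module is running as ``__main__``. The shortened docstring
is only for illustration.

::

    import doctest

    def gcd(a, b):
        """Returns the Greatest Common Divisor of a and b.

        >>> gcd(48, 64)
        16
        >>> gcd(44, 19)
        1
        """
        if b == 0:
            return a
        return gcd(b, a % b)

    # Collect the examples in the docstring and run them quietly.
    runner = doctest.DocTestRunner(verbose=False)
    finder = doctest.DocTestFinder()
    for test in finder.find(gcd, name='gcd', globs={'gcd': gcd}):
        runner.run(test)

    results = runner.summarize(verbose=False)
    print("failed=%d attempted=%d" % (results.failed, results.attempted))

A count of zero failures with two attempted examples corresponds to the
empty output seen above.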

If we want a more detailed report of the tests that were executed, we can
pass ``-v`` as a command line option to the script:

::

  $ python gcd.py -v
  Trying:
      gcd(48, 64)
  Expecting:
      16
  ok
  Trying:
      gcd(44, 19)
  Expecting:
      1
  ok
  1 items had no tests:
      __main__
  1 items passed all tests:
     2 tests in __main__.gcd
  2 tests in 2 items.
  2 passed and 0 failed.
  Test passed.


**Note:** We can use sample sessions as test cases directly as long as
the outputs of the test cases do not contain any blank lines. Where a blank
line is expected in the output, we have to write the exact string
``<BLANKLINE>`` in its place.
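
A minimal sketch of a doctest whose expected output contains an empty line
follows. The function ``banner`` is a made-up example, not part of
``gcd.py``, and the examples are run with ``DocTestFinder`` and
``DocTestRunner`` from the ``doctest`` module.

::

    import doctest

    def banner(title):
        """Return the title, an empty line and a rule of dashes.

        >>> print(banner("GCD"))
        GCD
        <BLANKLINE>
        ---
        """
        return title + "\n\n" + "-" * len(title)

    # Run the docstring example; <BLANKLINE> stands for the empty line.
    runner = doctest.DocTestRunner(verbose=False)
    finder = doctest.DocTestFinder()
    for test in finder.find(banner, name='banner', globs={'banner': banner}):
        runner.run(test)

    results = runner.summarize(verbose=False)
    print("failed=%d attempted=%d" % (results.failed, results.attempted))

Leaving the line truly empty instead would end the expected output right
after ``GCD``, and the example would fail.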

For the sake of illustrating a failing test case, let us assume that we
made a small mistake in our code. Instead of returning ``a`` when
``b == 0``, we typed ``return b``. Now, all the GCDs returned will have the
value 0. The code looks as follows:

::

    def gcd(a, b):
        """Returns the Greatest Common Divisor of the two integers
        passed as arguments.
  
        Args:
          a: an integer
          b: another integer
  
        Returns: Greatest Common Divisor of a and b
  
        >>> gcd(48, 64)
        16
        >>> gcd(44, 19)
        1
        """
        if b == 0:
            return b
        return gcd(b, a%b)

Executing this script without the ``-v`` option gives the following output:

::

  $ python gcd.py
  **********************************************************************
  File "gcd.py", line 11, in __main__.gcd
  Failed example:
      gcd(48, 64)
  Expected:
      16
  Got:
      0
  **********************************************************************
  File "gcd.py", line 13, in __main__.gcd
  Failed example:
      gcd(44, 19)
  Expected:
      1
  Got:
      0
  **********************************************************************
  1 items had failures:
     2 of   2 in __main__.gcd
  ***Test Failed*** 2 failures.

The output clearly complains that there were exactly two test cases
that failed. If we want a more verbose report we can pass the ``-v`` option
to the script.

This is all there is to using the ``doctest`` module in Python.
``doctest`` is extremely useful when we want to test each Python function
or module individually.

For more information about the doctest module refer to the Python library
reference on doctest [0].

unittest framework
------------------

It doesn't take long before ``doctest`` proves insufficient: complicated
tests are hard to write with it, especially when we want to automate our
tests or test more convoluted pieces of code. For such scenarios, Python
provides the ``unittest`` framework.

The ``unittest`` framework provides ways to automate tests efficiently, to
initialize the code and data needed by specific tests, to cleanly shut them
down once the tests have been executed, to aggregate tests into
collections, and to report test results better.

Let us continue testing our gcd function in the Python module named
``gcd.py``. The ``unittest`` framework expects us to subclass the
``TestCase`` class in the ``unittest`` module and place all our test code
as methods of this class. The names of the test methods are expected to
start with ``test``, so that the test runner knows which methods are to be
executed as tests; we follow the common ``test_`` convention. We shall use
the test cases supplied by ``gcd_testcases.dat``.

Lastly, to illustrate the way to test Python code as a module, let's create
a new file called ``test_gcd.py``, following the same convention used to
name the test methods, and place our test code in it.

::
  
    import gcd
    import unittest
  
    class TestGcdFunction(unittest.TestCase):
  
        def setUp(self):
            self.test_file = open('gcd_testcases.dat')
            self.test_cases = []
            for line in self.test_file:
                values = line.split(', ')
                a = int(values[0])
                b = int(values[1])
                g = int(values[2])
  
                self.test_cases.append([a, b, g])
  
        def test_gcd(self):
            for case in self.test_cases:
                a = case[0]
                b = case[1]
                g = case[2]
                self.assertEqual(gcd.gcd(a, b), g)
  
        def tearDown(self):
            self.test_file.close()
            del self.test_cases
  
    if __name__ == '__main__':
        unittest.main()

Please note that although we highly recommend writing docstrings for all
the classes, functions and modules, we have not done so in this example, to
keep it compact. Adding suitable docstrings wherever required is left to
you as an exercise.

It would be a waste to read the test data file inside every test method.
So, in the ``setUp`` method, we read all the test data and store it in a
list called ``test_cases``, which is an attribute of the
``TestGcdFunction`` instance. In the ``tearDown`` method of the class we
close the opened file and delete this attribute to free up the memory.
Note that ``unittest`` calls ``setUp`` before, and ``tearDown`` after,
each test method; since this class has only one test method, the file is
read just once per run.
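
When a test class grows several test methods, loading the data once per
class avoids repeating the work. A minimal sketch, assuming Python 2.7 or
later (where ``unittest`` supports the ``setUpClass`` hook); the class
name, the ``load_count`` counter and the inline test data, which stands in
for ``gcd_testcases.dat``, are illustrative.

::

    import unittest

    def gcd(a, b):
        if b == 0:
            return a
        return gcd(b, a % b)

    class TestGcdWithSharedData(unittest.TestCase):

        load_count = 0  # how many times the test data was loaded

        @classmethod
        def setUpClass(cls):
            # Called once for the whole class, before any test method runs.
            cls.load_count += 1
            cls.test_cases = [(48, 64, 16), (44, 19, 1)]

        def test_gcd(self):
            for a, b, g in self.test_cases:
                self.assertEqual(gcd(a, b), g)

        def test_gcd_with_swapped_arguments(self):
            for a, b, g in self.test_cases:
                self.assertEqual(gcd(b, a), g)

    # Build and run the suite explicitly instead of calling unittest.main().
    suite = unittest.TestLoader().loadTestsFromTestCase(TestGcdWithSharedData)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    print("Data loaded %d time(s)" % TestGcdWithSharedData.load_count)

Both test methods see the same ``test_cases`` list, and the counter shows
that it was built only once.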

Our actual test code sits in the method whose name begins with ``test_``,
as stated earlier: the ``test_gcd`` method. Note that we import the ``gcd``
Python module we have written at the top of this test file, and from this
test method we call the ``gcd`` function within the ``gcd`` module with
each set of ``a`` and ``b`` values in ``test_cases``. We then compare the
result obtained with the expected result stored in the corresponding
``test_cases`` entry, using the ``assertEqual`` method provided by the
``TestCase`` class in the ``unittest`` framework. There are several other
assertion methods supplied by the unittest framework. For more detailed
information about them, refer to the unittest library reference at [1].
This brings us to the end of our discussion of the ``unittest`` framework.
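
As a brief illustration of a few of those other assertion methods, here is
a sketch using ``assertTrue``, ``assertNotEqual`` and ``assertRaises``. The
class name and the particular checks are chosen for illustration only, and
``gcd`` is repeated so that the sketch runs on its own.

::

    import unittest

    def gcd(a, b):
        if b == 0:
            return a
        return gcd(b, a % b)

    class TestGcdAssertions(unittest.TestCase):

        def test_result_divides_both_numbers(self):
            g = gcd(48, 64)
            # assertTrue passes when its argument is true.
            self.assertTrue(48 % g == 0 and 64 % g == 0)

        def test_coprime_numbers(self):
            # assertNotEqual passes when the two values differ.
            self.assertNotEqual(gcd(44, 19), 19)

        def test_non_numeric_argument(self):
            # assertRaises passes when gcd(48, None) raises TypeError.
            self.assertRaises(TypeError, gcd, 48, None)

    # Build and run the suite explicitly instead of calling unittest.main().
    suite = unittest.TestLoader().loadTestsFromTestCase(TestGcdAssertions)
    result = unittest.TextTestRunner(verbosity=0).run(suite)

Choosing the most specific assertion for each check makes the failure
messages reported by the framework more informative.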

nose
====

We have seen the ``unittest`` framework and Python's ``doctest`` module.
One problem, however, remains. It is not easy to organize, choose and run
tests when our code is scattered across multiple files. In real-world
projects, this is often the case.

In such a scenario, we need a tool which can aggregate these tests
automatically and run them. The ``nose`` module does precisely this for
us. It can aggregate ``unittests`` and ``doctests`` into a collection and
run them. It also helps output the test results and aggregate them in
various formats.

Although nose is not part of the standard Python distribution itself, it
can be very easily installed by using the ``easy_install`` command as
follows:

::

  $ easy_install nose

Alternatively, download the nose package from [2], extract the archive and
run the following command from the extracted directory:

::

  $ python setup.py install

Now we have nose up and running, but how do we use it? We run the
``nosetests`` command provided by the ``nose`` module in the top-level
directory of our code.

::

  $ nosetests

That's all! ``nose`` will now automatically pick all the tests in all the
directories and subdirectories in our code base and execute them.
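
Besides ``unittest.TestCase`` subclasses, ``nose`` also collects plain
functions whose names start with ``test`` from modules named in the same
way. A minimal sketch of such a module follows; the file name
``test_gcd_functions.py`` is hypothetical, and ``gcd`` is repeated here
only to keep the sketch self-contained (in the layout used above it would
be imported from ``gcd.py``).

::

    # test_gcd_functions.py (hypothetical file name)

    def gcd(a, b):
        if b == 0:
            return a
        return gcd(b, a % b)

    def test_gcd_of_48_and_64():
        # A plain assert is enough; a failing assert is reported as a
        # test failure.
        assert gcd(48, 64) == 16

    def test_gcd_of_coprime_numbers():
        assert gcd(44, 19) == 1

Running ``nosetests`` in the directory containing this file would pick up
both functions. Doctests are collected as well when the ``--with-doctest``
option is passed to ``nosetests``.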

However, if we want to execute specific tests, we can pass the test file
names or the directories as arguments to the ``nosetests`` command. For a
detailed explanation about this, refer to [3].

Conclusion
==========

Now we have all the trappings we need to write state-of-the-art tests. To
emphasize the same point again, any code which was written before the
tests were written and the test cases were in hand is flawed by design. So
it is recommended to follow the three step approach while writing code for
any project, as below:

  1. Write failing tests with test cases in hand.
  2. Write the code to pass the tests.
  3. Refactor the code for better performance.

This approach is widely known in the software development world as the
"Red-Green-Refactor" approach [4].

[0] - http://docs.python.org/library/doctest.html
[1] - http://docs.python.org/library/unittest.html
[2] - http://pypi.python.org/pypi/nose/
[3] - http://somethingaboutorange.com/mrl/projects/nose/0.11.2/usage.html
[4] - http://en.wikipedia.org/wiki/Test-driven_development

.. 
   Local Variables:
   mode: rst
   indent-tabs-mode: nil
   sentence-end-double-space: nil
   fill-column: 75
   End: