diff options
Diffstat (limited to 'lecture_notes/test_driven_development/tdd.rst')
-rw-r--r-- | lecture_notes/test_driven_development/tdd.rst | 656 |
1 files changed, 656 insertions, 0 deletions
diff --git a/lecture_notes/test_driven_development/tdd.rst b/lecture_notes/test_driven_development/tdd.rst new file mode 100644 index 0000000..6347c47 --- /dev/null +++ b/lecture_notes/test_driven_development/tdd.rst @@ -0,0 +1,656 @@ +======================= +Test Driven Development +======================= + +What is TDD? +============ + +Objectives +---------- +At the end of this section, you will be able to: + +1. Write your code using the TDD paradigm. +#. Use doctests to test your Python code. +#. Use unittests to test your Python code. +#. Use the nose module to test your code. + + +Test Driven Development (TDD), as the name suggests, is a style or method +of software development based on the idea of writing code, after writing +tests for them. As the tests are written for code, that doesn't even exist +yet, it is bound to fail. Actual code is later written, to pass the test +and later on refactored. + +The basic steps of TDD are roughly as follows - + +1. Decide upon the feature to implement and the methodology of testing it. +#. Write the tests for the feature decided upon. +#. Just write enough code, so that the test can be run, but it fails. +#. Improve the code, to just pass the test and at the same time passing all + previous tests. +#. Run the tests to see, that all of them run successfully. +#. Refactor the code we've just written -- optimize the algorithm, remove + duplication, add documentation, etc. +#. Run the tests again, to see that all the tests still pass. +#. Go back to 1. + +First "Test" +============ + +Now, that we have an overview of TDD, let's get down to writing our first +test. Let us consider a very simple program which returns the Greatest +Common Divisor (GCD) of two numbers. + +Before being able to write our test, we need to have a clear idea of the +code units our program will contain, so that we can test each of them. +Let's first define the code units that our GCD program is going to have. + +Our program is going to contain only one function called ``gcd``, which +will take in two arguments, and return a single value, which is the GCD of +the two arguments. So if we want to find out GCD of 44, 23, we call the +function, with the arguments 44 and 23. + +:: + + c = gcd(44, 23) + +c will contain the GCD of the two numbers. + +We have defined the code units in our program. So, we can go ahead and +write tests for it. + +When adopting the TDD methodology, it is important to have a clear set of +results defined for our code units i.e., for every given set of inputs as +that we wish use as a test case, we must have, before hand, the exact +outputs that are expected for those input test cases. We must not be +running around looking for outputs for our test cases after we have the +code ready, or even while we are writing the tests. + +Let one of our test cases be 48 and 64 as ``a`` and ``b``, respectively. +For this test case we know that the expected output, the GCD, is 16. Let +our second test case be 44 and 19 as ``a`` and ``b``, respectively. We know +that their GCD is 1, and that is the expected output. + +But before writing one, we should What does a test look like? What does it +do? Let's answer this question first. + +Tests are just a series of assertions which are either True or False +depending on the expected behaviour of the code and the actual behaviour. +Tests for out GCD function could be written as follows. + +:: + + tc1 = gcd(48, 64) + if tc1 != 16: + print "Failed for a=48, b=64. Expected 16. Obtained %d instead." % tc1 + exit(1) + + tc2 = gcd(44, 19) + if tc2 != 1: + print "Failed for a=44, b=19. Expected 1. Obtained %d instead." % tc2 + exit(1) + + print "All tests passed!" + +The next step is to run the tests we have written, but if we run the tests +now, the tests wouldn't even run. Why? We don't even have the function +``gcd`` defined, for it to be called by the tests. The test code doesn't +even run! What is the way out? We shall first write a very minimal +definition (or a stub) of the function ``gcd``, so that it can be called by +the tests. + +A minimal definition for the ``gcd`` function would look like this + +:: + + def gcd(a, b): + pass + +As you can see, the stub for ``gcd`` function does nothing, other than +define the required function which takes the two arguments a and b. No +action is done upon the arguments, passed to the function, yet. The +function definition is empty and just contains the ``pass`` statement. + +Let us put all these in a file and call this file ``gcd.py`` + +:: + + def gcd(a, b): + pass + + if __name__ == '__main__': + tc1 = gcd(48, 64) + if tc1 != 16: + print "Failed for a=48 and b=64. Expected 16. Obtained %d instead." % tc1 + exit(1) + + tc2 = gcd(44, 19) + if tc2 != 1: + print "Failed for a=44 and b=19. Expected 1. Obtained %d instead." % tc2 + exit(1) + + print "All tests passed!" + +Recall that, the condition ``__name__ == '__main__'`` becomes true, only +when the python script is ran directly as a stand-alone script. So any code +within this ``if`` block is executed only when the script is run as a +stand-alone script and doesn't run when it is used as a module and +imported. + +Let us run our code as a stand-alone script. + +:: + + $ python gcd.py + Traceback (most recent call last): + File "gcd.py", line 7, in <module> print "Failed for a=48 and b=64. Expected 16. Obtained %d instead." % tc1 + TypeError: %d format: a number is required, not NoneType + +We now have our tests, the code unit stub, and a failing test. The next +step is to write code, so that the test just passes. + +Let's us the algorithm given by a greek mathematician, 2300 years ago, the +Euclidean algorithm, to compute the GCD of two numbers. + +**Note**: If you are unaware of Euclidean algorithm to compute the gcd of +two numbers please refer to it on wikipedia. It has a very detailed +explanation of the algorithm and its proof of validity among other things. + +The ``gcd`` stub function can be modified to the following function, + +:: + + def gcd(a, b): + if a == 0: + return b + while b != 0: + if a > b: + a = a - b + else: + b = b - a + return a + +Now let us run our script which already has the tests written in it and see +what happens + +:: + + $ python gcd.py + All tests passed! + +Success! We managed to write code, that passes all the tests. The code we +wrote was simplistic, actually. If you take a closer look at the code you +will soon realize that the chain of subtraction operations can be replaced +by a modulo operation i.e. taking remainders of the division between the +two numbers since they are equivalent operations. The modulo operation is +far better, since it combines a lot of subtractions into one operation and +the reduction is much faster. Take a few examples, and convince yourself +that using the modulo operation is indeed faster than the subtraction +method. + +:: + + def gcd1(a, b): + while b != 0 and a != 0: + if a > b: + a = a%b + else: + b = b%a + if a == 0: + return b + else: + return a + +But we know that the modulo of ``a%b`` is always less than b. So, if the +condition ``a>b`` is True in this iteration, it will definitely be False in +the next iteration. Also note that , and in the case when ``a < b``, +``a%b`` is equal to ``a``. So, we could, essentially, get rid of the +``if-else`` block and combine a swapping operation along with the modulo +operation. + +:: + + def gcd(a, b): + while b != 0: + a, b = b, a % b + return a + +Let's run our tests again, and check that all the tests are passed, again. + +We could make one final +improvement to our ``gcd`` function, which is strictly not necessary in +terms of efficiency, but is definitely more readable and easy to +understand. We could make our function a recursive function, as shown below + +:: + + def gcd(a, b): + if b == 0: + return a + return gcd(b, a%b) + +Much shorter and sweeter! We again run all the tests, and check that they +are passed. It indeed does pass all the tests! + +But there is still one small problem. There is no way, for the users of +this function, to determine how to use it, how many arguments it takes and +what it returns, among other things. This is a handicap for the people +reading your code, as well. This is well written function, with no +documentation whatsoever. + +Let's add a docstring as shown, + +:: + + def gcd(a, b): + """Returns the Greatest Common Divisor of the two integers + passed as arguments. + + Args: + a: an integer + b: another integer + + Returns: Greatest Common Divisor of a and b + """ + if b == 0: + return a + return gcd(b, a%b) + +Now we have refactored our code enough to make it well written piece +of code. We have now successfully completed writing our first test, writing +the relevant code, ensuring that the tests are passed and refactoring the +code, until we are happy with the way it works and looks. + +Let's now build on what we have learnt so far and learn more about writing +tests and improve our methodology of writing tests and simplify the process +of writing them. + +Persistent Test Cases +--------------------- + +As already stated, tests should be predetermined and you should have your +test cases and their outputs ready, even before you write the first line of +code. This test data is repeatedly used in the process of writing code. So, +it makes sense to have the test data to be stored in some persistent format +like a database, a text file, a file of specific format like XML, etc. + +Let us modify the test code for the GCD function and save our test data in +a text file. Let us decide upon the following format for the test data. + + 1. The file has multiple lines of test data. + 2. Each line in this file corresponds to a single test case. + 3. Each line consists of three comma separated values -- + + i. First two coloumns are the integers for which the GCD has to + be computed + ii. Third coloumn is the expected GCD to the preceding two + numbers. + +Now, how do we modify our tests to use this test data? We read the file +line by line and parse it to obtain the required numbers whose GCD we wish +to calculate, and the expected output. We can now call the ``gcd`` function +with the correct arguments and verify the output with the expected output. + +Let us call our data file ``gcd_testcases.dat`` + +:: + + if __name__ == '__main__': + for line in open('gcd_testcases.dat'): + values = line.split(', ') + a = int(values[0]) + b = int(values[1]) + g = int(values[2]) + + tc = gcd(a, b) + if tc != g: + print "Failed for a=%d and b=%d. Expected %d. Obtained %d instead." % (a, b, g, tc) + exit(1) + + print "All tests passed!" + +When we execute the gcd.py script again we will notice that the code passes +all the tests. + +Now, we have learnt how to write simple test cases and develop code based +on these test cases. We shall now move on to learn to use some testing +frameworks available for Python, which make some of the task easier. + +Python Testing Frameworks +========================= + +Python provides two ways to test the code we have written. One of them is +the unittest framework and the the other is the doctest module. To start +with, let us discuss the doctest module. + +doctest +~~~~~~~ + +As we have already discussed, a well written piece of code must always be +accompanied by documentation. We document functions or modules using +docstrings. In addition to writing a generic description of the function or +the module, we can also add examples of using these functions in the +interactive interpreter to the docstrings. + +Using the ``doctest`` module, we can pick up all such interactive session +examples, execute them, and determine if the documented piece of code runs, +as it was documented. + +The example below shows how to write ``doctests`` for our ``gcd`` function. + +:: + + def gcd(a, b): + """Returns the Greatest Common Divisor of the two integers + passed as arguments. + + Args: + a: an integer + b: another integer + + Returns: Greatest Common Divisor of a and b + + >>> gcd(48, 64) + 16 + >>> gcd(44, 19) + 1 + """ + if b == 0: + return a + return gcd(b, a%b) + +That is all there is, to writing a ``doctest`` in Python. Now how do we use +the ``doctest`` module to execute these tests. That too, is fairly straight +forward. All we need to do is tell the doctest module to execute, using the +``doctest.testmod`` function. + +Let us place this piece of code at the same place where we placed our tests +earlier. So putting all these together we have our ``gcd.py`` module which +looks as follows + +:: + + def gcd(a, b): + """Returns the Greatest Common Divisor of the two integers + passed as arguments. + + Args: + a: an integer + b: another integer + + Returns: Greatest Common Divisor of a and b + + >>> gcd(48, 64) + 16 + >>> gcd(44, 19) + 1 + """ + if b == 0: + return a + return gcd(b, a%b) + + if __name__ == "__main__": + import doctest + doctest.testmod() + +The ``testmod`` function automatically picks up all the docstrings that +have sample sessions from the interactive interpreter, executes them and +compares the output with the results as specified in the sample sessions. +If all the outputs match the expected outputs that have been documented, it +doesn't say anything. It complains only if the results don't match as +documented, expected results. + +When we execute this script as a stand-alone script we will get back the +prompt with no messages which means all the tests passed + +:: + + $ python gcd.py + $ + +If we further want to get a more detailed report of the tests that were +executed we can run python with -v as the command line option to the script + +:: + + $ python gcd.py -v + Trying: + gcd(48, 64) + Expecting: + 16 + ok + Trying: + gcd(44, 19) + Expecting: + 1 + ok + 1 items had no tests: + __main__ + 1 items passed all tests: + 2 tests in __main__.gcd + 2 tests in 2 items. + 2 passed and 0 failed. + Test passed. + + +**Note:** We can have the sample sessions as test cases as long as the +outputs of the test cases do not contain any blank lines. In such cases we +may have to use the exact string ``<BLANKLINE>`` + +For the sake of illustrating a failing test case, let us assume that we +made a small mistake in our code. Instead of returning ``a`` when b = 0, we +typed it as ``return b`` when b = 0. Now, all the GCDs returned will have +the value 0. The code looks as follows + +:: + + def gcd(a, b): + """Returns the Greatest Common Divisor of the two integers + passed as arguments. + + Args: + a: an integer + b: another integer + + Returns: Greatest Common Divisor of a and b + + >>> gcd(48, 64) + 16 + >>> gcd(44, 19) + 1 + """ + if b == 0: + return b + return gcd(b, a%b) + +Executing this code snippet without -v option to the script + +:: + + $ python gcd.py + ********************************************************************** + File "gcd.py", line 11, in __main__.gcd + Failed example: + gcd(48, 64) + Expected: + 16 + Got: + 0 + ********************************************************************** + File "gcd.py", line 13, in __main__.gcd + Failed example: + gcd(44, 19) + Expected: + 1 + Got: + 0 + ********************************************************************** + 1 items had failures: + 2 of 2 in __main__.gcd + ***Test Failed*** 2 failures. + +The output clearly complains that there were exactly two test cases +that failed. If we want a more verbose report we can pass -v option to +the script. + +This is all there is, to using the ``doctest`` module in Python. +``doctest`` is extremely useful when we want to test each Python function +or module individually. + +For more information about the doctest module refer to the Python library +reference on doctest[0]. + +unittest framework +~~~~~~~~~~~~~~~~~~ + +We needn't go too far ahead, to start complaining that ``doctest`` is not +sufficient to write complicated tests especially when we want to automate +our tests, write tests that need to test for more convoluted code pieces. +For such scenarios, Python provides a ``unittest`` framework. + +The ``unittest`` framework provides methods to efficiently automate tests, +easily initialize code and data for executing the specific tests, cleanly +shut them down once the tests are executed and easy ways of aggregating +tests into collections and better way of reporting the tests. + +Let us continue testing our gcd function in the Python module named +``gcd.py``. The ``unittest`` framework expects us to subclass the +``TestCase`` class in ``unittest`` module and place all our test code as +methods of this class. The name of the test methods are expected to be +started with ``test_``, so that the test runner knows which methods are to +be executed as tests. We shall use the test cases supplied by +``gcd_testcases.dat``. + +Lastly, to illustrate the way to test Python code as a module, let's create +a new file called ``test_gcd.py`` following the same convention used to +name the test methods, and place our test code in it. + +:: + + import gcd + import unittest + + class TestGcdFunction(unittest.TestCase): + + def setUp(self): + self.test_file = open('gcd_testcases.dat') + self.test_cases = [] + for line in self.test_file: + values = line.split(', ') + a = int(values[0]) + b = int(values[1]) + g = int(values[2]) + + self.test_cases.append([a, b, g]) + + def test_gcd(self): + for case in self.test_cases: + a = case[0] + b = case[1] + g = case[2] + self.assertEqual(gcd.gcd(a, b), g) + + def tearDown(self): + self.test_file.close() + del self.test_cases + + if __name__ == '__main__': + unittest.main() + +Please note that although we highly recommend writing docstrings for all +the classes, functions and modules, we have not done so, in this example to +keep it compact. Adding suitable docstrings wherever required is left to +you, as an exercise. + +It would be a waste, to read the test data file, each time we run a test +method. So, in the setUp method, we read all the test data and store it +into a list called test_cases, which is an attribute of the TestGCDFunction +class. In the tearDown method of the class we will delete this attribute to +free up the memory and close the opened file. + +Our actual test code sits in the method which begins with the name +``test_``, as stated earlier, the ``test_gcd`` method. Note that, we import +the ``gcd`` Python module we have written at the top of this test file and +from this test method we call the ``gcd`` function within the ``gcd`` +module to be tested with the each set of ``a`` and ``b`` values +``test_cases``. Once we execute the ``test_gcd`` function, we obtain the +result and compare it with the expected result as stored in the +corresponding ``test_cases`` attribute using the ``assertEqual`` method +provided by ``TestCase`` class in the ``unittest`` framework. There are +several other assertion methods supplied by the unittest framework. For a +more detailed information about this, refer to the unittest library +reference at [1]. This brings us to the end of our discussion of the +``unittest`` framework. + +nose +==== + +We have seen the ``unittest`` frame work and Python ``doctest`` module. . +One problem however remains. It is not easy to organize, choose and run +tests, when our code is scattered across multiple files. In the real world +scenario, this is often the case. + +In such a such a scenario, a tool which can aggregate these tests +automatically, and run them. The ``nose`` module, does precisely this, for +us. It can aggregate ``unittests`` and ``doctests`` into a collection and +run them. It also helps output the test-results and aggregate them in +various formats. + +Although nose is not part of the standard Python distribution itself, it +can be very easily installed by using easy_install command as follows + +:: + + $ easy_install nose + +Or download the nose package from [2], extracting the archive and running +the command from the extracted directory + +:: + + $ python setup.py install + +Now we have nose up and running, but how do we use it? We shall use the +``nosetests`` command provided by the ``nose`` module, in the top-level +directory of our code. + +:: + + $ nosetests + +That's all! ``nose`` will now automatically pick all the tests in all the +directories and subdirectories in our code base and execute them. + +However if we want to execute specific tests we can pass the test file +names or the directories as arguments to nosetests command. For a detailed +explanation about this, refer to [3] + +Conclusion +========== + +Now we have all the trappings we want to write state-of-the art tests. To +emphasize the same point again, any code which was written before writing +the test and the testcases in hand is flawed by design. So it is +recommended to follow the three step approach while writing code for any +project as below: + + 1. Write failing tests with testcases in hand. + 2. Write the code to pass the tests. + 3. Refactor the code for better performance. + +This approach is very famously known to the software development world as +"Red-Green-Refactor" approach[4]. + +[0] - http://docs.python.org/library/doctest.html +[1] - http://docs.python.org/library/unittest.html +[2] - http://pypi.python.org/pypi/nose/ +[3] - http://somethingaboutorange.com/mrl/projects/nose/0.11.2/usage.html +[4] - http://en.wikipedia.org/wiki/Test-driven_development + +.. + Local Variables: + mode: rst + indent-tabs-mode: nil + sentence-end-double-space: nil + fill-column: 75 + End: |