Tools: Bash, Python, Pytest
Topics: Test Driven Development
From BrowserStack
Test-Driven Development (TDD) is a software development process in which tests are written before the actual code. It ensures the code fulfills defined behaviors through short feedback cycles.
The description above is useful, but there are cases where it is not enough. In particular:
- We have found that writing test cases before any other code is not always practical; it is sometimes handy to have some boilerplate code in place beforehand.
- When pieces of code call each other, it is hard to know how to test each piece in a way that maximizes the team's productivity.
The purpose of this lab is to help you understand how to apply TDD when building real and useful software, by (partially) building some sample applications that many people are already familiar with.
Below you will find some tasks, each of which represents a piece of real software to build. Each one highlights a particular aspect of TDD practice that we believe is useful to keep in mind.
NOTE: the lab is heavily dependent on a mentor, because it is hard to come up with good automatic feedback.
Very few people in the world will need to build a JSON parser from scratch; in fact, a language like Python has one built in already. However, most people with a bare minimum of programming experience have needed to use one, so it is very likely that you are familiar with it. That makes it a useful example, because most people already know, to a good extent, the requirements we will need to satisfy.
Specifically, we will build a command line tool that takes a file as a parameter and:
- Returns `0` if it is a valid JSON file.
- Returns `1` if it is an invalid JSON file.
- Returns `2` if the file doesn't exist.
You are provided with some boilerplate code inside task 1; your task is to build the tool so that it satisfies the above requirements. In the boilerplate, you will notice:
- There is a `_cli` function in `src/main.py`, which handles the interaction with the terminal. It defines the file parameter, calls the `main` function with the file as argument and exits the program with the exit code provided by the `main` function.
- There is already a test case for one of the requirements of the system: returning `2` if the file doesn't exist.
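The boilerplate itself is not reproduced here. As a rough sketch of the layout described above, assuming `argparse` for the command line handling and a placeholder `main` that only covers the missing-file case, `src/main.py` could look something like this (the real boilerplate may differ):

```python
import argparse
import os
import sys


def main(path: str) -> int:
    """Return the exit code for the file at `path`."""
    if not os.path.isfile(path):
        return 2  # the file doesn't exist
    # Placeholder (assumption): the validation logic is what the lab asks
    # you to build test-first, so this sketch rejects every existing file.
    return 1


def _cli() -> None:
    """Parse the command line, call `main`, and exit with its return value."""
    parser = argparse.ArgumentParser(description="Check whether a file is valid JSON.")
    parser.add_argument("file", help="path to the file to check")
    args = parser.parse_args()
    sys.exit(main(args.file))

# How `_cli` gets invoked (a `__main__` guard or a console-script entry
# point) is left out, since the original boilerplate is not shown here.
```

Keeping `main` as a plain function that takes a path and returns an integer is what makes it easy to test without going through the terminal.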
Notice how the `_cli` code was written before any test for it; moreover, we don't plan to write tests for it. This is because:
- There is not much value in doing so: what it does is straightforward, and we decided to trust its functionality completely (which makes sense only for extremely simple code).
- Writing tests that interact with the command line is not trivial in Python (in Rust it is straightforward), and it may make them run slower, which affects productivity.
- The core functionality of our tool will be in the `main` function, and that is what we will be working with.
Your goal then is to add code to `tests/test_main.py` and `src/main.py` so that the tool satisfies all of the requirements. One of the nice qualities of this task is that you can add the functionality incrementally, which is a critical aspect of the TDD approach. Now write the rest of the code on your own, following Build Your Own JSON Parser as a guide to add complexity incrementally. Feel free to add any additional modules you think you need, but you can add all test cases to `test_main.py` using `pytest.mark.parametrize`. Try to implement everything from scratch!
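As an illustration of how `pytest.mark.parametrize` lets one test function grow case by case, here is a minimal sketch. The `main` defined in it is only a stand-in so the example runs on its own: it leans on the built-in `json` module, which is exactly what the lab asks you not to do in your own implementation. In the lab you would import your own `main` instead (the import path depends on the boilerplate), and the sample inputs are illustrative choices rather than a required list.

```python
import json
import os

import pytest


def main(path):
    # Stand-in for the lab's `main` (assumption): same exit codes, but
    # backed by the built-in json module instead of a hand-written parser.
    if not os.path.isfile(path):
        return 2
    try:
        with open(path, encoding="utf-8") as fh:
            json.load(fh)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return 1
    return 0


@pytest.mark.parametrize(
    "content, expected",
    [
        ("{}", 0),                  # smallest valid object
        ('{"key": "value"}', 0),    # one string key/value pair
        ("", 1),                    # empty file is not valid JSON
        ('{"key": "value",}', 1),   # trailing comma is not allowed
    ],
)
def test_main_validates_content(tmp_path, content, expected):
    # tmp_path is pytest's built-in fixture for a per-test temporary directory.
    path = tmp_path / "input.json"
    path.write_text(content, encoding="utf-8")
    assert main(str(path)) == expected


def test_main_missing_file(tmp_path):
    assert main(str(tmp_path / "missing.json")) == 2
```

Each new requirement then becomes one more tuple in the list: add the failing case first, watch it fail, and extend the implementation until it passes.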
In the example above we wrote all test cases against the same function. That highlights an approach that is great for productivity: if we only test the API, we can change the implementation any time we want and we wouldn't need to modify the tests. However, it is sometimes harder to think of the edge cases of our solutions in this case. We will see one such case here.
We will build a compression tool based on a Huffman tree. We will again build a command line tool that takes a file as input, but this time it produces another file as output, together with some exit codes. The process to achieve this is as follows:
1. Read the input file.
2. Count the number of occurrences of each character in the file.
3. Build a Huffman tree based on the character counts.
4. Build a table to map each character to its corresponding code.
5. Build a header that can help you restore the Huffman tree.
6. Serialize the contents of the file in #1.
7. Put the header and the output from #6 into a new file.
We won't do all of it here. You will be given some code and your job is to add missing test cases and implementation. In particular, we will be focusing on step #3.
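To make steps #2 to #4 concrete, here is a minimal sketch of one common way to build a Huffman tree, using a priority queue from the standard `heapq` module. The names `Node`, `build_tree` and `build_codes` are hypothetical and are not taken from the code you will be given; treat this as an illustration of the idea rather than the expected solution.

```python
import heapq
from collections import Counter
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Node:
    weight: int
    char: Optional[str] = None      # set only on leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(counts):
    """Build a Huffman tree from a mapping of character -> occurrences."""
    if not counts:
        return None  # edge case: empty input has no tree
    tiebreak = count()  # unique counter so the heap never compares Node objects
    heap = [
        (weight, next(tiebreak), Node(weight, char))
        for char, weight in sorted(counts.items())
    ]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Repeatedly merge the two lightest subtrees into one.
        w1, _, first = heapq.heappop(heap)
        w2, _, second = heapq.heappop(heap)
        merged = Node(w1 + w2, left=first, right=second)
        heapq.heappush(heap, (merged.weight, next(tiebreak), merged))
    return heap[0][2]


def build_codes(node, prefix="", table=None):
    """Walk the tree and map each character to its bit string."""
    table = {} if table is None else table
    if node is None:
        return table
    if node.char is not None:
        # Edge case: a single-symbol input still needs a one-bit code.
        table[node.char] = prefix or "0"
    else:
        build_codes(node.left, prefix + "0", table)
        build_codes(node.right, prefix + "1", table)
    return table


counts = Counter("aaaabbc")   # step #2
tree = build_tree(counts)     # step #3
codes = build_codes(tree)     # step #4
```

In this sample the most frequent character, `a`, ends up with a one-bit code and the other two with two-bit codes. The empty input and the single-character input are the kind of edge cases that are easy to miss when testing only through the command line API, so they are good candidates for dedicated test cases.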