The Illusion of Safety: Thoughts on 100% Test Coverage
Test coverage below 100% is a useful signal that something is missing, but 100% coverage proves nothing on its own.
"...this is a properly tested project. We even have 85% test coverage!" John, a coder, said proudly to his colleague Mike, whom he had just run into at the coffee machine after not seeing each other for a while. Mike was not convinced and asked: "And what about the remaining 15%?"
John stated that the part of the code in question was not really important, and parts of it were not even used anymore. Anyway, it had been tested manually. And, of course, they keep adding more and more classes to the Sonar exclusion list in order to hide code that "cannot be tested" from SonarQube.
"We have a different understanding of 'properly tested,'" concluded Mike. He pointed out that John's project probably has far lower real coverage than 85%: they are simply excluding a (probably continuously growing) part of the code from measurement. And even if they were not doing that, 85% covered means that 15% might already be broken, and based on the tests, nobody will notice. Also, a team that believes its code really cannot be tested is either inexperienced or ignoring the problem.
The Story of Adam, Bill, and Their Tests
A Simple Task
On a shiny day, Adam got a new task: he had to develop code that reads the already stored invoicing data from the database, generates a nice PDF invoice, and sends it to the front end, where it is presented to the user. A front-end colleague is already working on his part, so Adam only has to implement the following interface:
interface PDFInvoiceProvider {
    public PDF fetchPDFInvoice(Long userSessionId);
}
Soon, Adam committed the following code:
interface InvoiceDB {
    public Invoice getInvoiceById(Long id);
}

class InvoiceDBImpl implements InvoiceDB {...}

interface InvoiceRenderer {
    public PDF renderInvoice(Invoice i);
}

class InvoiceRendererImpl implements InvoiceRenderer {...}

class PDFInvoiceProviderImpl implements PDFInvoiceProvider {
    private InvoiceDB invoiceDB;
    private InvoiceRenderer invoiceRenderer;

    public PDF fetchPDFInvoice(Long invoiceId) {
        final Invoice invoice = invoiceDB.getInvoiceById(invoiceId);
        return invoiceRenderer.renderInvoice(invoice);
    }
}
Adam was proud of his code. The different steps are decoupled, the scope of each class and method is clear, and most importantly, he wrote plenty of unit tests for all three classes he created and actually reached 100% test coverage. Adam was happy to finish it on his last day before a well-deserved vacation.
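To make the coverage claim concrete, Adam's unit test for PDFInvoiceProviderImpl might have looked roughly like this. This is a hypothetical, self-contained sketch: the minimal domain types, the constructor injection, and the hand-rolled stubs are my assumptions, not code from the story.

```java
// Hypothetical sketch: minimal types so the example compiles on its own.
class PDF { final String content; PDF(String content) { this.content = content; } }
class Invoice { final Long id; Invoice(Long id) { this.id = id; } }

interface InvoiceDB { Invoice getInvoiceById(Long id); }
interface InvoiceRenderer { PDF renderInvoice(Invoice i); }

class PDFInvoiceProviderImpl {
    private final InvoiceDB invoiceDB;
    private final InvoiceRenderer invoiceRenderer;

    PDFInvoiceProviderImpl(InvoiceDB db, InvoiceRenderer renderer) {
        this.invoiceDB = db;
        this.invoiceRenderer = renderer;
    }

    public PDF fetchPDFInvoice(Long invoiceId) {
        final Invoice invoice = invoiceDB.getInvoiceById(invoiceId);
        return invoiceRenderer.renderInvoice(invoice);
    }
}

public class PDFInvoiceProviderTest {
    public static void main(String[] args) {
        // Stubs return canned values; every line of fetchPDFInvoice is executed.
        InvoiceDB db = id -> new Invoice(id);
        InvoiceRenderer renderer = invoice -> new PDF("PDF for invoice " + invoice.id);

        PDF result = new PDFInvoiceProviderImpl(db, renderer).fetchPDFInvoice(42L);

        // The assertion passes and coverage is 100% -- but note that nothing here
        // checks WHICH kind of ID (invoice ID vs. session ID) the caller passes in.
        if (!result.content.equals("PDF for invoice 42")) {
            throw new AssertionError("unexpected PDF: " + result.content);
        }
        System.out.println("test passed");
    }
}
```

Notice that the test is perfectly reasonable in isolation; the blind spot it leaves is exactly the one the story turns on.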
A Small Addition
Later on, the project leader assigns Bill the task of extending Adam's code to not only push the created PDF to the front end but also save it in an object store, which the company already uses to store key-value pairs. Bill changed the code as follows:
class PDFInvoiceProviderImpl implements PDFInvoiceProvider {
    private InvoiceDB invoiceDB;
    private InvoiceRenderer invoiceRenderer;
    private ObjectStore objectStore;

    public PDF fetchPDFInvoice(Long invoiceId) {
        final Invoice invoice = invoiceDB.getInvoiceById(invoiceId);
        final PDF invoicePDF = invoiceRenderer.renderInvoice(invoice);
        objectStore.saveItem(invoicePDF, "Invoice" + invoiceId + ".pdf");
        return invoicePDF;
    }
}
Bill felt like he had won the lottery: the actual implementation of the ObjectStore interface had already been written by somebody else, so he just had to use it. He also did not have to write new unit tests; he only had to modify the existing ones to initialize networkStorage, and his new code also reached 100% coverage.
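Bill's test change may have been as small as the following hypothetical sketch: a do-nothing stub makes the new line execute, so the coverage tool is satisfied, yet nothing asserts what saveItem actually received.

```java
// Hypothetical sketch of a "coverage-only" test update: a do-nothing stub.
interface ObjectStore { void saveItem(Object key, Object value); }

public class CoverageOnlyTest {
    public static void main(String[] args) {
        // The stub swallows whatever it is given: the saveItem line is now
        // "covered", but the key/value order (and everything else) goes unchecked.
        ObjectStore objectStore = (key, value) -> { /* no-op */ };

        objectStore.saveItem(new Object(), "Invoice42.pdf"); // executes, asserts nothing

        System.out.println("test passed (but proved nothing)");
    }
}
```

Executing a line and verifying a line are two very different things, and a coverage report cannot tell them apart.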
Perfect Testing
In the test environment, the system was tested manually and worked just as expected. (There, the object store is an in-memory dummy that only keeps the last object pushed into it, and login is mocked to always use a single session with ID 1, which contains a single example invoice.) Successful manual testing and 100% test coverage: everyone had a good feeling about the new feature.
The Dream Shatters
Then it went live. And it behaved quite differently than expected: users received the invoices of random other users.
Developers had to stay late to fix the issue, and pressure was rising... The next day, the legal and PR departments got involved and asked exactly what sensitive data had been leaked. The team had a really hard time answering this seemingly simple question.
Soon, the top management tasked Mike to investigate what went wrong and let the developers learn as much from the events as possible. After a couple of days of reading the code and git history, Mike presented the following points:
- Adam's original code was well unit-tested: every method did what Adam expected of it. Still, Adam had not noticed that the ID the front end passes to his central method is not the ID of the invoice but the ID of the user session. Both IDs are represented as Longs, so the code compiled, and each test focused on a single method. Not a single test checked how the methods of different classes interact with each other.
- Bill's code was executed by Adam's tests, but it was never actually tested: no test asserted what it was doing. Not a single test failed, even though Bill swapped the parameters of the saveItem method, so the PDF was used as the key and the file name as the value in the object store (which is capable of saving Object-Object pairs, too).
- The whole team fell into the false illusion of being safe just because the code coverage of the unit tests was 100%.
Lessons To Keep In Mind
With the story above, I wanted to demonstrate a couple of points that should be obvious. Most developers I know are even aware of them, but way too often, this knowledge remains theory:
- 100% test coverage is not a goal in itself. Your project is not properly tested just because each line of code was executed during the tests. The goal is not to execute each line but to assert that the lines actually do what they are intended to do.
- Less than 100% coverage means you either have dead code (which is never executed) or you have code that is executed without any check that it does what you expect.
- The commonly repeated excuse "it cannot be tested" is true only in a small fraction of cases. In reality, almost everything can be tested. Tests being hard to write is usually a sign of badly structured code. In other words, rejecting this excuse and finding a way to test the code usually leads not only to higher coverage but to better-written code, too, as you may have to improve the code's design in order to make it testable.
- Regardless of how well unit tests are written, they do not protect you from bugs in how your units collaborate. Think of all your tests (not only unit tests) as a parachute: unit tests ensure that every rope and string can bear the weight it must bear. Would you really jump with a parachute whose individual parts were tested but whose assembly nobody ever checked? If yes, feel free not to write tests with a larger scope. If no, please start studying what lies beyond unit testing.
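A test with a larger scope, in the parachute metaphor, checks the assembled whole. The sketch below is a hypothetical component-level test: it wires real in-memory collaborators together (instead of stubbing them out one by one) and asserts on the end result of the whole pipeline; the types and the pipeline itself are illustrative, not the project's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical minimal domain types for the sketch.
class Invoice { final Long id; final String text; Invoice(Long id, String text) { this.id = id; this.text = text; } }
class PDF { final String content; PDF(String c) { content = c; } }

public class ComponentTest {
    public static void main(String[] args) {
        // Real in-memory collaborators wired together; nothing in the
        // lookup-then-render chain is replaced by a canned stub.
        Map<Long, Invoice> db = new HashMap<>();
        db.put(7L, new Invoice(7L, "Invoice #7 for user Alice"));

        // The whole "fetch" pipeline under test: look up the invoice, then render it.
        Function<Long, PDF> fetchPDFInvoice = id -> new PDF("PDF[" + db.get(id).text + "]");

        PDF pdf = fetchPDFInvoice.apply(7L);

        // Assert on the result of the collaboration, not on each rope alone:
        // the rendered PDF must correspond to the invoice that was requested.
        if (!pdf.content.contains("Invoice #7")) {
            throw new AssertionError("wrong invoice rendered: " + pdf.content);
        }
        System.out.println("component test passed");
    }
}
```

A test at this level is precisely the kind that would have had a chance to reveal the invoice-ID/session-ID confusion, because it exercises the handoff between units rather than each unit alone.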
Never-Ending Story
On a side note, I do not want to pass over the famous quote from Edsger Dijkstra:
Program testing can be used to show the presence of bugs, but never to show their absence!
To be practical: you cannot really prove that code is entirely correct by automated testing. Still, you can rule out a lot of bugs by having meaningful tests covering all your code, testing not only each unit individually but your software as a whole.
I am aware of a lot of circumstances (tight deadlines, lack of experience, pressure to deliver business features faster, bad project leaders, unclear business expectations, etc.) that may keep us from applying the points above to real-life projects. Still, failing today is no excuse not to try again tomorrow.
Opinions expressed by DZone contributors are their own.