Doctor, what’s wrong with my software?

Delivering products with the right quality, on time, can be difficult. Being able to take good testing decisions in time is crucial to achieve this. Are there ways to improve your test decision capabilities? Yes, there are; people are using them and they have proven to be valuable. So let’s take a look at some possibilities to enhance the way you manage your testing, and the quality of your products.

When decisions have to be made, sometimes a conflict exists between the time when the decision has to be made, and the information that is available at that time. It can take too long before all the information is gathered. You can miss opportunities and decision options can expire. So the question arises: Can you take decisions when you only have partial information?

How do doctors decide

Let’s look at how doctors take decisions based upon examinations of patients, and see if we could use a similar approach for decisions on how much and what kind of testing is needed. Watching House on TV, I was struck by the way that doctors decide. They follow a kind of iterative method: diagnose the patient, do a test or give them a specific treatment and see how it works, and if that does not work, try another. It sounds a bit like a continuous Plan-Do-Check-Act cycle or an agile inspect-and-adapt approach, so this could probably also work for testing decisions.

The order of the tests and treatments is usually based on the chance that they will address the cause of the disease or provide information that helps doctors to cure the patient. Every combination of test and response gives them information that helps them to exclude diseases, to come closer to answering what is wrong with the patient, and to treat them in the best way they can.

Why do they work like this? Firstly, because running all the tests on day one is simply not possible: it would take too much time, be too costly, and the patient wouldn’t like so many examinations (it could even kill him or her along the way). But another reason is simply that they can work like this. Examination methods and medicines have been researched, developed and tested, and the evidence has been documented for use in day-to-day work by professional, skilled doctors and nurses. Based upon clinical tests, they know the chance that a test has of determining the cause of the illness, and the chance that a certain treatment will cure it.

Taking testing decisions

So how does the way doctors decide compare to decision taking when managing testing? For instance, we could test some features, and depending on the outcome decide if further testing is needed. But are we able to work this way? It would require that we know how well a set of tests covers the functionality of the product, to be able to decide if further testing would be useful. Would further testing help us by finding defects, or would the tests tell us that our product has sufficient quality and can be shipped? Only in those cases would we normally decide to continue testing.

But often products are very complex and the number of possible test cases is way too big to be able to know the quality of a product with only a limited set of test cases. Risk based testing methods can help to limit the amount of testing, resulting in a risk that is getting smaller after each test is done; but it remains very difficult to know what the risks of failure are, and how big those risks are. So the information that we can get from testing is often limited, insufficiently quantified and has a large uncertainty; this information doesn’t really help us to take decisions on what to test and when to stop testing.
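As a rough illustration of the risk based idea, the sketch below ranks areas of a product by exposure (likelihood of failure times impact) and shows how the remaining exposure shrinks as areas get tested. The area names, the numbers, and the assumption that testing an area removes a fixed fraction of its risk are all hypothetical; as argued above, getting trustworthy values for them is the hard part.

```python
from dataclasses import dataclass


@dataclass
class RiskArea:
    name: str
    failure_likelihood: float  # guessed probability of a failure, 0..1 (assumption)
    impact: float              # guessed cost of a failure, arbitrary units (assumption)

    @property
    def exposure(self) -> float:
        # Classic risk exposure: likelihood multiplied by impact.
        return self.failure_likelihood * self.impact


def prioritize(areas):
    """Return the areas ordered from highest to lowest exposure."""
    return sorted(areas, key=lambda area: area.exposure, reverse=True)


def remaining_exposure(areas, tested_count, risk_reduction=0.8):
    """Total exposure left after testing the `tested_count` riskiest areas.

    `risk_reduction` is a made-up fraction of an area's exposure that
    testing is assumed to remove.
    """
    total = 0.0
    for index, area in enumerate(prioritize(areas)):
        factor = (1.0 - risk_reduction) if index < tested_count else 1.0
        total += area.exposure * factor
    return total


# Hypothetical product areas with guessed likelihood and impact.
areas = [
    RiskArea("payment", 0.3, 100.0),
    RiskArea("reporting", 0.5, 10.0),
    RiskArea("login", 0.1, 80.0),
]

for tested in range(len(areas) + 1):
    print(tested, "areas tested, remaining exposure:",
          round(remaining_exposure(areas, tested), 1))
```

The ranking is only as good as the guesses that go into it, which is exactly why this kind of information often does not settle the question of when to stop testing.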

In my opinion, there is also a fundamental difference between curing patients and developing and testing software. The first one can be considered production while the second one is invention. Given that statistically patients are more or less the same, methods have been developed for treatments that can be done by trained professionals and will give a defined result with a specified certainty.

Software products don’t have this kind of similarity, and methods (whether or not used by trained professionals) provide no guarantee that the product will have a certain level of quality. That also explains why the information that software developers and testers receive does not give them the same level of confidence that doctors have. By the way, in construction there are physical laws that, just as in medicine, have been researched and tested. So construction engineers are able to design a system and to know, based upon the available methods and evidence, the chance that a building will survive an earthquake or a flood.

Does this mean that we can’t properly take testing decisions? Unfortunately, most of the time I think that’s the case. Although we do have insights into how testing can drive the quality of products, it’s difficult to make hard decisions on what (not) to test and when to stop testing. We need to find a way to quantify the quality.

Measuring quality

Testing normally results in defects that are found. The question arises whether information on the number of defects that have been found can help you to know the quality of a product, and to take decisions on further testing.

To support software release decisions, so-called software reliability growth models have been developed. These models apply mathematical formulas or curve fitting to predict how many defects a product contains, based upon testing results. Although some of these models have been tested and calibrated with larger data sets, their accuracy is often very low until late in the project. At that stage, the information doesn’t really help you, since the only possible decision that you can take (which is to postpone the release and do additional testing and problem solving) is not really satisfying. So my opinion is that software reliability growth models do not really help you to make testing and/or release decisions.
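To make the curve fitting idea concrete, here is a minimal sketch using the Goel-Okumoto model, one well-known reliability growth model (chosen here as an example, not one named in this post). It assumes the cumulative number of defects found after t weeks follows a * (1 - exp(-b * t)), where a is the predicted total number of defects. The weekly defect counts are invented for illustration, and the fit is a simple grid search rather than a production-grade estimator.

```python
import math


def fit_goel_okumoto(weeks, cumulative_defects):
    """Least-squares fit of m(t) = a * (1 - exp(-b * t)).

    Tries b on a grid from 0.001 to 2.0; for each b the best a has a
    closed form. Returns (a, b), where a is the predicted total defects.
    """
    best = None
    for step in range(1, 2001):
        b = step / 1000.0
        shape = [1.0 - math.exp(-b * t) for t in weeks]
        a = (sum(y * f for y, f in zip(cumulative_defects, shape))
             / sum(f * f for f in shape))
        error = sum((y - a * f) ** 2
                    for y, f in zip(cumulative_defects, shape))
        if best is None or error < best[0]:
            best = (error, a, b)
    _, a, b = best
    return a, b


# Hypothetical cumulative defect counts after each week of testing.
weeks = [1, 2, 3, 4, 5, 6, 7, 8]
found = [12, 22, 35, 44, 50, 57, 61, 64]

# Refit with only the early weeks, then with more data.
for n in (4, 6, 8):
    a, b = fit_goel_okumoto(weeks[:n], found[:n])
    print(f"after week {n}: predicted total defects = {a:.0f} (b = {b:.3f})")
```

Comparing the prediction made from the first few weeks with the one made from all weeks gives a feel for how much such a model depends on data that only becomes available late.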

I have used a defect estimation model called the Project Defect Model to measure and control the quality of products. This model helps to plan and track quality, by estimating how many defects will be made and where they will be found, and by continuously tracking and steering upon defect information from reviews and testing. It has proven to be a valuable instrument to define and agree on the required product quality, and to steer projects and teams in delivering the right quality, on time. You can also use the Project Defect Model to manage product quality in agile teams.
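The general idea of estimating and tracking defects can be sketched in a few lines. Note that this is a simplified, hypothetical illustration and not the actual Project Defect Model: it assumes a single estimate of defects introduced, a sequence of detection steps that each find a fixed fraction of the defects still present, and a made-up tolerance for flagging deviations. All names and numbers are assumptions.

```python
def plan_defect_flow(estimated_introduced, detection_steps):
    """Estimate how many defects each detection step will find.

    detection_steps: list of (step name, fraction of remaining defects
    that the step is expected to find). Returns (plan, escaped), where
    plan is a list of (step name, expected found) and escaped is the
    number of defects expected to remain after the last step.
    """
    remaining = float(estimated_introduced)
    plan = []
    for step_name, effectiveness in detection_steps:
        expected_found = remaining * effectiveness
        remaining -= expected_found
        plan.append((step_name, expected_found))
    return plan, remaining


def deviations(plan, actuals, tolerance=0.25):
    """List steps whose actual count differs from the plan by more than
    `tolerance` (as a fraction of the expected count)."""
    signals = []
    for step_name, expected in plan:
        if step_name not in actuals or expected == 0:
            continue  # no data yet, or nothing was expected here
        actual = actuals[step_name]
        if abs(actual - expected) / expected > tolerance:
            signals.append((step_name, expected, actual))
    return signals


# Hypothetical estimates for one project.
plan, escaped = plan_defect_flow(
    200,
    [("review", 0.5), ("unit test", 0.4), ("system test", 0.5)],
)
print("planned:", plan, "expected to escape:", escaped)

# Hypothetical actuals so far: the review found far fewer defects than planned.
print("needs attention:", deviations(plan, {"review": 60, "unit test": 42}))
```

A deviation is a trigger to look into what happened (fewer defects made, or a less effective review?), not an answer in itself.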

There are other mathematical techniques, based upon Bayesian Belief Networks, which provide more value; remind me to write a posting on that in the future. The use of defect models is still very limited; the reason I hear most is that they are too difficult to deploy and cost too much time and money to implement. I don’t agree with this. Of course they require an initial investment in training and coaching, but there is sufficient evidence that it is economical to invest in quality. Looking at the costs of poor quality products, and of project delays due to late discovery of defects, the investment in a defect model can be earned back within the first project.

Agile iterations

Can information from testing a system help you to take better test decisions? Not if you are testing a complete system that has been developed using a waterfall scenario, but if you are developing a product iteratively (e.g. using Agile methods, designing and testing iteratively), then it can certainly help.

If you define a user story with specific functionality, and an acceptance test to validate that functionality, then running the test will provide you with evidence that the user story is working (“done” as they call it in Agile). If you use Test Driven Development, and write your test case before writing the software, then you know by executing the test that the functionality is working. During the iteration this will provide the team with information on which functions are implemented correctly, and after acceptance in the demo meeting you know that you have delivered working software.
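A small sketch of what this can look like, using a hypothetical user story: "orders of 100.00 or more get a 10% discount". The story, the function name and the use of Python's unittest module are all assumptions for illustration. The test is written first and the implementation follows it.

```python
import unittest


class DiscountStoryTest(unittest.TestCase):
    """Acceptance test for the (hypothetical) discount user story.

    Amounts are in cents to avoid rounding issues.
    """

    def test_order_below_threshold_pays_full_price(self):
        self.assertEqual(order_total_cents(9999), 9999)

    def test_order_at_threshold_gets_ten_percent_off(self):
        self.assertEqual(order_total_cents(10000), 9000)

    def test_large_order_gets_ten_percent_off(self):
        self.assertEqual(order_total_cents(20000), 18000)


# Written after the test, with just enough code to make it pass.
def order_total_cents(amount_cents):
    if amount_cents >= 10000:
        return amount_cents * 90 // 100
    return amount_cents


# Run the story's tests and keep the result for reporting.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DiscountStoryTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("story done:", result.wasSuccessful())
```

A passing run is evidence that this one story works; it says nothing yet about the stories that still have to be built.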

Data gathered from iterations on the number of defects can help you to plan and track quality, and to improve the way you develop your software, e.g. with Root Cause Analysis. But still, the “method” doesn’t guarantee quality; you will have to design every specific test case, which is still a kind of inventing. And the information that one function or user story is working still doesn’t give you any information about the quality of the software that you will develop next.


My conclusion is that, when looking at testing, taking decisions with limited information is certainly possible. A best practice is to develop and test software in small iterations, estimate and track the number of defects for each iteration, and use Test Driven Design and Risk Based Testing approaches to maximize testing results. A measurement tool like the Project Defect Model can help you to know the current and expected quality of your product, and to decide when to take action. This way of working provides your customers with working software, and gives you the possibility to release products whenever sufficient functionality with the right quality is ready.

(This post was published on Dec 4 2010 and fully revised and updated on Jul 9 2014.)

Ben Linders

I help organizations with effective software development and management practices. Active member of several networks on Agile, Lean and Quality, and a frequent speaker and writer.
