A View of Metrics and Measurement
In the world of software engineering, gathering metrics is an important part of learning about and improving how we work. Metrics can give us all sorts of information: how many bugs were created, which parts of the platform had the most bugs, how long a feature takes to reach completion, and so on.
The important thing to remember is that we need to let the metrics tell us the story: listen to them and act on what they tell us. However, many software development teams fall into the trap of “managing by metrics,” trying to make the metrics look good, usually at the expense of doing what’s best for the overall engineering effort.
Let’s use an injury as our analogy.
The patient (i.e., the software platform) is telling us their symptoms (i.e., the metrics). This often indicates a larger injury (i.e., platform design, architecture fault, etc.), which in turn generates more questions: What caused this injury? How do we properly treat the injury? How do we avoid future occurrences of the injury?
Too often we focus on the symptoms (metrics, such as the number of bugs) and devise strategies to make those numbers go down. Does this treat the injury as a whole?
Addressing the symptoms without treating the larger injury is like putting a band-aid on a hatchet wound. Yes, if you put enough band-aids on the wound, it may stop the bleeding and might even heal it. However, as we all know, band-aids eventually get old and fall off, requiring new band-aids to be applied. Rather than band-aids, shouldn’t we stitch up the wound and fix it for good?
In this scenario, what we are trying to move away from is letting the raw numbers tell us whether we are doing well or poorly. They are symptoms of a larger problem. We treat the larger problem and continue to measure to see whether our treatments are helping, hindering, or keeping us about the same. We use the numbers to advise us on how we’re doing at treating the larger problem, but we do not let the numbers become something we fixate on.
We can take a measurement, such as counting the number of bugs over a certain time period, and we will see fluctuations; a minimal sketch of such a count follows the list below. Why? Multiple reasons, but here are a few:
- New features being introduced are more complex than first realized
- New features being introduced require work in other areas of the platform, which are impacted in turn
- Fixing older bugs often reveals additional work, usually because we’ve been creating multiple patches (band-aids) to simply mask a problem, rather than fixing the root
- Changing requirements lead to rework, and our teams track this rework by writing…bugs.
Understanding just this handful of reasons for a fluctuating bug count helps explain why setting an arbitrary goal like “reduce bugs 10%” does more harm than good. Those four bullet points are root causes and should be addressed, preferably before they occur. Each team will encounter its own root causes and must figure out how best to address them; that is where the solution should lie.
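To make the measurement itself concrete, here is a minimal sketch of such a weekly bug count. The record shape, field names, and sample data are hypothetical stand-ins for whatever your bug tracker exports:

```python
from collections import Counter
from datetime import date

# Hypothetical bug records; a real team would pull these from its tracker.
bugs = [
    {"id": 101, "created": date(2023, 5, 1)},
    {"id": 102, "created": date(2023, 5, 3)},
    {"id": 103, "created": date(2023, 5, 9)},
    {"id": 104, "created": date(2023, 5, 10)},
    {"id": 105, "created": date(2023, 5, 22)},
]

def bugs_per_week(records):
    """Count bugs by the ISO calendar week in which they were created."""
    counts = Counter()
    for bug in records:
        year, week, _ = bug["created"].isocalendar()
        counts[(year, week)] += 1
    return counts

for (year, week), count in sorted(bugs_per_week(bugs).items()):
    print(f"{year}-W{week:02d}: {count} bug(s)")
```

The counting is trivial on purpose: plot these weekly totals and they will wobble for all of the reasons above. The numbers prompt the questions; they don’t answer them.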
What often happens when we attempt to address the raw number of bugs is that the solutions we come up with sit almost exclusively at the end of the development pipeline. We see suggestions such as:
- QA should make a note to check this
- We should add this to automation
- We need more checklists or documentation to help with the testing
- We just need more testing
None of the above are bad per se, but they aren’t a solution to the larger development problem; they are band-aids thrown on at the end of the development pipeline.
As we address the larger problem, shouldn’t bug counts go down? In theory, yes, but that depends on how many larger problems actually exist. Where are the holes in the process that are allowing these bugs through? Are the bugs we’re seeing the same type that we saw six months ago? What was the reason for a particularly buggy feature? A report and its measures can’t answer these questions for you; they merely give you the information to start investigating them.
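One way to start that investigation is to compare where bugs cluster now against where they clustered six months ago. Here is a minimal sketch, assuming each bug record carries a `category` field; the field name and the sample data are again hypothetical:

```python
from collections import Counter
from datetime import date

# Hypothetical records; `category` might come from labels in your tracker.
bugs = [
    {"created": date(2023, 1, 12), "category": "ui"},
    {"created": date(2023, 2, 3),  "category": "data-integrity"},
    {"created": date(2023, 7, 18), "category": "data-integrity"},
    {"created": date(2023, 8, 2),  "category": "data-integrity"},
    {"created": date(2023, 8, 9),  "category": "performance"},
]

def categories_between(records, start, end):
    """Tally bug categories for bugs created within [start, end)."""
    return Counter(
        bug["category"] for bug in records if start <= bug["created"] < end
    )

then = categories_between(bugs, date(2023, 1, 1), date(2023, 7, 1))
now = categories_between(bugs, date(2023, 7, 1), date(2024, 1, 1))
for category in sorted(set(then) | set(now)):
    print(f"{category}: {then[category]} six months ago, {now[category]} now")
```

A shift in the distribution, say data-integrity bugs overtaking UI bugs, tells you where to look next, but the comparison still doesn’t explain why the shift happened.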
How long before we actually start to see improvement? That depends on how many root causes need to be addressed, what steps we are taking to address them, what data collection changes we may need to make, and how consistently we continue to measure to keep our information current. If your software platform is very large and complex, this could take quite a bit of time.
Should we continue to collect metrics? Yes. Absolutely, yes. Collecting metrics is not the problem; it’s how we use and react to them that can be. Software engineering, like health care, often requires an extended period of data collection and evaluation before the underlying problem can be addressed. We should use the metrics to help us manage all aspects of the development process, rather than focusing solely on the metrics themselves.