There’s probably no other topic in software engineering that garners as much discussion as metrics. Which ones to measure? Why? How? I find the topic fascinating, and over the years I have homed in on a structured approach to measuring metrics that actually matter.
The structure I present here covers the metrics I prioritize. Each metric has two qualitative traits: frequency and audience. Frequency is the cadence at which the metric is collected and assessed; it could be hourly, daily, weekly, or sporadic. Audience is the consumer of the metric. There are three main consumers of the metrics I propose here. The first is the engineering team and its leadership. The second is business partners: marketing, finance, sales, the CEO, and so on. The third is the board of directors. For each metric, I will identify its frequency and its consumers. A metric can have more than one consumer.
My favorites :)
Customer escalations: This is my favorite metric because it addresses two concerns: quality/testing and customer satisfaction. A customer-reported bug is almost always a signal of an (internal) testing gap. Tracking the rate of those escalations, or more generally of customer-reported bugs, alongside attributes like product area and severity over time lets you plug holes in your tests. In an ideal world, customers never report bugs because every bug is caught before you release. That’s impractical, but one can try, and this data helps with the effort.
Frequency: Weekly/Monthly. Audience: Engineering Team & Business
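As an illustration, the tracking described above can be sketched as a small aggregation. The record shape (date, product area, severity) and the sample data are assumptions for the example, not a prescribed schema:

```python
from collections import Counter
from datetime import date

# Hypothetical customer-escalation records: (reported_on, product_area, severity).
escalations = [
    (date(2024, 3, 4), "billing", "high"),
    (date(2024, 3, 6), "billing", "low"),
    (date(2024, 3, 12), "auth", "high"),
    (date(2024, 3, 14), "billing", "high"),
]

# Rate over time (per ISO week) and counts per product area, to show
# where the internal testing gaps are concentrated.
per_week = Counter(d.isocalendar()[:2] for d, _, _ in escalations)
per_area = Counter(area for _, area, _ in escalations)

print(dict(per_week))           # {(year, iso_week): count, ...}
print(per_area.most_common(1))  # product area with the most escalations
```

A rising per-week count, or one area dominating the tally, is the signal to go add tests there.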
Frequency of releases: My second favorite metric, as it is simple to measure and very actionable. At its heart, this metric measures how often you release new code/features to production/customers, and your ability to maintain or increase that frequency. I’m a big fan of the mantra to release early and often. This metric might not be applicable to on-premises software or to companies that release infrequently, on the order of a few times a year.
Frequency: Daily. Audience: Engineering Team
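A minimal sketch of the measurement, assuming the release dates have already been pulled from somewhere like git tags or a deploy log (the dates below are made up):

```python
from datetime import date

# Hypothetical production release dates (e.g. extracted from git tags).
releases = [date(2024, 1, 3), date(2024, 1, 10), date(2024, 1, 12),
            date(2024, 1, 24), date(2024, 2, 7)]

# The gaps between consecutive releases, and their average, show whether
# your cadence is holding steady or slipping over time.
gaps = [(b - a).days for a, b in zip(releases, releases[1:])]
avg_gap = sum(gaps) / len(gaps)
print(f"average days between releases: {avg_gap:.2f}")
```

Watching the trend of the average gap, rather than any single value, is what makes this actionable.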
Test pass/fail ratio: A fairly obvious one. Testing should be a crucial part of your CI/CD pipeline. Before you push any code out to the world, you ought to make sure you have adequate coverage of said code. This metric helps you evaluate the health of your builds and of changes to your code base. It is also very actionable: a test failure should result in an immediate fix by a member of the engineering team. If you haven’t already, you should also track your test code coverage. You want high coverage and green builds.
Frequency: Continuous/Daily. Audience: Engineering Team
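For illustration, the ratio itself is a one-line aggregation over CI results. The (test_name, passed) shape and the sample outcomes are assumptions for the sketch:

```python
# Hypothetical CI results: (test_name, passed).
results = [("test_login", True), ("test_billing", True),
           ("test_export", False), ("test_auth", True)]

# Pass rate for the build, plus the failing tests that need an immediate fix.
passed = sum(1 for _, ok in results if ok)
pass_rate = 100 * passed / len(results)
failures = [name for name, ok in results if not ok]
print(f"pass rate: {pass_rate:.0f}%, failures: {failures}")
```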
Uptime/Downtime: You should measure this one continuously. That means measuring your downtime in response to operational incidents, e.g. a service outage. Additionally, every incident must be followed by a no-blame retrospective to foster a culture of learning and to improve operational resiliency.
Frequency: Sporadic. When an incident occurs. Audience: Engineering Team & Business
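Incident downtime rolls up into an availability percentage for a reporting window. A minimal sketch, with made-up incident durations:

```python
# Availability over a 30-day window, given per-incident downtime.
# The incident durations below are illustrative values.
window_hours = 30 * 24
incident_downtime_hours = [0.5, 1.25]

availability = 100 * (1 - sum(incident_downtime_hours) / window_hours)
print(f"availability: {availability:.3f}%")  # "three nines" allows < 43.2 min/month
```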
Operational costs: This metric is most applicable to SaaS companies. It covers the cost of running your SaaS: the infrastructure costs. Those have a direct impact on your company’s profitability and margins. Other metrics to track here depend on your finance team, but could include capitalization of R&D costs and the variance of your actual spend relative to your budget.
Frequency: Monthly-Quarterly. Audience: Business and Board.
Not exactly metrics, but you should know them
Roadmap planned vs actual: Your teams are busy working on new features. You need to surface how they are tracking relative to roadmap plans and share that with your peers and the business. Your marketing team might be making assumptions about the availability of a feature to coincide with a big marketing event. Similarly, your sales team might be counting on the availability of a new SKU or product at the start of their quarter. You should share this information with additional context, and you should especially do so if some work is not trending well relative to plans.
Frequency: Monthly-Quarterly. Audience: Engineering Team, Business and Board. The Board doesn’t need the most intimate details; a quarterly summary will suffice.
Productivity surveys: You should run regular surveys, monthly or quarterly, to track the health and pulse of your organization. Your People team might take the lead on some of these, e.g. engagement. There is, however, a survey you should own that is idiosyncratic to the engineering team. Its scope is to surface obstacles the team is facing. In my experience these will often be long build times, local dev environment issues (laptops), flaky tests, obtuse code that needs refactoring or is a source of customer problems, lengthy code reviews, and so on.
If you have a Platform Team, I suggest that they own this survey, surface these areas of friction, and then work on them.
Frequency: Monthly-Quarterly. Audience: Engineering Team
Stoppages: At an abstract level, software development teams resemble a manufacturing production line. A new feature is spec’d, the engineers and product managers build it, move on to the next, and so on. This is obviously an oversimplification, but it helps surface a common source of stoppage I have encountered. This stoppage, which can affect an entire engineering team, tends to occur between the completion of the team’s current mission and the beginning of the next. You want to minimize the idle time during this transition. This idleness, or stoppage, can arise if the team has no identified next mission (the PM hasn’t surfaced what’s next) or is missing resources (UX specs, engineers with specialized skills) to work effectively on the new mission. Every transition incurs some downtime; that is expected and even encouraged. A prolonged, complete stoppage, however, is to be avoided.
Frequency: As needed. Audience: Engineering Team
Anti-patterns and final thoughts
There are many metrics you can collect. I recommend starting with very few. Pick ones that are actionable and easy to collect, and think about whom you will share them with and how often. Make sure you cover the three personas I mentioned earlier: you and your team, the business, and the board. I rely on a handful of metrics and package them with slight differences to suit these different audiences.
And now for a few anti-patterns!
First, don’t surface vanity metrics, metrics that are difficult to obtain, or metrics of purely internal concern to external stakeholders. If you show your CEO tickets resolved over time once, they might (trust me, they will) ask for it on a continuous basis and, worse, assume it is a valid metric.
Second, never tie metrics to compensation or performance evaluation unless absolutely necessary. You will get unpleasant surprises. You measure to observe, hypothesize about how to optimize, and iterate.
“Show me the incentive, I'll show you the outcome.” (Charlie Munger)
Third, don’t share metrics outside your team without a story and context. The further the audience is from the day-to-day running of engineering, the more context they need.
What is your view on DORA metrics?
They are good too, especially deployment frequency and change failure rate.