It is certainly in vogue to use metrics to drive management decisions. This is especially true in a distributed workforce: there can be no pretending my team is working as I stroll by their desks late into the night. In fact, my personal preference is to be remote from my employees, so that I am not tempted to judge them by which of us leaves the office later at night.
Metrics can be pretty difficult to use. So, before you get any further into this: if you are unfamiliar with Accelerate: The Science of Lean Software and DevOps, you should almost certainly familiarize yourself with its data-driven approach and come back to this anecdotal piece afterwards.
If you are still here, thanks. My struggle with metrics in software over the years has been that most metrics appear objective but are in fact quite subjective, because the target is subjective. Take a simple metric like: did this feature get delivered on time? Navigating the minefield of quality, actual vs. expected team size, and actual vs. expected scope delivered opens multiple cans of worms for something that on its face seems simple.
So, instead of giving up, I have a few principles that I like to apply to the metrics my team uses. In a future piece we can walk through some of the metrics I use on top of those defined in Accelerate.
1. Requirements Driven
This may seem self-evident. However, I have rarely seen clear documentation on exactly how a metric is specified before it starts driving team member behavior. For example, escaped defects is a metric teams like to use. However, different executives care whether those defects are tied to support tickets or not; i.e., if no customer has reported the defect, then perhaps it is not that severe. Perhaps a defect only counts if it is within a key workflow path. With a bit of brainstorming there are probably even more facets we should at least consider.
None of these specific facets are complicated to implement, yet if we do not think them through we will not be speaking the same language. This is extremely important in all teams, and even more so in a distributed environment. Silveira, in Building and Managing High-Performance Distributed Teams: Navigating the Future of Work, talks about an explicit North Star so that team members can make decisions alone, because they understand the direction the company is taking. This metric example may seem too small to matter, but not writing down the exact specification for a metric can either lead your team astray or, just as bad, accidentally lead you in the right direction, which reinforces the incorrect behavior.
We expect detailed docs before we write code for our customers; it is reasonable to request the same of the metrics we are measured by. In fact, opportunities arise when we treat these as first-order features.
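To make that concrete, here is a minimal sketch of what a written-down escaped-defect specification might look like once encoded. Every field name and workflow here is a hypothetical illustration, not a standard definition; the point is that the facets above become explicit and testable.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ticket record; field names are illustrative, not taken
# from any particular tracker's API.
@dataclass
class Defect:
    id: str
    found_in_production: bool
    linked_support_ticket: Optional[str]  # None if no customer report exists
    workflow: str                         # e.g. "checkout", "admin"

# Assumption: which workflows count as "key" is itself part of the spec.
KEY_WORKFLOWS = {"checkout", "signup"}

def is_escaped_defect(d: Defect) -> bool:
    """One possible spec: a defect 'escapes' only if it reached
    production, a customer actually reported it, and it sits in a key
    workflow path. Your executives may define it differently; what
    matters is that the definition is written down and executable."""
    return (
        d.found_in_production
        and d.linked_support_ticket is not None
        and d.workflow in KEY_WORKFLOWS
    )
```

Whether the support-ticket clause or the key-workflow clause belongs in your definition is exactly the conversation this principle is meant to force before the metric starts steering behavior.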
2. Produced via Code
We are measuring software engineers, and engineers generally write code for a living. Our metrics should be built with the tools to which we are accustomed. If I want to measure deployments or MTTR (mean time to recovery), then I should be able to view a report in Excel or Google Docs, or my favorite, Airtable, and see real-time data about how things are going.
Here, we are standing on the shoulders of principle 1. If there are no requirements for how to calculate a metric, then we will be unable to manifest that requirement in code. Further, if we find that, even with good requirements, the data does not exist cleanly enough to produce the metric, then we may need to think more deeply about our processes, because there is likely a misalignment. This is great news. Finding out in December that I missed my bonus because of bad or missing data is far less fun than finding it out in January, when goal setting is occurring. Then we can determine whether spending the time and energy on a new metric is worth it, because these are not free and will likely displace some customer work.
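As an illustration of manifesting a requirement in code, here is a minimal sketch of computing MTTR from incident records. The incident list and its field names are hypothetical stand-ins for whatever your pager or ticketing system actually exports.

```python
from datetime import datetime, timedelta

# Hypothetical incident log; in practice this would come from your
# pager or ticketing system's export or API.
incidents = [
    {"opened": datetime(2020, 11, 2, 9, 30), "resolved": datetime(2020, 11, 2, 11, 0)},
    {"opened": datetime(2020, 11, 9, 14, 0), "resolved": datetime(2020, 11, 9, 14, 45)},
]

def mean_time_to_recovery(incidents) -> timedelta:
    """MTTR as the written-down requirement defines it here: the
    average of (resolved - opened) across resolved incidents."""
    durations = [i["resolved"] - i["opened"] for i in incidents if i["resolved"]]
    return sum(durations, timedelta()) / len(durations)

print(mean_time_to_recovery(incidents))  # 1:07:30
```

If even a calculation this simple cannot be written because resolution timestamps are missing or unreliable, that is precisely the process misalignment worth surfacing in January rather than December.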
As with the close of section 1, this is not meant to deride data or to suggest any of this is insurmountable. But wrong data is probably worse than no data, and respecting that engineers and managers all have only so much time in a week goes a long way toward building a culture of sustainably delivering high-quality business value.
3. Balanced
Balancing metrics happens across two dimensions. The first is attempting to measure countervailing forces to ensure we are not swinging too hard to one side. For example, if features are all being delivered on time, but customer satisfaction (CSAT) is down and defects are up relative to a baseline, then perhaps we are not sufficiently focused on delivering a whole package of value to our customers: software that is both regularly updated and high quality.
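A toy sketch of that countervailing check follows; the thresholds, signal names, and messages are purely illustrative assumptions, not a standard.

```python
# Toy balance check: on-time delivery only "counts" when the
# countervailing quality signals hold up. Thresholds are illustrative.
def delivery_health(on_time_rate: float,
                    csat: float, csat_baseline: float,
                    defects: int, defect_baseline: int) -> str:
    shipping_on_time = on_time_rate >= 0.90
    quality_slipping = csat < csat_baseline or defects > defect_baseline
    if shipping_on_time and quality_slipping:
        return "on time, but quality is slipping: dig in"
    if shipping_on_time:
        return "healthy: shipping on time at sustained quality"
    return "behind schedule: check scope and staffing"

print(delivery_health(0.95, csat=7.1, csat_baseline=8.0,
                      defects=12, defect_baseline=8))
# -> "on time, but quality is slipping: dig in"
```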
The second dimension is considering both absolute and relative changes. Let's presume for a minute that I look over 2019 and 2020 escaped defect counts and I see a 50% increase in escaped defects. All things being equal, that could be an indication of a problem. However, if we also consider that the team size grew by 100% over the same time period, this could actually be a great number: 1.5 times the defects spread across twice the engineers is a 25% drop in defects per engineer, even as new developers enter the system.
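The arithmetic, with hypothetical counts chosen to match the percentages above:

```python
# Worked example: absolute vs. relative change in escaped defects.
defects_2019, defects_2020 = 40, 60        # +50% in absolute terms
engineers_2019, engineers_2020 = 10, 20    # team size +100%

rate_2019 = defects_2019 / engineers_2019  # 4.0 defects per engineer
rate_2020 = defects_2020 / engineers_2020  # 3.0 defects per engineer

change = (rate_2020 - rate_2019) / rate_2019
print(f"Defects per engineer changed by {change:.0%}")  # -25%
```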
Balancing metrics is definitely more art than science. It also has a tendency to push us toward many different metrics. I would suggest that a general overview can probably be done with a single-digit number of metrics; that many can reasonably be kept in everyone's heads and still capture quite a bit of context.
4. Visible to All
Metrics should be placed where those they are measuring can see them. Not only should the dashboards exist in an appropriate location (not necessarily a TV hanging over everyone's heads, but at least on Confluence or somewhere equally visible) but, referring back to principle 2, the code should be open to the team as well. These are highly paid knowledge workers, and as you start coding up metrics you may find, as I have on many occasions, that bugs can be introduced. This openness also aligns team members to the North Star, and it does so in the programmers' language: code.
Which leads us to the last principle.
5. Continuously Improved
Metrics set out to codify the goals of the organization. Those goals can change; the measures we use to hold ourselves accountable may not actually drive the behavior we desire; they can even be buggy. We should expect the people to whom these goals are applied to be able to suggest updates, make those updates, and see how their hard work is in fact 'moving the numbers up and to the right.'
Closing Out
Metrics are an opportunity to drive trust and empowerment within our organizations. Oftentimes they are instead used to limit bonuses or shepherd employees onto performance improvement plans. Just because they are often used poorly does not mean we as leaders cannot turn the ship a bit toward success. It is very likely that your organization is predominantly doing a great job. Do your numbers expose this? Do they provide opportunities to praise the teams and team members doing the great work that is keeping you employed? Further, are you able to foster feedback from your team when some metric is leading you away from your North Star? When that happens, you will know that you have built the trust and collaboration that not only drives growth within an organization but serves as a key countervailing force: it doesn't have to suck at work.
Take care.