Process Health Metrics
Process Health Metrics are quantifiable measures used to assess the efficiency, effectiveness, and adaptability of various processes within an organisation.
Purpose
The purpose of having process health metrics is to provide leaders and teams with a clear, data-driven picture of how their processes are performing. This allows for informed decision-making and targeted improvements, ensuring that the processes continue to align with and support the organisation's goals.
Example Metrics
Metric | Description | Rationale |
---|---|---|
Evaluation Time | The time it takes from prioritising an opportunity until there is a validated solution, or the opportunity is deprioritised. | We need to encourage rapid feedback loops and not have opportunities sitting in the backlog for a long time. |
Lead Time to First Release | The time it takes to move from a validated solution to production. This is not necessarily your finished feature but the first version that you are using to get further feedback from customers. | We need to encourage teams to get the first iteration of a solution out quickly because there is still a high chance that customers don't want what we have. At this stage, it is fine to do things that don't scale. |
Lead Time to Satisfied Customer | The time it takes to move from a validated solution to a satisfied customer. This is the point at which you are happy with the solution and are ready to move to the next opportunity. | Measuring lead time keeps the focus on being responsive to customer needs and solving their problems before someone else does. |
Release Frequency | How often new features are released to customers. Count individual releases only: a single large release that several teams contribute to is one release, not one per team. | Focusing on release frequency helps teams invest the necessary time and resources in test and release automation so they can release more frequently. |
Change Failure Rate | How often a change results in a failure in production. However, you want to ensure that you do not discourage teams from making changes. Some companies do not count a failure if the team recovers within 5 minutes. | We need to balance speed with quality. The good news is that teams that invest in testing and TDD increase both the speed and the quality of their changes. |
Mean Time to Recovery | The average amount of time it takes to recover from a failure in production. This is the time from when the failure is introduced until the system is back to normal. | Errors will always occur. Again, we need to encourage teams to invest in automation and monitoring so that they can recover quickly. |
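
As a rough illustration of how the last three metrics in the table could be calculated, here is a minimal Python sketch. The `Release` record, its field names, and the function names are hypothetical and not part of any particular tool. The 5-minute grace period reflects the optional convention mentioned above, and including quickly recovered failures in the recovery average is an assumption you may want to change.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Release:
    """Hypothetical record of one release to production."""
    deployed_at: datetime
    failed_at: Optional[datetime] = None     # when a failure was introduced, if any
    recovered_at: Optional[datetime] = None  # when the system was back to normal


# Optional convention from the table: ignore failures recovered within 5 minutes.
GRACE = timedelta(minutes=5)


def counts_as_failure(release: Release, grace: timedelta = GRACE) -> bool:
    """A release counts as failed unless it recovered within the grace period."""
    if release.failed_at is None:
        return False
    if release.recovered_at is None:
        return True  # still unresolved
    return release.recovered_at - release.failed_at > grace


def change_failure_rate(releases: List[Release], grace: timedelta = GRACE) -> float:
    """Fraction of releases that resulted in a counted failure."""
    if not releases:
        return 0.0
    return sum(counts_as_failure(r, grace) for r in releases) / len(releases)


def mean_time_to_recovery(releases: List[Release]) -> Optional[timedelta]:
    """Average time from failure to recovery, over all recovered failures."""
    durations = [
        r.recovered_at - r.failed_at
        for r in releases
        if r.failed_at is not None and r.recovered_at is not None
    ]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


def releases_per_week(releases: List[Release], start: datetime, end: datetime) -> float:
    """Number of releases per week deployed in the half-open period [start, end)."""
    weeks = (end - start) / timedelta(weeks=1)
    if weeks <= 0:
        raise ValueError("end must be after start")
    in_period = [r for r in releases if start <= r.deployed_at < end]
    return len(in_period) / weeks


# Made-up sample data, for illustration only.
sample = [
    Release(datetime(2024, 1, 1, 9, 0)),
    Release(datetime(2024, 1, 3, 9, 0),
            failed_at=datetime(2024, 1, 3, 10, 0),
            recovered_at=datetime(2024, 1, 3, 10, 3)),   # recovered within grace
    Release(datetime(2024, 1, 8, 9, 0),
            failed_at=datetime(2024, 1, 8, 10, 0),
            recovered_at=datetime(2024, 1, 8, 11, 0)),   # counted failure
    Release(datetime(2024, 1, 10, 9, 0)),
]

print(change_failure_rate(sample))
print(mean_time_to_recovery(sample))
print(releases_per_week(sample, datetime(2024, 1, 1), datetime(2024, 1, 15)))
```

Passing `grace=timedelta(0)` to `change_failure_rate` counts every failure with a non-zero recovery time, which may suit teams that prefer not to use a grace period.
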
Anti-patterns
- Vanity Metrics: Using metrics that look good on paper but don’t actually contribute to meaningful improvements.
- Data Silos: Not sharing metrics across departments, which can hinder comprehensive understanding and improvement.
- Lack of Context: Presenting metrics without adequate explanation, making it difficult for team members to understand their relevance or take action.