Last Updated | GitHub API Documentation
DORA overview
One measure of operational health is the set of DORA (DevOps Research and Assessment) metrics, a Google-devised standard by which organizations measure operational performance in DevOps and deployment and target any desired improvements. The DORA metrics comprise four main measures, concerned with throughput and reliability. Throughput:
Reliability:
Each of the four metrics is rated on a scale from Elite to Low. More information, and a write-up of NAVEX's 2024 performance, can be found in this document: NAVEX - DORA - 2024 Engineering Insights |
|
|
As of 12/24, the NAVEX Production deployment cadence (aka Trainstop) is monthly. DORA Reference:
NAVEX 2024 Rating: Medium > Monthly Further info: NAVEX - DORA - 2024 Engineering Insights |
|
|
Definition: The average amount of time that elapses between committing new code and releasing that code into production. Note: The chart shows our flow metrics for Median Cycle Time, based on the time a Jira ticket takes to move from In Progress to Done. Given our monthly deployment model, that number should not be read as the overall metric, but rather as an indicator of how long it would take us to move a change to production if we were following CI/CD. To view metrics for any given team, please consult flow metrics. For NAVEX, given that we have a two-sprint cycle plus a hardening and testing rotation, we are in the 7-week cycle range. Arguably, this puts us at Medium/Low per DORA. DORA Reference:
NAVEX 2024 Rating: Medium (7 week cycle) Further info: NAVEX - DORA - 2024 Engineering Insights |
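The Median Cycle Time described above can be illustrated with a minimal sketch. It assumes each ticket is reduced to a pair of timestamps (entered In Progress, reached Done); the function name and the sample data are hypothetical and are not taken from the NAVEX flow metrics or from Jira.

```python
from datetime import datetime
from statistics import median


def median_cycle_time_days(tickets):
    """Return the median number of days between In Progress and Done.

    `tickets` is a list of (in_progress, done) datetime pairs.
    This shape is an assumption made for illustration only.
    """
    durations = [
        (done - started).total_seconds() / 86400  # seconds per day
        for started, done in tickets
    ]
    return median(durations)


# Hypothetical sample data, not real NAVEX tickets.
sample_tickets = [
    (datetime(2024, 3, 1), datetime(2024, 3, 4)),   # 3 days
    (datetime(2024, 3, 2), datetime(2024, 3, 9)),   # 7 days
    (datetime(2024, 3, 5), datetime(2024, 3, 10)),  # 5 days
]

print(median_cycle_time_days(sample_tickets))  # 5.0
```

The median is used rather than the mean so that a single long-running ticket does not dominate the figure.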
|
|
Definition: Change failure rate = number of failed deployments / (total deployments - patches). Because NAVEX fixes forward where possible during deployment rather than rolling back, and generally irons out deployment issues in our stage environment, this number is thankfully very low. It is taken from the number of issues arising during deployment that require invasive action. DORA Reference:
NAVEX 2024 Average: 5% Further info: NAVEX - DORA - 2024 Engineering Insights |
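As a rough illustration of the change failure rate formula given in the definition, the following sketch applies it to made-up counts. The function name and the numbers are hypothetical; they are not the figures behind the NAVEX 2024 average.

```python
def change_failure_rate(failed_deployments, total_deployments, patches):
    """Failed deployments / (total deployments - patches), as a fraction."""
    eligible = total_deployments - patches
    if eligible <= 0:
        raise ValueError("total deployments must exceed the number of patches")
    return failed_deployments / eligible


# Hypothetical counts chosen only to show the arithmetic.
rate = change_failure_rate(failed_deployments=2, total_deployments=30, patches=5)
print(f"{rate:.0%}")  # 2 / (30 - 5) = 8%
```

Patches are subtracted from the denominator, as in the definition, so that fix-forward releases do not dilute the rate.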
|
“To calculate time to restore service, you’ll need to have a shared understanding of what incidents you’re including as part of your analysis. Once you’ve done that, it’s a reasonably straightforward calculation, where you divide the total incident age (in hours) by the number of incidents.” For NAVEX deployments this is an interesting calculation, and one for which we don't really have accurate data. Realistically, we don't revert or live with a broken deployment. We fix forward, either with hotfixes or with configuration updates during the deployment window. On occasion, that extends the maintenance window (for those applications currently unable to deploy without downtime), occasionally by a number of hours. Given this observation, we would put our performance for Failed Deployment Recovery Time at an unscientific High. This rating reflects that our service interruptions during deployment are resolved during the window or before the start of the following US business day. No deployment-related issues in 2024 continued into the next workday, keeping us under the ‘Less than one day’ marker. This doesn't count application issues that are patched immediately, or cases where toggles are turned off to disable an impaired change. In 2025 we intend to track this more accurately, most importantly by tracking the time beyond the published maintenance window during which customers are unable to access an application. |
  |
Definition: The average amount of time it takes to recover from a service incident during deployment. DORA Reference:
NAVEX 2024 Rating: High Further info: NAVEX - DORA - 2024 Engineering Insights |
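The quoted calculation (total incident age in hours divided by the number of incidents) could be sketched as follows. The function name and the incident ages are hypothetical, since accurate NAVEX data is not yet tracked; the 24-hour comparison mirrors the ‘Less than one day’ marker mentioned above.

```python
def mean_recovery_hours(incident_ages_hours):
    """Total incident age in hours divided by the number of incidents."""
    if not incident_ages_hours:
        raise ValueError("at least one incident is required")
    return sum(incident_ages_hours) / len(incident_ages_hours)


# Hypothetical incident ages in hours, for illustration only.
ages = [1.5, 4.0, 0.5]
mean_hours = mean_recovery_hours(ages)

print(mean_hours)       # 2.0
print(mean_hours < 24)  # True: under the 'Less than one day' marker
```

Which incidents go into the list is the shared-understanding question the quote raises; the arithmetic itself is the simple part.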