Every published Lead Time benchmark is wrong for SMBs.
The DORA State of DevOps report, the Accelerate book, the LinearB benchmarks — they’re all aggregated across organizations from 5-engineer startups to 5,000-engineer enterprises. Their “Elite under 1 hour” classification is meaningless when your team is 12 people, you have one production service, and your founder still reviews every PR touching billing.
This page publishes the Lead Time numbers we actually see in 5–50 person engineering teams. Source: 30+ SMB clients, anonymized, 2024–2026. Methodology: time from first-commit timestamp to merge-to-main timestamp, reported as the median over a rolling 30-day window.
What Lead Time Actually Measures
Lead Time for Changes is one of the four DORA metrics. The DORA team’s definition:
The time it takes to go from code committed to code successfully running in production.
In practice, every team measures it slightly differently. Here’s our convention — and why:
- Start: First commit on the feature branch (not PR open, not “ticket assigned”).
- End: Merge to main, assuming CD ships main automatically. If you don’t have CD, end at deploy timestamp.
- Aggregation: P50 and P85 over a rolling 30-day window.
Why first commit and not PR open? Because the gap between first commit and PR open often hides the real bottleneck — engineers waiting on environment setup, database migrations, or unclear requirements. Starting the clock at PR open hides that drag.
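The convention above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Change` record and its field names are hypothetical, and the nearest-rank percentile is only one of several common percentile definitions, so your dashboard tool may report slightly different numbers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import ceil


@dataclass
class Change:
    """One merged change. Field names are illustrative, not from any tool."""
    first_commit_at: datetime  # clock starts: first commit on the feature branch
    merged_at: datetime        # clock stops: merge to main (deploy time if no CD)


def lead_time_hours(change: Change) -> float:
    return (change.merged_at - change.first_commit_at).total_seconds() / 3600


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile over a non-empty list."""
    if not values:
        raise ValueError("no changes in window")
    ordered = sorted(values)
    rank = max(1, ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def rolling_lead_time(changes: list[Change], as_of: datetime, days: int = 30) -> dict[str, float]:
    """P50 and P85 lead time for changes merged in the `days` before `as_of`."""
    window_start = as_of - timedelta(days=days)
    hours = [
        lead_time_hours(c)
        for c in changes
        if window_start < c.merged_at <= as_of
    ]
    return {"p50": percentile(hours, 50), "p85": percentile(hours, 85)}
```

Changes are assigned to the window by merge time, so a branch that started before the window but merged inside it still counts in full.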
SMB Lead Time Benchmarks by Team Size
From 30+ teams, anonymized:
| Team size | P25 (fast quartile) | P50 (median) | P75 (slow quartile) |
|---|---|---|---|
| 5–10 engineers | 8 hours | 1.5 days | 4 days |
| 11–25 engineers | 1 day | 2.5 days | 6 days |
| 26–50 engineers | 2 days | 4 days | 9 days |
Three observations from this data:
- Lead Time grows with team size. This is not a sign of declining team quality — it’s the natural consequence of more code, more reviewers, more dependencies. Don’t punish the team for it.
- The P25 to P75 spread widens with team size. Larger teams have more variance because they have more kinds of work (small bug fix vs. cross-team migration).
- A typical 26–50 person team (P50 of 4 days) is as slow as a slow 5–10 person team (P75 of 4 days). When founders compare their larger team to their memory of “back when we shipped fast,” they’re comparing apples to a different fruit.
Benchmarks by Vertical
Same data, sliced by industry:
| Vertical | P50 Lead Time | What drives it |
|---|---|---|
| Pure SaaS B2B | 1.5 days | Fast review culture, mature CI |
| Fintech | 4 days | Compliance review, dual-control deploys |
| Healthtech | 5 days | HIPAA audit gates |
| Agency / Consultancy | 6 days | Client approval gates per change |
| AI/ML platform | 3 days | Long CI runs (model evaluation) |
If you’re at the median for your vertical, you’re not slow — you’re normal. Vertical context dominates team-process context.
What Actually Affects Lead Time
In our sample of 30+ teams, four factors explain ~80% of Lead Time variance.
1. Median PR Review Wait Time
The single highest-leverage variable. Cutting median review wait from 24 hours to 4 hours typically cuts Lead Time by 40–50%. The interventions that work:
- Async review SLAs (e.g., “PRs older than 4 working hours get an @here ping in #review-please”).
- Smaller PRs (target under 200 changed lines).
- Reviewer rotation that doesn’t bottleneck on one senior engineer.
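As a rough illustration of the review SLA idea, the sketch below counts only working hours when deciding which PRs are overdue. The 09:00–17:00, Monday-to-Friday working day is an assumption, the function names are hypothetical, timestamps are taken to be naive datetimes in the team's local time, and sending the actual chat notification is left out.

```python
from datetime import datetime, time, timedelta

WORKDAY_START = time(9, 0)   # assumption: 09:00-17:00, Monday to Friday
WORKDAY_END = time(17, 0)


def working_hours_between(start: datetime, end: datetime) -> float:
    """Hours between start and end that fall inside the assumed working day."""
    total = timedelta()
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # 0-4 are Monday to Friday
            opens = datetime.combine(day, WORKDAY_START)
            closes = datetime.combine(day, WORKDAY_END)
            overlap_start = max(start, opens)
            overlap_end = min(end, closes)
            if overlap_end > overlap_start:
                total += overlap_end - overlap_start
        day += timedelta(days=1)
    return total.total_seconds() / 3600


def prs_breaching_sla(
    review_requested: dict[int, datetime],
    now: datetime,
    sla_hours: float = 4.0,
) -> list[int]:
    """PR numbers that have waited longer than the SLA for a first review."""
    return [
        number
        for number, requested_at in review_requested.items()
        if working_hours_between(requested_at, now) > sla_hours
    ]
```

Counting working hours rather than wall-clock hours keeps a PR opened late on Friday from being flagged first thing Monday morning.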
2. CI Pipeline Duration
If your CI takes 25 minutes, Lead Time has a 25-minute floor. Median pipeline durations we see:
- Pure SaaS: 4–8 minutes
- Monolith with full test suite: 12–20 minutes
- AI/ML with eval gates: 30–90 minutes
Below 5 minutes you stop getting velocity gains and start sacrificing test coverage. Don’t chase sub-3-minute CI for its own sake.
3. Merge Strategy
Trunk-based development consistently beats long-lived feature branches by 30–50% on Lead Time. The reason isn’t ideological — it’s that long branches force big PRs, big PRs delay review, delayed review extends Lead Time. The arrows point one way.
4. Number of Reviewers Required
One required reviewer is the SMB sweet spot. Two roughly doubles wait time without doubling quality. In a comparison study with two of our clients, Change Failure Rate (CFR) was identical between 1- and 2-reviewer policies, but Lead Time was 1.7× longer with two.
How to Compare Yourself Honestly
Run this check on your own data before drawing conclusions:
- Compute your rolling 30-day P50 and P85 (not the average — see FAQ).
- Filter out PRs with zero non-CI commits (auto-merges, dependabot, formatting fixes).
- Look at the P85/P50 ratio. Healthy is 3–4×. If P85 is 10× your P50, you have a long-tail problem (probably one specific kind of work that’s silently broken).
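The three steps above can be sketched as one function. This is an illustration under stated assumptions: the PR dictionaries and their `lead_time_hours` and `human_commits` keys are hypothetical, the caller is assumed to have already restricted the list to the last 30 days, and the "watch" label for ratios between 4× and 10× is our own placeholder because the guidance above only names the two ends. Percentiles come from the standard library's `statistics.quantiles` with the inclusive method, which interpolates between data points.

```python
from statistics import quantiles


def tail_check(prs: list[dict]) -> dict:
    """Compare P85 to P50 after dropping PRs with no human-authored commits.

    Each PR dict uses hypothetical keys: lead_time_hours, human_commits.
    """
    # Drop auto-merges, dependency bumps and similar bot-only PRs.
    hours = [pr["lead_time_hours"] for pr in prs if pr["human_commits"] > 0]
    if len(hours) < 2:
        raise ValueError("need at least two PRs to compute percentiles")

    cuts = quantiles(hours, n=100, method="inclusive")  # 99 cut points
    p50, p85 = cuts[49], cuts[84]
    ratio = p85 / p50  # assumes lead times are strictly positive

    if ratio >= 10:
        verdict = "long tail"
    elif ratio <= 4:
        verdict = "healthy"
    else:
        verdict = "watch"  # assumption: the text gives no label for 4x to 10x
    return {"p50": p50, "p85": p85, "ratio": ratio, "verdict": verdict}
```

When the verdict is "long tail", sorting the slowest PRs by label or directory is usually the quickest way to find which kind of work is responsible.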
Common Analysis Pitfalls
- Comparing P50 to last quarter without checking sample size. A team that merged 20 PRs last month has noisy data; a team that merged 200 has stable data.
- Treating Lead Time spikes as failures. A spike during a security audit, holiday week, or major migration is expected. Annotate the dashboard, don’t act on it.
- Optimizing Lead Time without watching CFR. Faster shipping at higher failure rate is not a win. The two metrics constrain each other; that’s the whole point of measuring both.
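One way to put a number on the sample-size pitfall is a bootstrap interval around the median. The sketch below is a generic percentile bootstrap, not something taken from our benchmark methodology; the function name, the 2,000 resamples, and the 95% level are all illustrative choices.

```python
import random
from statistics import median


def bootstrap_median_interval(hours, resamples=2000, confidence=0.95, seed=0):
    """Percentile-bootstrap interval for the median lead time.

    `hours` is a list of per-PR lead times. The seed is fixed so repeated
    runs on the same data give the same interval.
    """
    rng = random.Random(seed)
    # Resample the PRs with replacement and record each resample's median.
    medians = sorted(
        median(rng.choices(hours, k=len(hours)))
        for _ in range(resamples)
    )
    tail = (1 - confidence) / 2
    low = medians[int(tail * resamples)]
    high = medians[int((1 - tail) * resamples) - 1]
    return low, high
```

If this month's P50 falls inside last month's interval, the difference may well be noise. A month with 20 merged PRs will typically produce a much wider interval than a month with 200.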
Next: Reduce CFR or Pick a Framework
Once you understand your Lead Time number, the next two reads in this pillar:
- How to Reduce Change Failure Rate Without Slowing Delivery — the partner metric to Lead Time.
- DORA vs SPACE Framework: Which Fits an SMB Better? — when DORA stops being enough.
Or if you haven’t yet instrumented anything, start with How to Implement DORA Metrics in a 10-Person Team.