Should we measure Lead Time from first commit or from PR open?

First commit. Measuring from PR open hides the work that happens before review starts and biases the metric toward 'CI is fast' rather than 'changes ship fast'. DORA's definition targets the latter: the time from code committed to code running in production.

Why use median (P50) instead of mean?

Lead Time is right-skewed by long-running PRs. The mean gets dragged up by a handful of outliers and stops reflecting typical experience. P50 plus P85 gives you both 'typical' and 'long tail'.
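To make the skew concrete, here is a small sketch using made-up lead times (the ten values are hypothetical, not taken from our client data). Two long-running PRs are enough to pull the mean far away from what most changes experience.

```python
from statistics import mean, median

# Hypothetical lead times in hours for ten merged PRs.
# Eight are routine; two are long-running outliers.
lead_times_hours = [6, 8, 10, 12, 14, 16, 20, 24, 240, 400]

avg = mean(lead_times_hours)    # dragged up by the two outliers
p50 = median(lead_times_hours)  # reflects the typical PR

print(f"mean:   {avg:.1f} h")   # mean:   75.0 h
print(f"median: {p50:.1f} h")   # median: 15.0 h
```

Eight of the ten PRs shipped within a day, yet the mean reports more than three days.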

How long should we collect data before drawing conclusions?

Eight weeks minimum. With a 10-person team merging around 50 PRs per month, that gives you roughly 100 data points: enough for a stable P50 and P85, over a window long enough that one unusual week doesn't dominate the result.

Lead Time for Changes: Benchmarks for 5-50 Person Companies

Every published Lead Time benchmark is wrong for SMBs.

The DORA State of DevOps report, the Accelerate book, the LinearB benchmarks — they’re all aggregated across organizations from 5-engineer startups to 5,000-engineer enterprises. Their “Elite under 1 hour” classification is meaningless when your team is 12 people, you have one production service, and your founder still reviews every PR touching billing.

This page publishes the Lead Time numbers we actually see in 5–50 person engineering teams. Source: 30+ SMB clients, anonymized, 2024–2026. Methodology: first-commit timestamp to merge-to-main timestamp, reported as the median over a rolling 30-day window.

What Lead Time Actually Measures

Lead Time for Changes is one of the four DORA metrics. The DORA team’s definition:

The time it takes to go from code committed to code successfully running in production.

In practice, every team measures it slightly differently. Here’s our convention — and why:

Start: First commit on the feature branch (not PR open, not “ticket assigned”).
End: Merge to main, assuming CD ships main automatically. If you don’t have CD, end at deploy timestamp.
Aggregation: P50 and P85 over a rolling 30-day window.

Why first commit and not PR open? Because the gap between first commit and PR open often hides the real bottleneck — engineers waiting on environment setup, database migrations, or unclear requirements. Starting the clock at PR open hides that drag.
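A minimal sketch of that convention, assuming you have already exported one record per merged PR. The field names `first_commit_at` and `merged_at`, the sample records, and the `lead_time_summary` helper are all hypothetical; the percentile helper uses plain linear interpolation, which is one common choice rather than a required one. Timestamps are assumed to share one timezone.

```python
from datetime import datetime, timedelta
from math import ceil, floor


def percentile(values, p):
    """Linear-interpolation percentile (p in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100
    lo, hi = floor(rank), ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def lead_time_summary(prs, as_of, window_days=30):
    """P50/P85 lead time in hours for PRs merged in the window ending at as_of."""
    cutoff = as_of - timedelta(days=window_days)
    hours = [
        (pr["merged_at"] - pr["first_commit_at"]).total_seconds() / 3600
        for pr in prs
        if cutoff < pr["merged_at"] <= as_of
    ]
    if not hours:
        return None  # nothing merged in the window
    return {"n": len(hours), "p50": percentile(hours, 50), "p85": percentile(hours, 85)}


# Hypothetical records: clock starts at first commit, stops at merge to main.
prs = [
    {"first_commit_at": datetime(2025, 3, 1, 9), "merged_at": datetime(2025, 3, 1, 17)},
    {"first_commit_at": datetime(2025, 3, 3, 9), "merged_at": datetime(2025, 3, 4, 9)},
    {"first_commit_at": datetime(2025, 3, 5, 9), "merged_at": datetime(2025, 3, 7, 9)},
    # Merged in January, so it falls outside the 30-day window below.
    {"first_commit_at": datetime(2025, 1, 2, 9), "merged_at": datetime(2025, 1, 9, 9)},
]

summary = lead_time_summary(prs, as_of=datetime(2025, 3, 10))
print(summary)
```

If you don't have CD, swap `merged_at` for your deploy timestamp and the rest stays the same.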

SMB Lead Time Benchmarks by Team Size

From 30+ teams, anonymized:

Team size	P25 (best)	P50 (median)	P75 (slowest)
5–10 engineers	8 hours	1.5 days	4 days
11–25 engineers	1 day	2.5 days	6 days
26–50 engineers	2 days	4 days	9 days

Three observations from this data:

Lead Time grows with team size. This is not a sign of declining team quality — it’s the natural consequence of more code, more reviewers, more dependencies. Don’t punish the team for it.
The P25 to P75 spread widens with team size. Larger teams have more variance because they have more kinds of work (small bug fix vs. cross-team migration).
The P50 for a 26–50 person team (4 days) is as slow as the P75 for a 5–10 person team (4 days). When founders compare their larger team to their memory of “back when we shipped fast,” they’re comparing apples to a different fruit.

Benchmarks by Vertical

Same data, sliced by industry:

Vertical	P50 Lead Time	What drives it
Pure SaaS B2B	1.5 days	Fast review culture, mature CI
Fintech	4 days	Compliance review, dual-control deploys
Healthtech	5 days	HIPAA audit gates
Agency / Consultancy	6 days	Client approval gates per change
AI/ML platform	3 days	Long CI runs (model evaluation)

If you’re at the median for your vertical, you’re not slow — you’re normal. Vertical context dominates team-process context.

What Actually Affects Lead Time

In our sample of 30+ teams, four factors explain ~80% of Lead Time variance.

1. Median PR Review Wait Time

The single highest-leverage variable. Cutting median review wait from 24 hours to 4 hours typically cuts Lead Time by 40–50%. The interventions that work:

Async review SLAs (e.g., “PRs older than 4 working hours get an @here ping in #review-please”).
Smaller PRs (target under 200 changed lines).
Reviewer rotation that doesn’t bottleneck on one senior engineer.
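The first two interventions are easy to automate. The sketch below flags open PRs that break either rule. The record fields (`id`, `review_requested_at`, `first_review_at`, `changed_lines`) and the `review_flags` helper are hypothetical, and the SLA is measured in wall-clock hours for simplicity; a real version would need a working-hours calendar.

```python
from datetime import datetime, timedelta

REVIEW_SLA = timedelta(hours=4)  # wall-clock simplification of "4 working hours"
MAX_CHANGED_LINES = 200          # size target from the list above


def review_flags(open_prs, now):
    """Return (pr_id, reasons) for each open PR that breaks a review rule."""
    flagged = []
    for pr in open_prs:
        reasons = []
        unreviewed = pr["first_review_at"] is None
        if unreviewed and now - pr["review_requested_at"] > REVIEW_SLA:
            reasons.append("waiting past review SLA")
        if pr["changed_lines"] >= MAX_CHANGED_LINES:
            reasons.append("200+ changed lines")
        if reasons:
            flagged.append((pr["id"], reasons))
    return flagged


# Hypothetical open PRs.
now = datetime(2025, 3, 10, 15, 0)
open_prs = [
    {"id": 101, "review_requested_at": datetime(2025, 3, 10, 9, 0),
     "first_review_at": None, "changed_lines": 80},
    {"id": 102, "review_requested_at": datetime(2025, 3, 10, 13, 0),
     "first_review_at": None, "changed_lines": 450},
    {"id": 103, "review_requested_at": datetime(2025, 3, 10, 8, 0),
     "first_review_at": datetime(2025, 3, 10, 9, 30), "changed_lines": 120},
]

flags = review_flags(open_prs, now)
for pr_id, reasons in flags:
    print(pr_id, ", ".join(reasons))
```

Posting the flagged list to a channel is left out on purpose; how you deliver the ping matters less than having the rule checked consistently.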

2. CI Pipeline Duration

If your CI takes 25 minutes, Lead Time has a 25-minute floor. Median pipeline durations we see:

Pure SaaS: 4–8 minutes
Monolith with full test suite: 12–20 minutes
AI/ML with eval gates: 30–90 minutes

Below 5 minutes you stop getting velocity gains and start sacrificing test coverage. Don’t chase sub-3-minute CI for its own sake.

3. Merge Strategy

Trunk-based development consistently beats long-lived feature branches by 30–50% on Lead Time. The reason isn’t ideological — it’s that long branches force big PRs, big PRs delay review, delayed review extends Lead Time. The arrows point one way.

4. Number of Reviewers Required

One required reviewer is the SMB sweet spot. Requiring two roughly doubles review wait time without doubling quality. We have data on this: in a comparison across two of our clients, Change Failure Rate (CFR) was identical between 1- and 2-reviewer policies, but Lead Time was 1.7× longer with two.

How to Compare Yourself Honestly

Run this check on your own data before drawing conclusions:

Compute your rolling 30-day P50 and P85 (not the average — see FAQ).
Filter out PRs with zero non-CI commits (auto-merges, dependabot, formatting fixes).
Look at the P85/P50 ratio. Healthy is 3–4×. If P85 is 10× your P50, you have a long-tail problem (probably one specific kind of work that’s silently broken).
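The three steps can be sketched in a few lines. This assumes each PR record carries a precomputed `lead_time_hours` and a `human_commits` count; both field names, the sample values, and the `tail_check` helper are hypothetical. The 10× cutoff comes from the check above; treating anything below it as "not yet a long-tail problem" is our simplification.

```python
from math import ceil, floor


def percentile(values, p):
    """Linear-interpolation percentile (p in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100
    lo, hi = floor(rank), ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def tail_check(prs, long_tail_ratio=10.0):
    """P50, P85 and their ratio, ignoring PRs with no human-authored commits."""
    hours = [pr["lead_time_hours"] for pr in prs if pr["human_commits"] > 0]
    if not hours:
        return None
    p50, p85 = percentile(hours, 50), percentile(hours, 85)
    ratio = p85 / p50
    return {"n": len(hours), "p50": p50, "p85": p85,
            "ratio": ratio, "long_tail": ratio >= long_tail_ratio}


# Hypothetical 30-day sample: ten human PRs plus one bot PR that gets filtered out.
prs = [{"lead_time_hours": h, "human_commits": 3}
       for h in (4, 8, 12, 16, 20, 24, 30, 36, 200, 400)]
prs.append({"lead_time_hours": 0.2, "human_commits": 0})  # e.g. a Dependabot bump

result = tail_check(prs)
print(f"P50={result['p50']:.1f}h  P85={result['p85']:.1f}h  ratio={result['ratio']:.1f}x")
```

Leaving the bot PR in would pull P50 down and inflate the ratio, which is why the filter runs before the percentiles.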

Common Analysis Pitfalls

Comparing P50 to last quarter without checking sample size. A team that merged 20 PRs last month has noisy data; a team that merged 200 has stable data.
Treating Lead Time spikes as failures. A spike during a security audit, holiday week, or major migration is expected. Annotate the dashboard, don’t act on it.
Optimizing Lead Time without watching CFR. Faster shipping at higher failure rate is not a win. The two metrics constrain each other; that’s the whole point of measuring both.
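For the sample-size pitfall, one way to see how noisy a P50 is before comparing it to last quarter is a bootstrap interval around the median. This is a generic statistical technique we are substituting in for illustration, not part of the DORA definitions. The helper name, the 90% level, and the synthetic lognormal data are all assumptions.

```python
import random
from statistics import median


def bootstrap_median_ci(values, n_resamples=2000, confidence=0.90, seed=0):
    """Percentile-bootstrap interval for the median of a non-empty list."""
    rng = random.Random(seed)
    medians = sorted(
        median(rng.choices(values, k=len(values))) for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2
    lo = medians[int(tail * n_resamples)]
    hi = medians[int((1 - tail) * n_resamples) - 1]
    return lo, hi


# Synthetic right-skewed lead times (hours): a 20-PR month and a 200-PR month.
rng = random.Random(1)
small = [rng.lognormvariate(3.0, 1.0) for _ in range(20)]
large = [rng.lognormvariate(3.0, 1.0) for _ in range(200)]

for label, sample in (("20 PRs", small), ("200 PRs", large)):
    lo, hi = bootstrap_median_ci(sample)
    print(f"{label}: P50={median(sample):.1f}h, 90% interval {lo:.1f}h to {hi:.1f}h")
```

If this quarter's P50 sits inside last quarter's interval, the change is probably noise and not worth a retro.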

Next: Reduce CFR or Pick a Framework

Once you understand your Lead Time number, the next two reads in this pillar:

How to Reduce Change Failure Rate Without Slowing Delivery — the partner metric to Lead Time.
DORA vs SPACE Framework: Which Fits an SMB Better? — when DORA stops being enough.

Or if you haven’t yet instrumented anything, start with How to Implement DORA Metrics in a 10-Person Team.