Confidence Interval

A range of values that likely contains the true effect of a change, calculated from experiment data. A 95% confidence interval means that if the experiment were repeated many times, 95% of the calculated intervals would contain the true value.

Also known as: CI. Often loosely called the error margin, although the margin of error is strictly half the interval's width.

Why It Matters

While a significance test gives you a yes/no answer, confidence intervals tell you the likely range of the true effect. This is far more useful for decision-making. Knowing that your new checkout flow improves conversion "somewhere between 2% and 8%" is much more actionable than just knowing "it is significant."

Confidence intervals reveal the precision of your estimate. A narrow interval (3% to 5% improvement) means you have a reliable estimate. A wide interval (1% to 15% improvement) means you have high uncertainty about the size of the effect, even though the interval excludes zero and the result is statistically significant. Width depends primarily on sample size: more data narrows the interval. It also grows with the variability of the metric and with the confidence level you choose.
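To make the link between sample size and width concrete, here is a minimal sketch. It assumes a conversion-rate test with two equally sized groups, both converting near the same baseline rate, and uses the normal approximation. The function name `ci_width` and the 5% baseline are illustrative assumptions, not part of any product.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(rate, n_per_group, confidence=0.95):
    """Approximate full width of the CI for a difference in conversion rates.

    Assumes both groups have n_per_group users and convert at roughly `rate`
    (a simplification used only to illustrate how width scales).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    se = sqrt(2 * rate * (1 - rate) / n_per_group)      # SE of the difference
    return 2 * z * se                                   # upper bound minus lower bound

# Hypothetical 5% baseline conversion rate at three sample sizes
for n in (1_000, 4_000, 16_000):
    print(n, round(ci_width(0.05, n) * 100, 2), "percentage points wide")
```

Because width scales with one over the square root of the sample size, each fourfold increase in users per group cuts the width in half.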

Confidence intervals also make it easier to assess practical significance. If the entire interval is above your minimum meaningful improvement (say, 2%), you can be confident the change is worth shipping. If the interval includes values below your threshold, the true effect might be too small to matter even though it is statistically detectable.
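The comparison against a minimum meaningful improvement can be written as a simple rule. This is a sketch under assumptions: the function name `assess` and the three labels are hypothetical, and the third outcome (the whole interval below the threshold) is a natural extension of the two cases described above.

```python
def assess(ci_low, ci_high, min_meaningful):
    """Compare a confidence interval with a minimum meaningful improvement.

    All three arguments must be in the same unit (for example, percent lift).
    """
    if ci_low > min_meaningful:
        # The entire interval clears the threshold
        return "meaningful"
    if ci_high < min_meaningful:
        # Even the optimistic end of the interval falls short
        return "too small"
    # The interval straddles the threshold; the data cannot settle it yet
    return "uncertain"

print(assess(3.0, 5.0, min_meaningful=2.0))  # meaningful
print(assess(0.5, 1.5, min_meaningful=2.0))  # too small
print(assess(1.0, 6.0, min_meaningful=2.0))  # uncertain
```

An "uncertain" result usually calls for more data or another iteration rather than an immediate ship or kill decision.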

How to Calculate

A confidence interval is calculated as the observed effect plus or minus a margin of error. The margin of error equals the critical value (1.96 for 95% confidence) multiplied by the standard error of the estimate. For conversion rate experiments, the standard error depends on the observed proportions and sample sizes of both groups.
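The calculation above can be sketched in a few lines for a conversion rate experiment. This example uses the normal-approximation (Wald) interval for the difference between two proportions, which is one common choice and not necessarily what any particular tool reports. The function name and the sample counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def diff_in_proportions_ci(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Wald confidence interval for (rate of B) minus (rate of A)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    effect = p_b - p_a                                   # observed effect
    # Standard error from the observed proportions and both sample sizes
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    margin = z * se                                      # margin of error
    return effect - margin, effect + margin

# Hypothetical data: control 500/10,000 conversions, variant 580/10,000
low, high = diff_in_proportions_ci(500, 10_000, 580, 10_000)
print(f"95% CI for the lift: {low:.4f} to {high:.4f}")
```

With these made-up counts the interval runs from roughly 0.17 to 1.43 percentage points, so it excludes zero. Passing `confidence=0.90` or `confidence=0.99` changes the critical value to about 1.645 or 2.576 and narrows or widens the interval accordingly.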

Industry Applications

E-commerce

A home furnishing retailer tests a new product recommendation widget. The result shows an average revenue lift of $2.50 per session with a 95% confidence interval of [$1.80, $3.20]. Since the entire interval is positive and above their $1.00 minimum threshold, they roll out with high confidence.

SaaS

A productivity app tests a new trial onboarding flow. The conversion lift is 4% with a 95% CI of [-1%, 9%]. Despite the positive point estimate, the interval includes zero and negative values, so the team decides to iterate on the variant rather than ship it.

How to Track in KISSmetrics

KISSmetrics experiment reports include confidence intervals alongside point estimates and p-values. When evaluating test results, focus on the confidence interval rather than just the point estimate. A result with a narrow confidence interval entirely above zero gives you much more certainty than a large point estimate with a wide interval that overlaps zero.

Common Mistakes

  • Ignoring confidence intervals and focusing only on the point estimate of the effect
  • Misinterpreting "95% confidence interval" as "95% probability the true value is in this range." The 95% describes how often the procedure captures the true value across repeated experiments, not this specific interval
  • Not considering whether the confidence interval is practically meaningful, even when it excludes zero
  • Using the default confidence level (95%) without considering whether 90% or 99% is more appropriate for the decision at hand

Pro Tips

  • Report confidence intervals in executive summaries instead of just point estimates to convey the uncertainty in your results
  • If your confidence interval is too wide, increase sample size rather than lowering your confidence level
  • Compare the confidence interval against your minimum detectable effect to determine if the test was adequately powered
  • Use confidence intervals to set expectations: "we expect this change to improve conversion by 3% to 7%"
