Minimum Detectable Effect
The smallest difference between control and variant that a test is designed to reliably detect, given its sample size, significance level, and desired statistical power.
Also known as: MDE, minimum effect size, sensitivity
Why It Matters
The MDE is the bridge between experiment design and business value. It answers: "Given my traffic volume and test duration, what is the smallest real improvement this test can find?" If the MDE is 10% but you expect a 3% improvement, your test will almost certainly return "not significant" even if the improvement is real.
Setting the right MDE requires balancing ambition with reality. A lower MDE (detecting smaller effects) requires quadratically more traffic: halving the MDE roughly quadruples the required sample size. For a 5% baseline conversion rate, detecting a 1% relative lift needs about 100x more traffic than detecting a 10% relative lift.
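That quadratic relationship can be sanity-checked by inverting the standard two-proportion approximation for the required per-group sample size. A minimal sketch in Python; the function name is illustrative, and real calculators may use slightly different corrections:

```python
from statistics import NormalDist

def sample_size_per_group(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-proportion z-test.

    Inverts MDE = (Z_(alpha/2) + Z_beta) * sqrt(2 * p * (1-p) / n) for n.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    delta = baseline * relative_mde  # absolute lift to detect
    return z ** 2 * 2 * baseline * (1 - baseline) / delta ** 2

# 5% baseline: a 1% relative lift needs 100x the traffic of a 10% relative
# lift, because required sample size scales with 1 / MDE^2.
ratio = sample_size_per_group(0.05, 0.01) / sample_size_per_group(0.05, 0.10)
print(round(ratio))  # -> 100
```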
The MDE should be driven by business relevance, not arbitrary convention. Calculate the lift needed to justify the cost of the change: if a variant requires a week of engineering to build and maintain, what revenue lift makes that investment positive? That is your minimum meaningful effect, and your MDE should be at or below it.
How to Calculate
MDE is determined by your sample size, baseline conversion rate, significance level (alpha), and power (1 - beta). For a per-group sample size n and baseline rate p, a common approximation is MDE = (Z_(alpha/2) + Z_beta) * sqrt(2 * p * (1-p) / n). Most teams use online calculators that accept sample size and output MDE, or accept MDE and output required sample size.
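The formula translates directly into code. A minimal sketch using Python's standard-library `statistics.NormalDist` for the z-values (the `mde` helper name is illustrative):

```python
import math
from statistics import NormalDist

def mde(baseline, n_per_group, alpha=0.05, power=0.80):
    """Absolute MDE = (Z_(alpha/2) + Z_beta) * sqrt(2 * p * (1-p) / n)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 at alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 at 80% power
    return (z_alpha + z_beta) * math.sqrt(2 * baseline * (1 - baseline) / n_per_group)

# 5% baseline, 20,000 visitors per group:
absolute = mde(0.05, 20_000)
print(f"absolute: {absolute:.4f}  relative: {absolute / 0.05:.1%}")
# -> absolute: 0.0061  relative: 12.2%
```

Dividing the absolute MDE by the baseline rate gives the relative lift figure most planning discussions use.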
Industry Applications
A niche retailer with 10,000 monthly visitors and a 2% conversion rate calculates their MDE at 30% relative lift for a 4-week test. This means they can only detect large improvements, so they focus on radical redesigns rather than incremental tweaks.
Benchmark: High-traffic sites can detect 2-5% MDE; low-traffic sites may need 20%+ MDE
A product-led growth company with 3,000 weekly signups calculates they can detect a 7% relative lift in activation within 2 weeks. They use this MDE to filter their experiment backlog, only running tests expected to produce at least a 7% improvement.
How to Track in KISSmetrics
Before launching any experiment in KISSmetrics, calculate your MDE based on available traffic. If the MDE is larger than the effect you expect, either increase the test duration, increase traffic to the test, or test a bolder variant. KISSmetrics traffic data helps you estimate available sample sizes for accurate MDE calculations.
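One practical way to make that call is to invert the question: given your weekly traffic, how many weeks until a test can detect the lift you expect? A rough sketch, assuming a 50/50 split and the two-proportion approximation from the calculation section (the `weeks_needed` helper is illustrative, not a KISSmetrics feature):

```python
import math
from statistics import NormalDist

def weeks_needed(weekly_visitors, baseline, relative_mde, alpha=0.05, power=0.80):
    """Weeks of a 50/50 split test needed to detect the given relative lift."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    delta = baseline * relative_mde                 # absolute lift to detect
    n_per_group = z ** 2 * 2 * baseline * (1 - baseline) / delta ** 2
    return math.ceil(2 * n_per_group / weekly_visitors)

# 10,000 weekly visitors, 3% baseline conversion, hoping to detect a 10% lift:
print(weeks_needed(10_000, 0.03, 0.10))  # -> 11
```

If the answer is longer than you are willing to run the test, the options are the same as above: more traffic, a longer duration, or a bolder variant.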
Common Mistakes
- Not calculating MDE before launching, then being surprised when the test is inconclusive
- Setting the MDE unrealistically low for the available traffic, resulting in tests that run for months
- Confusing MDE with the expected effect - MDE is a design parameter; the expected effect is a prediction
- Ignoring that, for a given relative lift, very low baseline rates inflate the MDE (rare conversion events need far more traffic)
Pro Tips
- Create an MDE reference table for your key pages: "With our homepage traffic, a 2-week test can detect a 5% relative lift in signups"
- If your MDE is too large for subtle tests, use composite metrics (like revenue per visitor) that capture multiple improvements simultaneously
- Recalculate MDEs when traffic patterns change (seasonal shifts, marketing spend changes) to keep test planning accurate
- Use MDE to prioritize your testing roadmap - test bold ideas on low-traffic pages and subtle refinements on high-traffic ones
Related Terms
Effect Size
A quantitative measure of the magnitude of a difference between groups in an experiment, independent of sample size. It answers the question "how big is the improvement?" rather than "is there an improvement?"
Statistical Power
The probability that a test will correctly detect a real effect when one exists, typically set at 80% as a minimum standard. Higher power means a lower chance of missing genuine improvements.
Type II Error
A false negative in hypothesis testing - failing to reject the null hypothesis and concluding that a change had no effect when it actually did produce a real improvement.
Confidence Level
The percentage probability that a confidence interval calculated from a given experiment will contain the true population parameter, commonly set at 90%, 95%, or 99% in A/B testing.
Hypothesis Testing
A statistical method used to determine whether observed differences in data - such as a higher conversion rate in a test variant - are likely real or could have occurred by random chance.
See Minimum Detectable Effect in action
KISSmetrics tracks every user across sessions and devices so you can measure what matters. Start free - no credit card required.