Sample Size

Sample size is the number of users or observations included in each variant of an experiment, determining the statistical power of the test and how confidently you can detect real differences between variants.

Also known as: test population, experiment size, n-size

Why It Matters

Sample size determines whether your experiment can actually answer the question you are asking. Too small a sample and you cannot distinguish real effects from random noise - you might conclude that a change had no impact when it actually improved conversion by 5%, or worse, declare a winner based on random fluctuation.

The required sample size depends on four factors: your baseline conversion rate, the minimum detectable effect (the smallest improvement worth detecting), your desired confidence level, and the statistical power you want. A test trying to detect a 1% relative improvement needs dramatically more traffic than one looking for a 10% improvement, because the required sample grows roughly with the inverse square of the effect size. This is why pre-test power analysis is essential - it prevents you from wasting time on tests that cannot produce reliable results.

Insufficient sample sizes are one of the most common experimentation mistakes. Teams run tests for a week, see a promising result, and ship the change - only to discover weeks later that the improvement disappears. This is not bad luck; it is a predictable consequence of making decisions with insufficient data.

How to Calculate

Sample size is calculated using a power analysis formula that considers four inputs: baseline conversion rate, minimum detectable effect, significance level (typically 5%, which corresponds to 95% confidence), and statistical power (typically 80%). For example, with a 5% baseline conversion rate and a desire to detect a 10% relative improvement (to 5.5%), you need roughly 31,000 visitors per variant. Online sample size calculators make this straightforward.
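As a rough illustration of what such a calculator does, the sketch below uses the common normal-approximation formula for a two-sided test comparing two conversion rates. The function name and parameters are illustrative, not taken from any particular tool, and real calculators may use slightly different formulas and return somewhat different numbers.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant (two-sided two-proportion z-test).

    baseline     -- control conversion rate, e.g. 0.05 for 5%
    relative_mde -- smallest relative lift worth detecting, e.g. 0.10 for +10%
    alpha        -- significance level (0.05 corresponds to 95% confidence)
    power        -- chance of detecting a true effect of that size
    """
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of the two binomial variances
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


# The 5% baseline, 10% relative improvement example
print(sample_size_per_variant(0.05, 0.10))

# How the requirement changes with the minimum detectable effect
for mde in (0.01, 0.05, 0.10, 0.20):
    print(f"{mde:.0%} relative lift: {sample_size_per_variant(0.05, mde):,} per variant")
```

Under these assumptions the 5% baseline example comes out a little over 31,000 visitors per variant, while a 1% relative lift on the same baseline needs close to 3 million, on the order of a hundred times more.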

Industry Applications

E-commerce

An e-commerce site with a 3.5% conversion rate wants to detect a 5% relative improvement (to 3.675%). Power analysis at 95% confidence and 80% power shows they need roughly 177,000 visitors per variant. At 15,000 daily visitors to the tested page, split across two variants, the test will take about 24 days - longer than a quick test, but still feasible and worthwhile.

SaaS

A SaaS trial signup page converts at 8%. The team wants to detect a 15% relative improvement (to 9.2%). Sample size calculation at 95% confidence and 80% power shows they need roughly 8,600 visitors per variant, achievable in 10 days with their traffic volume. They run the test for 14 days to capture a full two-week cycle.

How to Track in KISSmetrics

Calculate your required sample size before any test begins. Monitor daily traffic to the tested page in KISSmetrics to estimate how long the test will need to run. If the required runtime exceeds your patience, either accept a higher minimum detectable effect or find a higher-traffic page to test on.
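The runtime estimate described above is simple arithmetic, sketched here in Python. The function, its parameters, and the traffic figures are hypothetical; `eligible_share` is an assumed knob for cases where only part of the page's traffic actually enters the test, and the sketch assumes traffic is split evenly across variants.

```python
import math


def estimated_test_days(n_per_variant, daily_visitors, variants=2,
                        eligible_share=1.0, whole_weeks=False):
    """Days needed for every variant to reach its required sample size."""
    if n_per_variant <= 0 or daily_visitors <= 0 or variants < 2:
        raise ValueError("sample size and traffic must be positive, with at least 2 variants")
    if not 0 < eligible_share <= 1:
        raise ValueError("eligible_share must be in (0, 1]")
    daily_in_test = daily_visitors * eligible_share   # visitors who actually enter the test
    days = math.ceil(n_per_variant * variants / daily_in_test)
    if whole_weeks:
        days = math.ceil(days / 7) * 7                # round up to full weekly cycles
    return days


# Illustrative numbers: 30,000 per variant, 4,000 daily visitors to the tested page
print(estimated_test_days(30_000, 4_000))                      # 15
print(estimated_test_days(30_000, 4_000, eligible_share=0.5))  # 30
print(estimated_test_days(30_000, 4_000, whole_weeks=True))    # 21
```

Rounding up to whole weeks is optional, but it keeps weekday and weekend behavior evenly represented in the sample.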

Common Mistakes

  • Not calculating required sample size before starting a test, leading to underpowered experiments with unreliable results.
  • Using overall site traffic as the sample size estimate when the test only affects a subset of visitors.
  • Reducing required sample size by lowering the confidence level below 90%, which dramatically increases false positive risk.
  • Assuming equal sample sizes per variant - unequal splits can make sense when you want to limit exposure to a risky variant, but they increase the total sample required.

Pro Tips

  • Use an online sample size calculator before every test to determine whether the test is feasible given your traffic volume.
  • If required sample sizes are too large, focus on testing bigger changes that produce larger effects, which are detectable with smaller samples.
  • Consider using a 90/10 or 80/20 split (giving more traffic to control) when testing risky changes, though this increases required total sample size.
  • Factor in test duration - a test that needs 60 days of traffic introduces seasonal and temporal confounds that a 14-day test avoids.
  • When testing on low-traffic pages, consider testing a broader metric (like overall site conversion) rather than page-specific metrics that need larger samples.
