P-Value

The probability of observing a result at least as extreme as the one measured, assuming the null hypothesis is true. A small p-value (typically below 0.05) suggests the observed difference would be unlikely if chance alone were at work.

Also known as: probability value, significance level result

Formula

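Z = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2))

Here p1 and p2 are the observed conversion rates of the two groups, n1 and n2 are their sample sizes, and p_pool is the combined conversion rate across both groups (total conversions divided by total visitors). The two-sided p-value is the probability that a standard normal variable is at least as far from zero as Z.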

Why It Matters

The p-value is the most commonly used (and misunderstood) statistic in experimentation. It answers a specific question: "If there were truly no difference between control and variant, how likely would I be to see a result at least this extreme?" A p-value of 0.03 means that, if there were no real difference, random variation alone would produce a result this extreme about 3% of the time. It does not mean there is a 3% chance that the result is a fluke.

P-values help you make go/no-go decisions about test results. The conventional threshold of 0.05 (5%) means that, when there is truly no difference, you accept a 1-in-20 chance of declaring a winner anyway (a false positive). For high-stakes decisions, you might use 0.01 (1%). For exploratory tests, 0.10 (10%) might be acceptable.

However, p-values do not tell you the magnitude of the effect, the probability that your hypothesis is true, or whether the result is practically meaningful. A p-value of 0.001 for a 0.01% conversion rate improvement is statistically significant but practically worthless. Always pair p-values with effect size and confidence intervals.
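As a rough illustration of pairing a p-value with an effect size and a confidence interval, the sketch below computes the absolute difference, the relative lift, and an approximate 95% interval for the difference between two conversion rates. The function name, the visitor counts, and the choice of a simple unpooled normal-approximation (Wald) interval are assumptions made for this example, not something the platform is known to use.

```python
import math

def diff_confidence_interval(x1, n1, x2, n2, z_crit=1.96):
    """Return (difference, lower, upper) for p1 - p2.

    Uses an unpooled normal-approximation (Wald) interval.
    z_crit=1.96 corresponds to roughly 95% confidence.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z_crit * se, diff + z_crit * se

# Hypothetical counts, made up for illustration:
# variant: 260 conversions out of 5000; control: 215 out of 5000
diff, lower, upper = diff_confidence_interval(260, 5000, 215, 5000)
relative_lift = diff / (215 / 5000)

print(f"absolute difference: {diff:.4f}")
print(f"relative lift:       {relative_lift:.1%}")
print(f"approx. 95% CI:      [{lower:.4f}, {upper:.4f}]")
```

An interval that sits barely above zero tells a different story from one that sits well above it, even when both produce a p-value under 0.05.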

How to Calculate

P-values are calculated from the test statistic (like a z-score or t-statistic) based on the observed difference between groups, the sample sizes, and the variance within each group. Most experimentation platforms, including KISSmetrics, calculate p-values automatically. The calculation depends on the type of test: for conversion rate comparisons, a chi-squared or z-test for proportions is typically used.
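For a conversion rate comparison, the z-test for proportions can be sketched in a few lines of Python using only the standard library. The function name and the visitor counts below are hypothetical, and this is a minimal illustration of the pooled two-proportion z-test rather than the exact calculation any particular platform performs.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for a pooled two-proportion z-test.

    x1, x2: conversions in each group; n1, n2: visitors in each group.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value: P(|Z| >= |z|) under the standard normal.
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, made up for illustration
z, p = two_proportion_z_test(x1=260, n1=5000, x2=215, n2=5000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

The normal approximation behind this test is reasonable when each group has a healthy number of conversions and non-conversions; for very small counts an exact test is usually a safer choice.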

Industry Applications

E-commerce

A DTC brand tests two pricing strategies. The test shows p = 0.02 for revenue per visitor, which is reasonable evidence that the new pricing outperforms the old. Combined with a 12% lift and a confidence interval that excludes zero, they roll out the change with confidence.

SaaS

A B2B tool tests three different onboarding emails. After applying Bonferroni correction for multiple comparisons (threshold of 0.017 instead of 0.05), only one variant achieves significance with p = 0.008, avoiding a false positive from testing multiple variants.
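The Bonferroni adjustment in this scenario amounts to dividing the significance threshold by the number of comparisons. A minimal sketch follows; the variant names are hypothetical, and apart from the 0.008 quoted above, the p-values are made up for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted threshold, {name: is_significant}).

    p_values: dict mapping a comparison name to its p-value.
    The adjusted threshold is alpha divided by the number of comparisons.
    """
    threshold = alpha / len(p_values)
    decisions = {name: p < threshold for name, p in p_values.items()}
    return threshold, decisions

# Hypothetical results for three onboarding emails, each compared to control
results = {"email_a": 0.008, "email_b": 0.04, "email_c": 0.30}
threshold, decisions = bonferroni(results)

print(f"adjusted threshold: {threshold:.4f}")
for name, significant in decisions.items():
    print(name, "significant" if significant else "not significant")
```

In this made-up data, the second email would have passed an unadjusted 0.05 threshold but does not pass the corrected one, which is exactly the kind of false positive the correction guards against.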

How to Track in KISSmetrics

KISSmetrics displays p-values (or equivalent significance indicators) in experiment results. Monitor the p-value as your experiment accumulates data, but resist the urge to stop the test the moment it crosses 0.05. Pre-commit to a sample size and run the test to completion for valid results.
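To see why stopping the moment the p-value crosses 0.05 is risky, the following simulation runs A/A tests (both arms share the same true conversion rate) and compares two stopping rules: checking at several interim looks versus checking only once at the pre-committed sample size. All parameters here (number of tests, sample size, number of looks, conversion rate, seed) are arbitrary assumptions chosen to keep the sketch fast.

```python
import math
import random

def p_value(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_pool = (x1 + x2) / (n1 + n2)
    if p_pool in (0.0, 1.0):
        return 1.0  # no variation observed, so no evidence of a difference
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))

def simulate(n_tests=400, n_per_arm=1000, n_looks=10,
             rate=0.10, alpha=0.05, seed=1):
    """Return (false positive rate with peeking, rate with one final look)."""
    rng = random.Random(seed)
    look_every = n_per_arm // n_looks
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(n_tests):
        a = b = 0
        crossed = False
        for i in range(1, n_per_arm + 1):
            # Both arms convert at the same true rate: any "win" is false.
            a += rng.random() < rate
            b += rng.random() < rate
            if i % look_every == 0 and p_value(a, i, b, i) < alpha:
                crossed = True  # a peeker would have stopped here
        peeking_hits += crossed
        fixed_hits += p_value(a, n_per_arm, b, n_per_arm) < alpha
    return peeking_hits / n_tests, fixed_hits / n_tests

peeking_rate, fixed_rate = simulate()
print(f"false positives when peeking:  {peeking_rate:.1%}")
print(f"false positives at fixed size: {fixed_rate:.1%}")
```

Because the final look is one of the interim looks, the peeking rate can never be lower than the fixed-size rate, and in runs like this it is typically well above the nominal 5%.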

Common Mistakes

  • Interpreting the p-value as the probability that the variant is better - it is not; it is the probability of seeing data at least as extreme as what was observed, given no real difference
  • Peeking at p-values during a test and stopping early when they cross the threshold, which inflates false positive rates
  • Treating p = 0.05 as a hard boundary where 0.049 means "significant" and 0.051 means "not significant" - the difference between the two is trivial
  • Ignoring effect size and focusing solely on whether the p-value crossed the threshold
  • Running multiple comparisons without adjusting the p-value threshold (Bonferroni correction or similar)

Pro Tips

  • Always report p-values alongside confidence intervals and effect sizes for a complete picture
  • Use sequential testing methods if you need to monitor results during the test without inflating false positive rates
  • Adjust your significance threshold based on the cost of a false positive - higher stakes warrant lower p-value thresholds
  • Remember that a high p-value does not prove no effect; it may indicate an underpowered test
  • Consider using Bayesian methods for a more intuitive interpretation of experiment results
