Power
power
Power analysis and sample-size planning calculators.
This module provides power calculators for experimental design. They help experimenters determine the minimum sample size per variant needed to detect a target Minimum Detectable Effect (MDE) at specified Type I and Type II error rates (\(\alpha, \beta\)). The module also applies CUPED variance-reduction credit (sample-size deflation) and estimates experiment runtime.
Mathematical Specifications
The required sample size per variant \(n\) for a two-sample t-test is given by:

$$ n = \frac{2 \sigma^2 \left(Z_{1 - \alpha/2} + Z_{1 - \beta}\right)^2}{\delta^2} $$

where:

- \(\sigma^2\): Population variance. For binary proportions (\(p\)), \(\sigma^2 = p(1 - p)\).
- \(Z_{1 - \alpha/2}\): Standard normal critical value for a two-sided test at significance level \(\alpha\).
- \(Z_{1 - \beta}\): Standard normal quantile corresponding to the desired statistical power (\(1 - \beta\)).
- \(\delta\): The target absolute Minimum Detectable Effect (MDE).
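The formula can be checked by hand with a few lines of standard-library Python. This is a minimal sketch, not the module's implementation: the helper name `sample_size_per_variant` is hypothetical, and rounding up to the next whole unit is an assumption.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(sigma2: float, delta: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-variant n from the normal-approximation formula (hypothetical helper)."""
    if sigma2 <= 0 or delta == 0:
        raise ValueError("sigma2 must be positive and delta non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{1 - alpha/2}
    z_beta = NormalDist().inv_cdf(power)           # Z_{1 - beta}
    n = 2 * sigma2 * (z_alpha + z_beta) ** 2 / delta ** 2
    return math.ceil(n)  # assumption: round up to a whole unit


# Proportion metric: baseline p = 0.10 and a 5% relative MDE,
# so the absolute delta is 0.05 * 0.10 = 0.005.
p = 0.10
n = sample_size_per_variant(sigma2=p * (1 - p), delta=0.05 * p)
print(n)
```

With these inputs the result is on the order of 56,500 units per variant, which shows how quickly small effects on a low baseline rate drive up sample size.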
If pre-period baseline covariates are available, the CUPED variance-adjusted sample size is:

$$ n_{\text{CUPED}} = n \left(1 - \rho^2\right) $$

where \(\rho\) is the correlation between pre-period and experiment-period values.
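As a quick illustration of the deflation factor, the sketch below applies \(1 - \rho^2\) to an unadjusted sample size. The function name `cuped_sample_size` is hypothetical, and the rounding behaviour is an assumption rather than something the module documents.

```python
import math


def cuped_sample_size(n: int, rho: float) -> int:
    """Deflate an unadjusted per-variant n by (1 - rho^2) (hypothetical helper)."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return math.ceil(n * (1.0 - rho ** 2))  # assumption: round up


# A pre/post correlation of 0.5 removes 25% of the required sample.
print(cuped_sample_size(1000, 0.5))  # 750
```

Because the saving grows with \(\rho^2\), weakly correlated covariates help little: \(\rho = 0.2\) trims only 4% of the sample.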
| CLASS | DESCRIPTION |
|---|---|
| `ExperimentDesignResult` | Class to hold, format, and present experiment design and statistical power analysis results. |
| FUNCTION | DESCRIPTION |
|---|---|
| `design_experiment` | Computes the required sample size and duration for an experiment based on design constraints. |
| `generate_power_curve_data` | Generates sample size coordinates across a range of relative MDE values. |
ExperimentDesignResult
Class to hold, format, and present experiment design and statistical power analysis results.
Provides text representations and structured summaries of design outputs to help experimenters evaluate sizing requirements, potential CUPED savings, and estimated runtimes.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| `details` | Dictionary containing raw parameter values and sizing outputs from the power analysis engine. |
| PARAMETER | DESCRIPTION |
|---|---|
| `details` | Dictionary of design details from |
| METHOD | DESCRIPTION |
|---|---|
| `summary` | Compiles a structured summary of the experiment design parameters. |
| `__repr__` | Generates a formatted text-block summary of the experiment design parameters. |
Source code in src\xpyrment\plan\power.py
summary
Compiles a structured summary of the experiment design parameters.
Formats raw numbers into clear, readable text elements (e.g., currency, percentages, and human-readable sample sizes with digit grouping).
| RETURNS | DESCRIPTION |
|---|---|
| `Dict[str, list]` | A dictionary mapping parameter names to formatted value strings. |
__repr__
Generates a formatted text-block summary of the experiment design parameters.
design_experiment
design_experiment(
metric_type: str,
baseline_value: float,
standard_deviation: Optional[float] = None,
mde: float = 0.05,
mde_type: str = "relative",
alpha: float = 0.05,
power: float = 0.8,
pre_post_correlation: Optional[float] = None,
daily_traffic: Optional[int] = None,
) -> ExperimentDesignResult
Computes the required sample size and duration for an experiment based on design constraints.
This function performs rigorous a priori power analysis to determine required sample sizes. It supports continuous means, proportions, and ratio metrics, integrates pre-post correlation for CUPED calculation, and maps sizes to daily traffic to compute duration.
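The mapping from sample size to runtime is not spelled out here, so the following is only a sketch under stated assumptions: two equally sized variants, all of `daily_traffic` entering the experiment, and rounding up to whole days. The helper name `estimated_duration_days` and its `n_variants` parameter are hypothetical.

```python
import math


def estimated_duration_days(n_per_variant: int, daily_traffic: int,
                            n_variants: int = 2) -> int:
    """Days needed to fill every variant (hypothetical helper, even split assumed)."""
    if daily_traffic <= 0:
        raise ValueError("daily_traffic must be positive")
    total_units = n_per_variant * n_variants  # units needed across all variants
    return math.ceil(total_units / daily_traffic)  # assumption: whole days


# 56,512 units per variant with 10,000 new units per day.
print(estimated_duration_days(56_512, 10_000))  # 12
```

The library's own duration logic may differ, for example by rounding to full weeks or by accounting for a traffic allocation below 100%.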
| PARAMETER | DESCRIPTION |
|---|---|
| `metric_type` | The statistical distribution type. Options are 'mean', 'proportion', or 'ratio'. TYPE: `str` |
| `baseline_value` | The historical control group value (mean or rate). TYPE: `float` |
| `standard_deviation` | The historical standard deviation of the metric. Required for mean and ratio metrics. Defaults to None. TYPE: `Optional[float]` |
| `mde` | The target Minimum Detectable Effect. Expressed as a fraction of baseline when `mde_type` is "relative". Defaults to 0.05. TYPE: `float` |
| `mde_type` | Dictates how `mde` is interpreted. Defaults to "relative". TYPE: `str` |
| `alpha` | The probability of a Type I error (significance level, e.g., 0.05 for 95% confidence). Defaults to 0.05. TYPE: `float` |
| `power` | The desired statistical power (\(1 - \beta\), e.g., 0.80 to capture true effects 80% of the time). Defaults to 0.80. TYPE: `float` |
| `pre_post_correlation` | The correlation coefficient (\(\rho\)) between baseline pre-period values and active experiment-period values. If provided, calculates CUPED-deflated sizing. Defaults to None. TYPE: `Optional[float]` |
| `daily_traffic` | Expected daily volume of unique units entering the experiment. If provided, calculates duration. Defaults to None. TYPE: `Optional[int]` |
| RETURNS | DESCRIPTION |
|---|---|
| `ExperimentDesignResult` | A wrapper object containing formatted parameters and sample sizing calculations. |
| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | If statistical inputs are out of logical bounds (e.g., negative traffic, proportion baseline not in \((0, 1)\), or correlation not in \([-1, 1]\)). |
| `ValueError` | If standard deviation is missing for mean/ratio metrics. |
Examples:
Example
generate_power_curve_data
generate_power_curve_data(
metric_type: str,
baseline_value: float,
standard_deviation: Optional[float] = None,
alpha: float = 0.05,
power: float = 0.8,
mde_range: Optional[ndarray] = None,
pre_post_correlation: Optional[float] = None,
) -> Dict[str, ndarray]
Generates sample size coordinates across a range of relative MDE values.
This function calculates the required sample size at each point in a range of candidate MDEs, so that downstream reporting tools can plot an interactive or static "power curve" (sample size vs. effect size).
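To make the shape of such a curve concrete, here is a standard-library sketch that evaluates the two-sample formula on the default grid documented for `mde_range` (50 evenly spaced relative MDEs in \([0.01, 0.15]\)). The function name `power_curve` and the output keys `"mde"` and `"sample_size"` are hypothetical; the real function returns NumPy arrays under its own key names.

```python
import math
from statistics import NormalDist


def power_curve(baseline: float, sigma: float, alpha: float = 0.05,
                power: float = 0.80, points: int = 50,
                lo: float = 0.01, hi: float = 0.15) -> dict:
    """Per-variant sample size at each relative MDE (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    step = (hi - lo) / (points - 1)
    mde = [lo + i * step for i in range(points)]  # evenly spaced relative MDEs
    sizes = [
        # relative MDE m converts to an absolute delta of m * baseline
        math.ceil(2 * sigma ** 2 * z ** 2 / (m * baseline) ** 2)
        for m in mde
    ]
    return {"mde": mde, "sample_size": sizes}


# Continuous metric: baseline mean 20.0 with standard deviation 8.0.
curve = power_curve(baseline=20.0, sigma=8.0)
print(curve["sample_size"][0], curve["sample_size"][-1])
```

Sample size falls with the square of the MDE, so the curve is steep at the small-effect end and flattens quickly as the detectable effect grows.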
| PARAMETER | DESCRIPTION |
|---|---|
| `metric_type` | Metric type ('mean', 'proportion', 'ratio'). TYPE: `str` |
| `baseline_value` | Historical control average value. TYPE: `float` |
| `standard_deviation` | Historical metric standard deviation. Required for continuous metrics. Defaults to None. TYPE: `Optional[float]` |
| `alpha` | Significance level. Defaults to 0.05. TYPE: `float` |
| `power` | Desired statistical power. Defaults to 0.80. TYPE: `float` |
| `mde_range` | Array of relative MDE points to evaluate. If not provided, evaluates 50 linearly spaced points in \([0.01, 0.15]\). Defaults to None. TYPE: `Optional[ndarray]` |
| `pre_post_correlation` | Pre-post correlation coefficient for CUPED-adjusted curve. Defaults to None. TYPE: `Optional[float]` |
| RETURNS | DESCRIPTION |
|---|---|
| `Dict[str, ndarray]` | Dictionary mapping coordinate names to NumPy arrays of results. Contains keys |