Balance
balance
Covariate balance checking and standardized mean differences (SMD).
This module provides diagnostics that evaluate whether the control and treatment groups are balanced across key pre-experiment covariates (demographics, platform, historical engagement), so that confounding or pre-existing selection bias can be detected before it skews treatment estimates.
| FUNCTION | DESCRIPTION |
|---|---|
| `check_covariate_balance` | Computes Normalized Differences and t-tests to evaluate balance of pre-period covariates. |
check_covariate_balance
Computes Normalized Differences and t-tests to evaluate balance of pre-period covariates.
Verifies that pre-period characteristics are similarly distributed across treatment arms. Simple t-tests can be used, but they are overly sensitive in online datasets: with very large sample sizes, tiny, practically negligible differences yield statistically significant p-values (\(p < 0.05\)). Therefore, we compute Standardized Mean Differences (SMD) as the primary effect size metric.
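The sensitivity of t-tests at scale can be illustrated with a small sketch. It is not the module's implementation: the summary statistics are hypothetical, the helper names `welch_t_and_p` and `smd` are invented for illustration, and the p-value uses a normal approximation that is only reasonable for large samples.

```python
import math

def welch_t_and_p(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Welch t-statistic with a two-sided p-value from a normal approximation."""
    se = math.sqrt(sd_t**2 / n_t + sd_c**2 / n_c)
    t = (mean_t - mean_c) / se
    # For very large n the t distribution is close to the standard normal.
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p

def smd(mean_t, mean_c, sd_t, sd_c):
    """Difference in means divided by the root of the average variance."""
    return (mean_t - mean_c) / math.sqrt((sd_t**2 + sd_c**2) / 2)

# Hypothetical summary statistics: a 0.02 gap in means, one million units per arm.
mean_t, mean_c, sd, n = 10.02, 10.00, 1.0, 1_000_000

t_stat, p_value = welch_t_and_p(mean_t, mean_c, sd, sd, n, n)
effect = smd(mean_t, mean_c, sd, sd)

print(f"t = {t_stat:.2f}, p = {p_value:.3g}, SMD = {effect:.3f}")
```

Under these assumed numbers the p-value falls far below 0.05 while the SMD stays at about 0.02, well under the 0.1 rule of thumb that is often used to flag imbalance. That threshold is a common convention, not necessarily the one this module applies.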
Mathematical Representation
- Standardized Mean Difference (SMD) for continuous covariates: Let \(\bar{X}_T\) and \(\bar{X}_C\) be the sample means of a covariate \(X\) in the treatment and control groups, and let \(s_T^2\) and \(s_C^2\) be their sample variances. $$ \text{SMD} = \frac{\bar{X}_T - \bar{X}_C}{\sqrt{\frac{s_T^2 + s_C^2}{2}}} $$
- Pearson Chi-Square Test for Independence for categorical covariates: Evaluates whether the proportion of units in each category is independent of treatment.
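The two quantities above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the code in `check_covariate_balance`: the function names and toy data are hypothetical, the chi-square statistic is computed without a continuity correction, and no p-value is derived.

```python
import math
from collections import Counter
from statistics import mean, variance

def standardized_mean_difference(treated, control):
    """SMD using the sample variances of both arms, as in the formula above."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

def chi_square_statistic(groups, categories):
    """Pearson chi-square statistic and degrees of freedom for a group-by-category table."""
    observed = Counter(zip(groups, categories))
    group_totals = Counter(groups)
    category_totals = Counter(categories)
    n = len(groups)
    stat = 0.0
    for g, g_total in group_totals.items():
        for c, c_total in category_totals.items():
            # Expected count under independence of treatment and category.
            expected = g_total * c_total / n
            stat += (observed[(g, c)] - expected) ** 2 / expected
    dof = (len(group_totals) - 1) * (len(category_totals) - 1)
    return stat, dof

# Hypothetical continuous covariate (e.g. pre-period sessions per unit).
treated_sessions = [4.0, 6.0, 8.0, 10.0]
control_sessions = [3.0, 5.0, 7.0, 9.0]
smd_value = standardized_mean_difference(treated_sessions, control_sessions)

# Hypothetical categorical covariate (platform) with arm labels.
arms = ["T", "T", "T", "T", "C", "C", "C", "C"]
platforms = ["ios", "ios", "android", "android", "ios", "android", "android", "android"]
chi2, dof = chi_square_statistic(arms, platforms)

print(f"SMD = {smd_value:.4f}, chi2 = {chi2:.4f}, dof = {dof}")
```

Because the SMD is scaled by the covariate's spread rather than by the sample size, it stays comparable across experiments of different sizes, which is why it suits the role of primary balance metric.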
| PARAMETER | DESCRIPTION |
|---|---|
| `df` | The experimental dataset containing units, treatment assignments, and covariates. |
| `treatment_col` | Column name identifying experimental groups/arms. |
| `covariate_cols` | List of column names representing categorical or continuous pre-experiment covariates. |
| RETURNS | DESCRIPTION |
|---|---|
| `dict` | A dictionary mapping each covariate name to a diagnostic sub-dictionary containing SMD, p-values, and balance classification tags. |
Source code in src\xpyrment\validate\balance.py