Effect Size

effect_size

Standardized effect size computation, focusing on scale-free difference metrics.

This module provides functions to calculate standardized difference statistics, such as Cohen's d, to evaluate the practical (rather than just statistical) magnitude of experimental impacts.

FUNCTION	DESCRIPTION
`compute_cohens_d`	Computes the standardized effect size between two groups using Cohen's d.

compute_cohens_d

compute_cohens_d(
    group_a: ndarray, group_b: ndarray
) -> float

Computes the standardized effect size between two groups using Cohen's d.

Cohen's d (Cohen, 1988) is a standardized, scale-free effect size measure representing the difference between two group means in terms of standard deviation units. While p-values measure the statistical evidence against a null hypothesis (and are heavily dependent on sample size), Cohen's d measures the practical magnitude of the treatment effect, making it comparable across entirely different metrics or experiments.
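The sample-size dependence mentioned above can be made concrete. For two groups of equal size n, the pooled-variance two-sample t statistic equals d · sqrt(n / 2), so a fixed effect size yields an ever larger test statistic (and therefore a smaller p-value) as n grows. The sketch below assumes equal group sizes; `t_statistic_from_d` is a hypothetical helper written for illustration and is not part of this module.

```python
import math


def t_statistic_from_d(d: float, n_per_group: int) -> float:
    """Pooled-variance two-sample t statistic implied by Cohen's d.

    Hypothetical helper; assumes both groups contain n_per_group observations.
    """
    return d * math.sqrt(n_per_group / 2)


# The same negligible effect (d = 0.1) at increasing sample sizes.
for n in (50, 500, 5000, 50000):
    print(f"n per group = {n:>6}: t = {t_statistic_from_d(0.1, n):.2f}")
```

The effect size stays at 0.1 throughout, while the t statistic rises from 0.50 to about 15.81, which is why a small p-value alone says little about practical magnitude.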

Mathematical Formulation

Let $N_A$, $N_B$ be the sample sizes, $\bar{X}_A$, $\bar{X}_B$ the sample means, and $s_A^2$, $s_B^2$ the unbiased sample variances of the two experimental groups (Control A and Treatment B, respectively).

The pooled sample standard deviation $s_{\text{pooled}}$ is defined as:

$$ s_{\text{pooled}} = \sqrt{\frac{(N_A - 1)s_A^2 + (N_B - 1)s_B^2}{N_A + N_B - 2}} $$

The Cohen's d statistic is computed as:

$$ d = \frac{\bar{X}_B - \bar{X}_A}{s_{\text{pooled}}} $$
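As a worked instance of the two formulas, the following sketch evaluates them step by step on a small made-up data set using only the Python standard library. The sample values are illustrative assumptions, not data from any experiment.

```python
import math
from statistics import mean, variance

# Illustrative (made-up) outcomes.
group_a = [2.0, 4.0, 6.0, 8.0]    # control, mean 5
group_b = [5.0, 7.0, 9.0, 11.0]   # treatment, mean 8

n_a, n_b = len(group_a), len(group_b)
# statistics.variance uses the n - 1 denominator, i.e. the unbiased estimate.
var_a, var_b = variance(group_a), variance(group_b)

pooled_std = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
d = (mean(group_b) - mean(group_a)) / pooled_std

print(f"pooled_std = {pooled_std:.4f}, d = {d:.4f}")  # pooled_std = 2.5820, d = 1.1619
```

Both groups have variance 20/3, so the pooled standard deviation is sqrt(20/3) ≈ 2.582 and the mean difference of 3 corresponds to d ≈ 1.16.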

Standard Classification Heuristics:

- $|d| < 0.2$: Negligible effect size.
- $0.2 \le |d| < 0.5$: Small effect size (e.g., most successful digital A/B tests).
- $0.5 \le |d| < 0.8$: Medium effect size.
- $|d| \ge 0.8$: Large effect size (indicates highly impactful, structural interventions).
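These thresholds translate directly into a lookup on the absolute value of d. The helper below is a minimal sketch; `classify_effect_size` is a hypothetical name and is not provided by this module.

```python
def classify_effect_size(d: float) -> str:
    """Map Cohen's d to the heuristic labels listed above (hypothetical helper)."""
    magnitude = abs(d)  # the sign only encodes direction, not size
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"


for value in (0.05, -0.3, 0.5, 1.2):
    print(value, classify_effect_size(value))
```

Because the labels depend on $|d|$, a harmful treatment with d = -0.3 is classified as "small" just like a beneficial one with d = 0.3.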

PARAMETER	DESCRIPTION
`group_a`	1D array of outcomes for Control (Group A). TYPE: `ndarray`
`group_b`	1D array of outcomes for Treatment (Group B). TYPE: `ndarray`

RETURNS	DESCRIPTION
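`float`	The calculated Cohen's d statistic, or 0.0 when the pooled standard deviation is not positive (for example, when both groups are constant). TYPE: `float`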

Source code in src\xpyrment\interpret\effect_size.py

import numpy as np


def compute_cohens_d(group_a: np.ndarray, group_b: np.ndarray) -> float:
    r"""Computes the standardized effect size between two groups using Cohen's d.

    Cohen's d (Cohen, 1988) is a standardized, scale-free effect size measure representing the difference
    between two group means in terms of standard deviation units. While p-values measure the statistical
    evidence against a null hypothesis (and are heavily dependent on sample size), Cohen's d measures the
    *practical magnitude* of the treatment effect, making it comparable across entirely different metrics or experiments.

    Mathematical Formulation:
        Let $N_A$, $N_B$ be sample sizes, let $\\bar{X}_A$, $\\bar{X}_B$ be sample means, and let $s_A^2$, $s_B^2$ be the
        unbiased sample variances of the two experimental groups (Control A and Treatment B respectively).

        The pooled sample standard deviation $s_{\\text{pooled}}$ is defined as:
        $$
        s_{\\text{pooled}} = \\sqrt{\\frac{(N_A - 1)s_A^2 + (N_B - 1)s_B^2}{N_A + N_B - 2}}
        $$
        The Cohen's d statistic is computed as:
        $$
        d = \\frac{\\bar{X}_B - \\bar{X}_A}{s_{\\text{pooled}}}
        $$
    Standard Classification Heuristics:
        - $|d| < 0.2$: Negligible effect size.
        - $0.2 \\le |d| < 0.5$: Small effect size (e.g., most successful digital A/B tests).
        - $0.5 \\le |d| < 0.8$: Medium effect size.
        - $|d| \\ge 0.8$: Large effect size (indicates highly impactful, structural interventions).

    Args:
        group_a (np.ndarray): 1D array of outcomes for Control (Group A).
        group_b (np.ndarray): 1D array of outcomes for Treatment (Group B).

    Returns:
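        float: The calculated Cohen's d statistic, or 0.0 when the pooled standard
            deviation is not positive (for example, when both groups are constant).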
    """
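    mean_a, mean_b = np.mean(group_a), np.mean(group_b)
    # ddof=1 gives the unbiased sample variances used in the pooled estimate.
    var_a, var_b = np.var(group_a, ddof=1), np.var(group_b, ddof=1)
    n_a, n_b = len(group_a), len(group_b)

    pooled_std = np.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    # d is undefined without a positive pooled standard deviation; fall back to 0.0.
    if pooled_std > 0:
        # Convert the NumPy scalar to a built-in float to match the annotation.
        return float((mean_b - mean_a) / pooled_std)
    return 0.0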
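A short usage sketch may help clarify the sign convention and the zero-variance fallback. It assumes NumPy is installed and repeats the logic of the listing above so that it runs on its own; the input arrays are made-up values.

```python
import numpy as np


def compute_cohens_d(group_a, group_b) -> float:
    # Same logic as the source listing, repeated so this snippet is self-contained.
    mean_a, mean_b = np.mean(group_a), np.mean(group_b)
    var_a, var_b = np.var(group_a, ddof=1), np.var(group_b, ddof=1)
    n_a, n_b = len(group_a), len(group_b)
    pooled_std = np.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    if pooled_std > 0:
        return (mean_b - mean_a) / pooled_std
    return 0.0


control = np.array([2.0, 4.0, 6.0, 8.0])
treatment = np.array([5.0, 7.0, 9.0, 11.0])

# d is treatment minus control, so swapping the arguments flips the sign.
d_forward = compute_cohens_d(control, treatment)
d_reversed = compute_cohens_d(treatment, control)

# Both groups constant: the pooled standard deviation is 0, so the fallback applies
# even though the means differ.
constant_shift = compute_cohens_d(np.array([1.0, 1.0, 1.0]), np.array([2.0, 2.0, 2.0]))

print(d_forward, d_reversed, constant_shift)
```

Note that the last call returns 0.0 although the group means differ by 1, so a result of exactly 0.0 can mean either "no difference" or "no variance to standardize by".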