Design Module

The xpyrment.design module contains submodules and components for design.

design

Experimental design, user routing, traffic splits, randomization, and Design of Experiments (DoE).

This package provides utilities for configuring study designs, routing experimental traffic, setting up randomization schemes, and generating classical Design of Experiments (DoE) matrices:

- hash_assign: Deterministic MD5-based hashing engine that routes units statelessly.
- TrafficSplitter: Coordinates multi-variant allocations, long-term holdouts, and progressive ramps.
- stratified_randomization: Ensures structural balance on pre-experiment covariates.
- doe: Comprehensive classical design matrices (factorial, fractional, Taguchi, DSD, CCD, mixture, etc.).

MODULE DESCRIPTION
doe

Classical Design of Experiments (DoE) generators and optimization engines.

power

Analytical Power Analysis & Sample Size Estimator (Block 44).

randomization

Deterministic, hash-based randomization engines for user and unit assignments.

splits

Traffic split coordinators, holdout groups, and exposure ramp schedules.

stratification

Stratified and clustered randomization engines for balanced covariate distributions.

CLASS DESCRIPTION
TrafficSplitter

Manages traffic split fractions, global holdout configurations, and progressive exposure ramp schedules.

AnalyticalPowerCalculator

Computes sample sizes, statistical power, and Minimum Detectable Effects (MDE).

FUNCTION DESCRIPTION
hash_assign

Assigns a unit to a variant deterministically using MD5 hashing and modulo arithmetic.

stratified_randomization

Performs stratified randomization to ensure balance on continuous/categorical covariates.

TrafficSplitter

TrafficSplitter(
    allocations: Dict[str, float],
    holdout_percentage: float = 0.0,
    ramp_schedule: List[float] = None,
)

Manages traffic split fractions, global holdout configurations, and progressive exposure ramp schedules.

A TrafficSplitter divides incoming traffic into discrete experimental buckets. It supports fractional allocations, standard control and treatment arms, and long-term holdout groups.

Business and Operational Context
  1. Fractional Allocations: Allows unequal variants (e.g., 90% control, 10% treatment) to limit exposure during early launch states.
  2. Long-Term Holdouts: A portion of users can be entirely withheld from all experiments within a product area (the "holdout group") to measure the long-term cumulative effects and potential interaction effects of independent product changes over months.
  3. Exposure Ramp Schedules: Step-wise exposure schedules (e.g., exposing 1% of users, then 10%, then 50%, then 100%) act as standard release gates. This mitigates operational risk by validating telemetry and checking for guardrail breaches before exposing the entire user base.
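
The class documented here stores and validates the split configuration; this page does not show how a unit is routed to a bucket. As a minimal sketch of one way the fractions could drive routing, assuming a hypothetical helper named `route_unit` (not part of xpyrment) and the same salted MD5 key used by hash_assign:

```python
import hashlib
from typing import Dict


def route_unit(unit_id: str, salt: str, allocations: Dict[str, float],
               holdout_percentage: float = 0.0) -> str:
    """Hypothetical helper: map a unit to 'holdout' or a variant by fraction."""
    # Hash the salted ID to a stable position in [0, 1).
    digest = hashlib.md5(f"{salt}:{unit_id}".encode("utf-8")).hexdigest()
    position = int(digest[:8], 16) / 16**8

    # The first slice of the unit interval is reserved for the holdout.
    if position < holdout_percentage:
        return "holdout"

    # Walk the cumulative allocation boundaries after the holdout slice.
    upper = holdout_percentage
    for variant, weight in allocations.items():
        upper += weight
        if position < upper:
            return variant
    # Only reached if floating-point rounding leaves a sliver at the top edge.
    return list(allocations)[-1]


allocations = {"control": 0.45, "treatment": 0.45}
counts = {"holdout": 0, "control": 0, "treatment": 0}
for i in range(20000):
    counts[route_unit(f"user_{i}", "EXP-201_checkout_revamp", allocations, 0.10)] += 1
shares = {name: n / 20000 for name, n in counts.items()}
```

The observed shares should land close to 0.10 / 0.45 / 0.45. A holdout meant to span several experiments would presumably need a salt shared by those experiments rather than a per-experiment one.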
ATTRIBUTE DESCRIPTION
allocations

A dictionary mapping variant names to their allocation weight fractions (values bounded in \([0, 1]\)).

TYPE: Dict[str, float]

holdout_percentage

Fraction of global traffic completely excluded from variant selection and routed to a static holdout arm. Despite the name, the value is a fraction bounded in \([0, 1]\), not a percentage.

TYPE: float

Examples:

Example
>>> allocations = {"control": 0.45, "treatment": 0.45}
>>> splitter = TrafficSplitter(allocations=allocations, holdout_percentage=0.10)
>>> splitter.holdout_percentage
0.1

Validates that variant allocations plus the holdout percentage sum to 1.0 (within an absolute tolerance of 1e-5).

PARAMETER DESCRIPTION
allocations

Mapping of variant labels to active traffic split weights.

TYPE: Dict[str, float]

holdout_percentage

Fractional traffic diverted into a holdout group. Defaults to 0.0.

TYPE: float DEFAULT: 0.0

ramp_schedule

Custom progressive exposure percentages. Defaults to None.

TYPE: List[float] DEFAULT: None

RAISES DESCRIPTION
ValueError

If holdout_percentage or any individual allocation weight falls outside [0.0, 1.0].

ValueError

If the total allocation weight including the holdout percentage does not sum to 1.0 (within an absolute tolerance of 1e-5).

ValueError

If ramp_schedule is empty, has values outside [0.0, 1.0], decreases at any step, or does not end in 1.0.

METHOD DESCRIPTION
get_ramp_schedule

Generates progressive exposure ramp-up schedule coordinates.

Source code in src\xpyrment\design\splits.py
def __init__(
    self,
    allocations: Dict[str, float],
    holdout_percentage: float = 0.0,
    ramp_schedule: List[float] = None,
):
    """Initializes a new TrafficSplitter container.

    Validates that variant allocations plus the holdout percentage sum to 1.0 (within an absolute tolerance of 1e-5).

    Args:
        allocations (Dict[str, float]): Mapping of variant labels to active traffic split weights.
        holdout_percentage (float): Fractional traffic diverted into a holdout group. Defaults to 0.0.
        ramp_schedule (List[float], optional): Custom progressive exposure percentages. Defaults to None.

    Raises:
        ValueError: If holdout_percentage or any individual allocation weight falls outside [0.0, 1.0].
        ValueError: If the total allocation weight including the holdout percentage does not sum to 1.0.
        ValueError: If ramp_schedule is empty, has values outside [0.0, 1.0], decreases at any step,
            or does not end in 1.0.
    """
    self.allocations = allocations
    self.holdout_percentage = holdout_percentage
    self.ramp_schedule = ramp_schedule

    # Validate bounds
    if holdout_percentage < 0.0 or holdout_percentage > 1.0:
        raise ValueError("holdout_percentage must be between 0.0 and 1.0.")
    for variant, weight in allocations.items():
        if weight < 0.0 or weight > 1.0:
            raise ValueError(f"Allocation for variant '{variant}' must be between 0.0 and 1.0.")

    total_alloc = sum(allocations.values()) + holdout_percentage
    if abs(total_alloc - 1.0) > 1e-5:
        raise ValueError("Total allocations including holdout must equal 1.0.")

    # Validate custom ramp-up schedule if provided
    if ramp_schedule is not None:
        if not ramp_schedule:
            raise ValueError("ramp_schedule list cannot be empty.")
        for val in ramp_schedule:
            if val < 0.0 or val > 1.0:
                raise ValueError(f"Ramp schedule value '{val}' must be between 0.0 and 1.0.")
        # Check monotonicity
        for i in range(len(ramp_schedule) - 1):
            if ramp_schedule[i] > ramp_schedule[i + 1]:
                raise ValueError("Ramp schedule values must be monotonically non-decreasing.")
        # Must terminate in 1.0
        if abs(ramp_schedule[-1] - 1.0) > 1e-5:
            raise ValueError("Ramp schedule must terminate at 1.0 (full exposure).")

get_ramp_schedule

get_ramp_schedule() -> List[float]

Generates progressive exposure ramp-up schedule coordinates.

Ramping exposure is a risk-management strategy. This method returns the custom ramp_schedule supplied at construction, or the default [0.01, 0.10, 0.50, 1.0] when none was provided. Each value is the cumulative exposure fraction for one release-gating stage.

Mathematical Representation

Let \(R\) be the ramp-up schedule array:

$$ R = [r_1, r_2, \dots, r_m] $$

where \(r_j \in [0, 1]\) represents the fraction of total traffic exposed to the experiment during step \(j\), such that \(r_j \le r_{j+1}\) and, per the constructor's validation, \(r_m = 1\).
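
To make the schedule concrete, the sketch below converts each cumulative fraction into unit counts for a fixed population. The helper name `ramp_stage_counts` and the population of 200,000 are illustrative assumptions, not part of the library.

```python
from typing import List, Tuple


def ramp_stage_counts(schedule: List[float], population: int) -> List[Tuple[float, int, int]]:
    """Hypothetical helper: (fraction, cumulative units, newly exposed units) per stage."""
    stages = []
    previous = 0
    for fraction in schedule:
        cumulative = round(fraction * population)
        # Because the schedule is non-decreasing, each increment is >= 0.
        stages.append((fraction, cumulative, cumulative - previous))
        previous = cumulative
    return stages


# Default schedule from get_ramp_schedule, applied to 200,000 units.
stages = ramp_stage_counts([0.01, 0.10, 0.50, 1.0], population=200_000)
for fraction, cumulative, newly_exposed in stages:
    print(f"{fraction:>5.0%}: {cumulative:>7} exposed ({newly_exposed} new)")
```

The largest jumps in newly exposed units come at the later stages, which is why guardrail checks at the early gates matter most.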

RETURNS DESCRIPTION
List[float]

List[float]: Ordered list of target traffic fractions representing progressive exposure stages.

Source code in src\xpyrment\design\splits.py
def get_ramp_schedule(self) -> List[float]:
    r"""Generates progressive exposure ramp-up schedule coordinates.

    Ramping exposure represents a crucial risk-management strategy. This method returns a list of active exposure
    fractions defining sequential release gating stages.

    Mathematical Representation:
        Let $R$ be the ramp-up schedule array:
        $$
        R = [r_1, r_2, \dots, r_m]
        $$
        where $r_j \in [0, 1]$ represents the fraction of total traffic exposed to the experiment during step $j$,
        such that $r_j \le r_{j+1}$.

    Returns:
        List[float]: Ordered list of target traffic fractions representing progressive exposure stages.
    """
    if self.ramp_schedule is not None:
        return self.ramp_schedule
    return [0.01, 0.10, 0.50, 1.0]

AnalyticalPowerCalculator

AnalyticalPowerCalculator(
    alpha: float = 0.05, power: float = 0.8
)

Computes sample sizes, statistical power, and Minimum Detectable Effects (MDE).

TODO: Extend power calculations to unequal allocation ratios with multiple treatment arms using Dunnett's adjustment.

TODO: Add exact simulation-based power curves utilizing empirical bootstrap baseline variance matrices.

PARAMETER DESCRIPTION
alpha

Target significance level (Type I error rate). Defaults to 0.05.

TYPE: float DEFAULT: 0.05

power

Target power (1 - Type II error rate). Defaults to 0.80.

TYPE: float DEFAULT: 0.8

METHOD DESCRIPTION
compute_sample_size

Computes sample size per group for a two-sample t-test.

compute_mde

Computes Minimum Detectable Effect (MDE) under configured constraints.

compute_power

Computes statistical power given sample size and target treatment effect size.

adjust_for_clusters

Adjusts sample size requirements for clustered designs using the VIF model.

Source code in src\xpyrment\design\power.py
def __init__(self, alpha: float = 0.05, power: float = 0.80) -> None:
    """Initializes the power calculator.

    Args:
        alpha (float): Targeted significance/Type I error rate. Defaults to 0.05.
        power (float): Targeted power/1 - Type II error rate. Defaults to 0.80.
    """
    self.alpha = alpha
    self.power = power

compute_sample_size

compute_sample_size(
    mde: float, variance: float, ratio: float = 1.0
) -> int

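Computes the control-group sample size for a two-sample comparison of means, using the two-sided normal approximation.

ratio = n_treatment / n_control, so the treatment group needs ratio × the returned size. With the default ratio = 1.0, the result is the size per group.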

Source code in src\xpyrment\design\power.py
def compute_sample_size(self, mde: float, variance: float, ratio: float = 1.0) -> int:
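    """Computes the control-group sample size for a two-sample comparison of means.

    Uses the two-sided normal approximation. ratio = n_treatment / n_control, so the
    treatment group needs ratio * n_control units.
    """
    if mde <= 0.0 or variance <= 0.0:
        raise ValueError("MDE and variance must be strictly positive values.")
    if ratio <= 0.0:
        # Guards the 1 / ratio term below against division by zero.
        raise ValueError("ratio must be strictly positive.")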

    z_alpha = norm.ppf(1.0 - self.alpha / 2.0)
    z_beta = norm.ppf(self.power)

    # Standard formulation: n_control = (1 + 1/ratio) * ( (z_alpha + z_beta) * sigma / mde )^2
    n_ctrl = (1.0 + 1.0 / ratio) * ((z_alpha + z_beta) ** 2) * variance / (mde ** 2)
    return int(np.ceil(n_ctrl))
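
As a worked check of the formula, the sketch below re-derives the control-arm size with only the standard library. `statistics.NormalDist().inv_cdf` stands in for `scipy.stats.norm.ppf`, and the function name `sample_size_control` and the input values are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size_control(mde: float, variance: float, ratio: float = 1.0,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Control-arm size under the normal approximation shown above."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    n_control = (1.0 + 1.0 / ratio) * (z_alpha + z_beta) ** 2 * variance / mde ** 2
    return math.ceil(n_control)


# Equal arms: detect a shift of 0.5 when the outcome variance is 4.0.
n_equal = sample_size_control(mde=0.5, variance=4.0)               # 252

# Treatment arm a quarter the size of control (ratio = 0.25).
n_skewed = sample_size_control(mde=0.5, variance=4.0, ratio=0.25)  # 628
n_treatment = math.ceil(0.25 * n_skewed)                           # 157
```

Shrinking the treatment arm raises the required control size from 252 to 628, because the smaller arm dominates the variance of the difference in means.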

compute_mde

compute_mde(
    sample_size: int, variance: float, ratio: float = 1.0
) -> float

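Computes the Minimum Detectable Effect (MDE) at the configured alpha and power. Here sample_size is the control-group size and ratio = n_treatment / n_control, matching compute_sample_size.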

Source code in src\xpyrment\design\power.py
def compute_mde(self, sample_size: int, variance: float, ratio: float = 1.0) -> float:
    """Computes Minimum Detectable Effect (MDE) under configured constraints."""
    if sample_size <= 0 or variance <= 0.0:
        raise ValueError("Sample size and variance must be strictly positive values.")

    z_alpha = norm.ppf(1.0 - self.alpha / 2.0)
    z_beta = norm.ppf(self.power)

    # mde = (z_alpha + z_beta) * sqrt( variance * (1 + 1/ratio) / n_control )
    factor = variance * (1.0 + 1.0 / ratio) / sample_size
    return float((z_alpha + z_beta) * np.sqrt(factor))

compute_power

compute_power(
    sample_size: int,
    mde: float,
    variance: float,
    ratio: float = 1.0,
) -> float

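Computes statistical power given the control-group sample size and a target treatment effect size (passed as mde). The calculation uses the normal approximation and ignores the small probability of rejecting in the opposite tail.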

Source code in src\xpyrment\design\power.py
def compute_power(self, sample_size: int, mde: float, variance: float, ratio: float = 1.0) -> float:
    """Computes statistical power given sample size and target treatment effect size."""
    if sample_size <= 0 or mde <= 0.0 or variance <= 0.0:
        raise ValueError("All arguments must be strictly positive to estimate power.")

    z_alpha = norm.ppf(1.0 - self.alpha / 2.0)

    # Power = Phi( mde / sqrt(variance * (1 + 1/ratio) / n_control) - z_alpha )
    se = np.sqrt(variance * (1.0 + 1.0 / ratio) / sample_size)
    z_val = (mde / se) - z_alpha
    return float(norm.cdf(z_val))
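
The MDE and power formulas are inverses of each other: plugging the MDE for a given sample size back into the power formula should recover the configured power. The sketch below checks this with the standard library, using `statistics.NormalDist` in place of `scipy.stats.norm`. The helper names and input values are illustrative assumptions.

```python
import math
from statistics import NormalDist

ALPHA, POWER = 0.05, 0.80
Z_ALPHA = NormalDist().inv_cdf(1.0 - ALPHA / 2.0)
Z_BETA = NormalDist().inv_cdf(POWER)


def mde_for(n_control: int, variance: float, ratio: float = 1.0) -> float:
    """Smallest effect detectable at the configured alpha and power."""
    return (Z_ALPHA + Z_BETA) * math.sqrt(variance * (1.0 + 1.0 / ratio) / n_control)


def power_for(n_control: int, mde: float, variance: float, ratio: float = 1.0) -> float:
    """Power to detect an effect of size `mde` (upper-tail approximation)."""
    se = math.sqrt(variance * (1.0 + 1.0 / ratio) / n_control)
    return NormalDist().cdf(mde / se - Z_ALPHA)


mde = mde_for(n_control=1000, variance=4.0)
recovered_power = power_for(n_control=1000, mde=mde, variance=4.0)

# A larger true effect at the same sample size yields more power.
higher_power = power_for(n_control=1000, mde=2 * mde, variance=4.0)
```

Here `recovered_power` equals the configured 0.80 up to floating-point error, since `mde / se - Z_ALPHA` reduces algebraically to `Z_BETA`.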

adjust_for_clusters

adjust_for_clusters(
    base_n: int, avg_cluster_size: int, icc: float
) -> int

Adjusts sample size requirements for clustered designs using the VIF model.

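VIF = 1 + (M - 1) * ICC, where M is the average cluster size and ICC is the intra-cluster correlation. The adjusted size is ceil(base_n * VIF); base_n is returned unchanged when M <= 1.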

Source code in src\xpyrment\design\power.py
def adjust_for_clusters(self, base_n: int, avg_cluster_size: int, icc: float) -> int:
    """Adjusts sample size requirements for clustered designs using the VIF model.

        VIF = 1 + (M - 1) * ICC
    """
    if avg_cluster_size <= 1:
        return base_n
    if not (0.0 <= icc <= 1.0):
        raise ValueError("Intra-cluster correlation (ICC) must be in the range [0, 1].")

    vif = 1.0 + (avg_cluster_size - 1) * icc
    return int(np.ceil(base_n * vif))
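
A short worked example of the inflation, with illustrative inputs (a base requirement of 252 units, clusters of 20, ICC of 0.05). The function name `design_effect` is an assumption for this sketch.

```python
import math


def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Variance inflation factor for clustered assignment: 1 + (M - 1) * ICC."""
    return 1.0 + (avg_cluster_size - 1) * icc


vif = design_effect(avg_cluster_size=20, icc=0.05)  # 1 + 19 * 0.05 = 1.95
adjusted_n = math.ceil(252 * vif)                   # 491.4 rounds up to 492
```

Even a modest ICC nearly doubles the requirement here, because the inflation grows linearly with cluster size.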

hash_assign

hash_assign(
    unit_id: Union[str, int], salt: str, variants: List[str]
) -> str

Assigns a unit to a variant deterministically using MD5 hashing and modulo arithmetic.

To distribute units (e.g., user IDs or device hashes) uniformly and orthogonally across multiple concurrent experiments without storing state, we join a static experiment-level unique identifier (the "salt") and the unit's unique identifier with a colon. The resulting string is hashed, and the leading 8 hex characters of the digest are mapped into the variant array using modulo arithmetic.

Mathematical Representation

Let \(u\) be the unit identifier, \(S\) be the unique experiment salt, and \(V = (v_0, v_1, \dots, v_{k-1})\) be the ordered array of \(k\) variants. The assignment key joins the salt and the unit identifier with a colon separator:

$$ K = S \mathbin{\Vert} \text{:} \mathbin{\Vert} \text{str}(u) $$

We compute the MD5 digest of \(K\) (128 bits, rendered as a 32-character hex string) and extract the first 8 hex characters, representing a 32-bit integer \(H\):

$$ H = \text{hex\_to\_int}(\text{MD5}(K)[0:8]) $$

The target variant index \(i\) is calculated using the modulo operator:

$$ i = H \bmod k $$

The assigned variant is \(v_i\).

Properties of Hash-based Assignment
  1. Repeatability: For a given unit ID \(u\) and salt \(S\), the returned variant is always identical, eliminating the need for distributed database lookups.
  2. Uniformity: The MD5 hash exhibits avalanche-effect characteristics, distributing assignment probabilities uniformly: \(P(\text{Variant} = v) \approx 1/k\).
  3. Orthogonal Decorrelation: By using different salts for different experiments (\(S_A \neq S_B\)), user allocations in experiment A are effectively independent of their allocations in experiment B, so one experiment's assignment does not systematically confound another running concurrently.
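
The three properties can be checked empirically. The sketch below uses a stand-in function, `assign`, that mirrors the MD5-and-modulo rule from the source listing rather than importing the library. The salts `EXP-A` and `EXP-B` and the synthetic user IDs are illustrative assumptions.

```python
import hashlib
from collections import Counter
from typing import List


def assign(unit_id: str, salt: str, variants: List[str]) -> str:
    """Stand-in mirroring the MD5-and-modulo rule described above."""
    digest = hashlib.md5(f"{salt}:{unit_id}".encode("utf-8")).hexdigest()
    return variants[int(digest[:8], 16) % len(variants)]


variants = ["control", "variant_a", "variant_b"]
users = [f"user_{i}" for i in range(30_000)]

# Repeatability: the same unit and salt always map to the same variant.
repeatable = all(
    assign(u, "EXP-A", variants) == assign(u, "EXP-A", variants) for u in users[:100]
)

# Uniformity: each variant should hold roughly a third of the units.
marginal = Counter(assign(u, "EXP-A", variants) for u in users)

# Decorrelation: joint counts across two salts should be near 1/9 of the units each.
joint = Counter(
    (assign(u, "EXP-A", variants), assign(u, "EXP-B", variants)) for u in users
)
```

With 30,000 units, each marginal count should sit near 10,000 and each of the nine joint cells near 3,333. A large deviation would point to correlated salts or non-uniform IDs.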
PARAMETER DESCRIPTION
unit_id

Unique identifier for the experimental unit (e.g., visitor ID, account code).

TYPE: Union[str, int]

salt

A unique string identifying the active experiment. Acting as the cryptographic salt, this prevents correlation between different active experiments.

TYPE: str

variants

Ordered list of variant labels (e.g., ["control", "treatment_a", "treatment_b"]).

TYPE: List[str]

RETURNS DESCRIPTION
str

The selected variant label from the variants list.

TYPE: str

RAISES DESCRIPTION
ValueError

If the variants list is empty.

Examples:

Example
>>> variants = ["control", "variant_a", "variant_b"]
>>> hash_assign(unit_id="user_12345", salt="EXP-201_checkout_revamp", variants=variants)
'variant_a'
>>> hash_assign(unit_id="user_12345", salt="EXP-202_price_test", variants=variants) # Orthogonal salt
'control'
Source code in src\xpyrment\design\randomization.py
def hash_assign(unit_id: Union[str, int], salt: str, variants: List[str]) -> str:
    r"""Assigns a unit to a variant deterministically using MD5 hashing and modulo arithmetic.

    To distribute units (e.g., user IDs or device hashes) uniformly and orthogonally across multiple
    concurrent experiments without storing state, we join a static experiment-level unique identifier
    (the "salt") and the unit's unique identifier with a colon. The resulting string is hashed, and the
    leading 8 hex characters of the digest are mapped into the variant array using modulo arithmetic.

    Mathematical Representation:
        Let $u$ be the unit identifier, $S$ be the unique experiment salt, and $V = (v_0, v_1, \dots, v_{k-1})$
        be the ordered array of $k$ variants.
        The assignment key joins the salt and the unit identifier with a colon separator:
        $$
        K = S \mathbin{\Vert} \text{:} \mathbin{\Vert} \text{str}(u)
        $$
        We compute the MD5 digest of $K$ (128 bits, rendered as a 32-character hex string) and extract the
        first 8 hex characters, representing a 32-bit integer $H$:
        $$
        H = \text{hex\_to\_int}(\text{MD5}(K)[0:8])
        $$
        The target variant index $i$ is calculated using the modulo operator:
        $$
        i = H \bmod k
        $$
        The assigned variant is $v_i$.

    Properties of Hash-based Assignment:
        1. **Repeatability**: For a given unit ID $u$ and salt $S$, the returned variant is always identical,
           eliminating the need for distributed database lookups.
        2. **Uniformity**: The MD5 hash exhibits avalanche-effect characteristics, distributing assignment
           probabilities uniformly: $P(\text{Variant} = v) \approx 1/k$.
        3. **Orthogonal Decorrelation**: By using different salts for different experiments ($S_A \neq S_B$),
           user allocations in experiment A are statistically independent of their allocations in experiment B,
           preventing cross-contamination and carrying effects across concurrent runs.

    Args:
        unit_id (Union[str, int]): Unique identifier for the experimental unit (e.g., visitor ID, account code).
        salt (str): A unique string identifying the active experiment. Acting as the cryptographic salt,
            this prevents correlation between different active experiments.
        variants (List[str]): Ordered list of variant labels (e.g., `["control", "treatment_a", "treatment_b"]`).

    Returns:
        str: The selected variant label from the `variants` list.

    Raises:
        ValueError: If the `variants` list is empty.

    Examples:
        ??? example "Example"

            ```python
            >>> variants = ["control", "variant_a", "variant_b"]
            >>> hash_assign(unit_id="user_12345", salt="EXP-201_checkout_revamp", variants=variants)
            'variant_a'
            >>> hash_assign(unit_id="user_12345", salt="EXP-202_price_test", variants=variants) # Orthogonal salt
            'control'
            ```
    """
    if not variants:
        raise ValueError("variants list cannot be empty.")

    # Simple hash-based assignment
    hash_input = f"{salt}:{unit_id}".encode("utf-8")
    hash_hex = hashlib.md5(hash_input).hexdigest()
    # Take first 8 chars, convert to int, and modulo by variants length
    hash_val = int(hash_hex[:8], 16)
    idx = hash_val % len(variants)
    return variants[idx]

stratified_randomization

stratified_randomization(
    df: DataFrame,
    strata_cols: list,
    variants: Optional[List[str]] = None,
    treatment_col: str = "variant",
    random_state: Optional[int] = None,
) -> DataFrame

Performs stratified randomization to ensure balance on continuous/categorical covariates.

In simple randomization, small sample sizes or highly variable covariates can lead to accidental imbalance across treatment arms, which confounds the treatment comparison and inflates estimator variance. Stratified randomization resolves this by dividing the population into mutually exclusive, homogeneous subgroups (strata) based on the provided covariates, and then executing independent randomization within each stratum. Continuous covariates must be binned beforehand, since each distinct value combination forms its own stratum.

Mathematical and Algorithmic Background

Let \(D\) be the dataset of size \(N\). Let \(C = \{c_1, c_2, \dots, c_p\}\) be the set of stratification columns.

  1. Strata Construction: We partition \(D\) into \(K\) disjoint subsets (strata), \(\{D_1, D_2, \dots, D_K\}\), such that within each subset \(D_j\), all units share identical values for all stratification columns \(C\):

     $$ D = \bigcup_{j=1}^{K} D_j \quad \text{where} \quad D_a \cap D_b = \emptyset \ \ \forall \ a \neq b $$

  2. Intra-Stratum Randomization: For each stratum \(D_j\), units are randomly permuted and assigned to treatment arms. If treatment arm proportions are \(\{w_1, w_2, \dots, w_k\}\), then within every stratum \(D_j\), the assignment counts follow:

     $$ n_{j, \text{arm } i} \approx w_i \times |D_j| $$

     In this implementation the arms are filled round-robin before shuffling, so \(w_i = 1/k\) and arm counts within a stratum differ by at most one.

This reduces the variance of the treatment effect estimator by removing the variance contribution of the stratification covariates.
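
The two steps can be sketched without pandas. The version below is a dependency-free analogue of the per-stratum procedure, not the library function: `stratified_assign`, the `(unit_id, stratum_key)` input shape, and the synthetic strata are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Tuple


def stratified_assign(units: List[Tuple[str, Hashable]], variants: List[str],
                      seed: int = 0) -> Dict[str, str]:
    """Hypothetical pure-Python analogue: units are (unit_id, stratum_key) pairs."""
    rng = random.Random(seed)

    # Step 1: partition the units into disjoint strata.
    strata = defaultdict(list)
    for unit_id, stratum in units:
        strata[stratum].append(unit_id)

    # Step 2: build a balanced label sequence per stratum, then permute it.
    assignment = {}
    for members in strata.values():
        labels = [variants[i % len(variants)] for i in range(len(members))]
        rng.shuffle(labels)
        assignment.update(zip(members, labels))
    return assignment


units = [
    (f"u{i}", ("mobile" if i % 3 else "desktop", "new" if i % 2 else "returning"))
    for i in range(101)
]
result = stratified_assign(units, ["control", "treatment"], seed=42)

# Tabulate arm counts within each stratum to confirm balance.
per_stratum = defaultdict(Counter)
for unit_id, stratum in units:
    per_stratum[stratum][result[unit_id]] += 1
```

Within every stratum the two arm counts differ by at most one. When a stratum size is not divisible by the number of arms, the round-robin fill gives the extra units to the arms listed first.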

PARAMETER DESCRIPTION
df

The input DataFrame containing experimental units and their covariate values.

TYPE: DataFrame

strata_cols

List of column names in df representing categorical or binned continuous covariates to use as stratification factors.

TYPE: list

variants

Ordered list of variant labels. Defaults to ["control", "treatment"].

TYPE: Optional[List[str]] DEFAULT: None

treatment_col

Column name where the assigned variant label will be written. Defaults to "variant".

TYPE: str DEFAULT: 'variant'

random_state

Integer seed to initialize local random state generator for reproducibility.

TYPE: Optional[int] DEFAULT: None

RETURNS DESCRIPTION
DataFrame

pd.DataFrame: A new DataFrame with treatment assignments balanced across the specified strata.

Source code in src\xpyrment\design\stratification.py
def stratified_randomization(
    df: pd.DataFrame,
    strata_cols: list,
    variants: Optional[List[str]] = None,
    treatment_col: str = "variant",
    random_state: Optional[int] = None,
) -> pd.DataFrame:
    r"""Performs stratified randomization to ensure balance on continuous/categorical covariates.

    In simple randomization, small sample sizes or highly variable covariates can lead to accidental
    imbalance across treatment arms, introducing selection bias. Stratified randomization resolves this
    by dividing the population into mutually exclusive, homogeneous subgroups (strata) based on the
    provided covariates, and then executing independent randomization within each individual stratum.

    Mathematical and Algorithmic Background:
        Let $D$ be the dataset of size $N$. Let $C = \{c_1, c_2, \dots, c_p\}$ be the set of stratification
        columns.
        1. **Strata Construction**:
           We partition $D$ into $K$ disjoint subsets (strata), $\{D_1, D_2, \dots, D_K\}$, such that within each
           subset $D_j$, all units share identical values for all stratification columns $C$:
           $$
           D = \bigcup_{j=1}^{K} D_j \quad \text{where} \quad D_a \cap D_b = \emptyset \ \ \forall \ a \neq b
           $$
        2. **Intra-Stratum Randomization**:
           For each stratum $D_j$, units are randomly permuted and assigned to treatment arms. This guarantees that
           if treatment arm proportions are $\{w_1, w_2, \dots, w_k\}$, then within every stratum $D_j$, the assignment
           counts follow:
           $$
           n_{j, \text{arm } i} \approx w_i \times |D_j|
           $$
           This reduces the variance of the treatment effect estimator by removing the variance contribution of the
           stratification covariates.

    Args:
        df (pd.DataFrame): The input DataFrame containing experimental units and their covariate values.
        strata_cols (list): List of column names in `df` representing categorical or binned continuous
            covariates to use as stratification factors.
        variants (Optional[List[str]]): Ordered list of variant labels. Defaults to `["control", "treatment"]`.
        treatment_col (str): Column name where the assigned variant label will be written. Defaults to `"variant"`.
        random_state (Optional[int]): Integer seed to initialize local random state generator for reproducibility.

    Returns:
        pd.DataFrame: A new DataFrame with treatment assignments balanced across the specified strata.
    """
    if variants is None:
        variants = ["control", "treatment"]

    if not variants:
        raise ValueError("variants list cannot be empty.")

    # Instantiate isolated local generator
    rng = np.random.default_rng(random_state)
    assigned_df = df.copy()
    assigned_df[treatment_col] = None

    # Group by the specified strata columns
    for _, group in assigned_df.groupby(strata_cols):
        n_group = len(group)
        if n_group == 0:
            continue

        # Generate balanced sequence of variants of length n_group
        group_assignments = [variants[i % len(variants)] for i in range(n_group)]
        group_assignments = np.array(group_assignments)

        # Shuffle assignments using local generator to enforce seeding rules
        rng.shuffle(group_assignments)

        # Assign shuffled variant labels back to corresponding rows
        assigned_df.loc[group.index, treatment_col] = group_assignments

    return assigned_df