Core Module

The xpyrment.core package contains the core submodules and components of the xpyrment library.

core

Core engine abstractions, state management, exception classes, and shared types.

This package provides the foundational structural mechanisms for the xpyrment package:

- Experiment: The central orchestration state container that governs execution.
- ExperimentState: The rigid phase-gating mechanism (CREATED -> PLANNED -> DESIGNED -> RUNNING -> ANALYZED -> REPORTED).
- ExperimentRegistry: Cryptographic hashing and pre-registration validator to prevent post-hoc changes.
- Custom Exceptions: Robust, informative error feedback to protect experimental integrity (PhaseOrderError, SRMError, AliasError).
- Strict Typing schemas: Standardized TypedDict representation (MetricResult) of calculation outputs.

MODULE	DESCRIPTION
`exceptions`	Custom exception classes for the xpyrment core system.
`experiment`	Central orchestrator for experiment setup, configuration, and phase management.
`registry`	Preregistration registry for locking and verifying experiment specifications.
`serialization`	Robust serialization utilities to guarantee native Python and JSON compatibility (Block 52).
`state`	State machine and phase-gating representation for experiment lifecycles.
`telemetry`	Centralized Telemetry, Structured JSON Logging, and Execution Profiling (Block 56).
`types`	Core type definitions, TypedDicts, and Literals for the xpyrment library.

CLASS	DESCRIPTION
`PhaseOrderError`	Raised when an operation is performed in an invalid state/phase.
`SRMError`	Raised when a Sample Ratio Mismatch (SRM) is detected during validation.
`AliasError`	Raised when fractional factorial alias confounding is violated or misconfigured.
`Experiment`	The central orchestration class for setting up, configuring, and executing experiments.
`ExperimentRegistry`	Manages immutable experiment specifications to prevent post-hoc changes (pre-registration).
`ExperimentState`	Enforces the phase-gated state of the experimental lifecycle.
`MetricResult`	The canonical data schema representing the output of a statistical metric analysis.
`ExecutionProfiler`	Context manager and decorator tracking processing stages, execution duration, and peak memory usage.

FUNCTION	DESCRIPTION
`make_serializable`	Recursively converts numpy and non-serializable objects to native, standard JSON-compliant types.
`serialize_to_json`	Converts a nested object recursively to serializable format and dumps it as a JSON string.
`configure_telemetry`	Configures and registers the centralized JSON telemetry logger handlers.
`get_logger`	Returns the centralized JSON telemetry logger instance.

ATTRIBUTE	DESCRIPTION
`MetricType`	Literal representing the supported category of metrics.

MetricType `module-attribute`

MetricType = Literal[
    "mean", "proportion", "ratio", "revenue"
]

Literal representing the supported category of metrics.

Supported Types

"mean": A continuous or discrete numeric metric where statistics are calculated on a per-unit basis (e.g., average sessions per user, average page views).
"proportion": A binary rate metric representing yes/no outcomes on a per-unit basis, equivalent to a Bernoulli trial (e.g., conversion rate, click-through-rate where the unit of analysis is the user).
"ratio": An aggregated metric computed as the sum of a numerator divided by the sum of a denominator across all units (e.g., global Click-Through-Rate = total clicks / total impressions). Requires Delta Method for proper variance approximation.
"revenue": A highly skewed continuous monetary metric (e.g., revenue per user, average order value). Often subject to log-transformations or specialized variance reduction.

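The Delta Method requirement noted for "ratio" metrics can be illustrated with a minimal sketch. It assumes independent units with one numerator and one denominator value each; the function name `delta_method_ratio` and the sample data are hypothetical and are not part of xpyrment.

```python
from statistics import mean, variance


def delta_method_ratio(numerators, denominators):
    """Return (ratio, approximate variance of the ratio) via the Delta Method.

    Hypothetical helper for illustration only; not xpyrment's implementation.
    """
    n = len(numerators)
    if n != len(denominators) or n < 2:
        raise ValueError("Need two equally sized samples with at least 2 units.")

    mu_n, mu_d = mean(numerators), mean(denominators)
    if mu_d == 0:
        raise ValueError("Mean of the denominator must be non-zero.")

    # Ratio of means equals sum(numerators) / sum(denominators).
    ratio = mu_n / mu_d

    var_n, var_d = variance(numerators), variance(denominators)
    # Sample covariance between numerator and denominator (n - 1 denominator).
    cov_nd = sum(
        (x - mu_n) * (y - mu_d) for x, y in zip(numerators, denominators)
    ) / (n - 1)

    # First-order Taylor (Delta Method) approximation of Var(mean_N / mean_D).
    var_ratio = (var_n - 2 * ratio * cov_nd + ratio**2 * var_d) / (n * mu_d**2)
    return ratio, var_ratio


# Made-up per-user clicks and impressions.
clicks = [1, 0, 3, 2, 1]
impressions = [10, 5, 20, 15, 10]
ctr, ctr_var = delta_method_ratio(clicks, impressions)
```

The covariance term is what separates this from treating each user's own click-through rate as an independent observation: numerator and denominator come from the same unit and are usually correlated.
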
PhaseOrderError

Bases: Exception

Raised when an operation is performed in an invalid state/phase.

This exception is a core mechanism of the phase-gated execution flow. It prevents:

- Downstream actions (e.g., calling .analyze() or .report()) from being executed before upstream requirements (e.g., .design() or .validate()) are complete.
- Upstream state reversals (e.g., transitioning back to CREATED or PLANNED once an experiment is already RUNNING or ANALYZED), which could lead to post-hoc configuration tampering or invalid statistical analysis.

Mathematical & Operational Context: The experimental lifecycle is governed by a strict directed, non-cyclic graph of transitions: CREATED -> PLANNED -> DESIGNED -> RUNNING -> ANALYZED -> REPORTED. Any transition where index(target_state) < index(current_state) violates this unidirectional flow and triggers this error. Re-running the analysis on the same or new frozen data keeps the experiment in the ANALYZED state, so it is not a backwards transition and does not raise.

ATTRIBUTE	DESCRIPTION
`message`	Explains the invalid state transition attempt and the active state. TYPE: `str`

Examples:

Example

>>> from xpyrment.core.state import ExperimentState
>>> from xpyrment.core.exceptions import PhaseOrderError
>>> raise PhaseOrderError("Cannot transition backwards from RUNNING to PLANNED.")

SRMError

Bases: Exception

Raised when a Sample Ratio Mismatch (SRM) is detected during validation.

SRM occurs when the observed ratio of sample counts assigned to treatment arms significantly deviates from the pre-specified expected ratio (e.g., 50/50 split). An SRM is a critical indicator of data quality issues, selection bias, or bugs in the randomization/assignment mechanism.

Mathematical Background

A Pearson Chi-Square Goodness-of-Fit test is performed to evaluate the discrepancy: $$ \chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} $$ where $O_i$ is the observed count in arm $i$ and $E_i$ is the expected count under the planned split. The degrees of freedom is $k - 1$. This exception is raised if the resulting p-value is extremely small (typically p < 0.001), indicating that the deviation is highly unlikely to have occurred by chance alone.
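
The test statistic above can be sketched for the common two-arm case. This is a minimal illustration, not the library's validator: the function name `srm_check_two_arms` is hypothetical, and it relies on the fact that for $k = 2$ arms (one degree of freedom) the chi-square survival function reduces to `erfc(sqrt(chi2 / 2))`, so only the standard library is needed.

```python
import math


def srm_check_two_arms(observed, expected_ratio=(0.5, 0.5), threshold=0.001):
    """Return (chi2, p_value, srm_detected) for a two-arm experiment.

    Hypothetical helper for illustration only; valid for exactly two arms.
    """
    if len(observed) != 2 or len(expected_ratio) != 2:
        raise ValueError("This sketch only covers the two-arm case (df = 1).")
    if not math.isclose(sum(expected_ratio), 1.0):
        raise ValueError("Expected ratios must sum to 1.")

    total = sum(observed)
    expected = [total * r for r in expected_ratio]

    # Pearson statistic: sum over arms of (O_i - E_i)^2 / E_i.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, p_value < threshold


# Made-up counts: a planned 50/50 split that came back 5000 vs 5400.
chi2, p_value, srm_detected = srm_check_two_arms([5000, 5400])

# A perfectly balanced split gives chi2 = 0 and p = 1.
balanced_chi2, balanced_p, balanced_flag = srm_check_two_arms([5000, 5000])
```

For more than two arms the p-value requires the general chi-square survival function, which this sketch deliberately leaves out.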

Remediation Procedures

Halt the experiment or analysis immediately.
Audit the randomization and unit assignment pipeline for bugs.
Check for data loss, telemetry issues, or delay in event ingestion pipelines.
Validate that the assignment tracking log captures all users on first touch.

ATTRIBUTE	DESCRIPTION
`message`	Explains the observed vs. expected sample counts and the p-value. TYPE: `str`

AliasError

Bases: Exception

Raised when fractional factorial alias confounding is violated or misconfigured.

In classical fractional factorial Design of Experiments (DoE), only a fraction of all possible factor combinations is run. This results in confounding (aliasing), where the estimate of one effect (for example, a main effect) is mathematically indistinguishable from that of another (for example, a multi-factor interaction).

This exception is raised when:

- A user tries to estimate an effect that is completely confounded with another active effect of equal or lower order (violating the specified design Resolution).
- The defined alias structure does not match the actual combinations present in the design matrix.
- The design resolution (III, IV, or V) is insufficient to support the hypothesis or interaction analysis requested.

Mathematical Context

Let $X$ be the design matrix and $\hat{\beta} = (X^T X)^{-1} X^T Y$ be the least-squares parameter estimates. If the design is fractional, some columns of the full model matrix are linear combinations of others, leading to a rank-deficient matrix where unique solutions for all factors and interactions do not exist. Partition the model terms into those that are fitted ($X_1$, $\beta_1$) and those that are omitted ($X_2$, $\beta_2$). The alias relation matrix $A = (X_1^T X_1)^{-1} X_1^T X_2$ defines which terms are confounded: $$ E[\hat{\beta}_1] = \beta_1 + A \beta_2 $$ An AliasError prevents the system from proceeding with invalid or unresolvable confounding structures.
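
Aliasing is easiest to see in the smallest fractional design. The sketch below builds a half-fraction $2^{3-1}$ design with the generator C = AB (an assumption chosen for illustration) and shows that the column for main effect C is identical to the column for the AB interaction, so both contrasts give the same number for any response. The response values are made up.

```python
from itertools import product

# Half-fraction 2^(3-1) design: full factorial in A and B, with C generated as C = A * B.
runs = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

# Contrast columns for main effect C and for the AB interaction.
col_c = [c for _, _, c in runs]
col_ab = [a * b for a, b, _ in runs]


def effect(column, response):
    """Classical two-level effect estimate: contrast divided by half the run count."""
    return sum(x * y for x, y in zip(column, response)) / (len(column) / 2)


# Made-up responses, one per run, in the same order as `runs`.
response = [10.0, 12.0, 11.0, 17.0]

effect_c = effect(col_c, response)
effect_ab = effect(col_ab, response)

# The defining relation I = ABC holds on every run of this fraction.
defining_relation_holds = all(a * b * c == 1 for a, b, c in runs)
```

Because `col_c` and `col_ab` are the same vector, no amount of data from these four runs can separate the two effects. This is the kind of situation the exception is described as guarding against.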

ATTRIBUTE	DESCRIPTION
`message`	Details the confounded factors or resolution constraint violated. TYPE: `str`

Experiment

Experiment(
    data: DataFrame,
    treatment_col: str,
    id_col: Optional[str] = None,
    covariates: Optional[List[str]] = None,
)

The central orchestration class for setting up, configuring, and executing experiments.

The Experiment class binds the experimental dataset, defines treatment structures, maps the metric taxonomy, and strictly enforces state transitions across the execution lifecycle. Through the state-machine rules, it ensures that all calculations are performed sequentially and reproducibly, eliminating retrospective tampering or incorrect state usage.

ATTRIBUTE	DESCRIPTION
`data`	A copy of the input DataFrame containing assignments and telemetry. TYPE: `DataFrame`
`treatment_col`	The column in `data` identifying the treatment arm assignments. TYPE: `str`
`id_col`	The column in `data` representing unique unit IDs. TYPE: `Optional[str]`
`metrics`	List of metrics registered for statistical calculation. TYPE: `List[BaseMetric]`
`state`	The current lifecycle phase of the experiment. TYPE: `ExperimentState`

State Gating Mechanism

Execution functions across downstream submodules verify that the experiment is in the appropriate state before proceeding. For example, running power analysis transitions the state from CREATED to PLANNED. Running randomization moves from PLANNED to DESIGNED. Analyzing results requires a transition to ANALYZED.

Examples:

Example

>>> import pandas as pd
>>> from xpyrment import Experiment
>>> from xpyrment.metrics.taxonomy import MeanMetric
>>> df = pd.DataFrame({"user_id": [1, 2, 3], "group": ["control", "treatment", "control"], "revenue": [10.5, 12.0, 9.5]})
>>> exp = Experiment(df, treatment_col="group", id_col="user_id")
>>> exp.state
<ExperimentState.CREATED: 'CREATED'>
>>> metric = MeanMetric("Revenue Metric", value_col="revenue")
>>> _ = exp.add_metrics(metric)  # returns the experiment itself; assigned so the doctest echoes nothing
>>> exp.state
<ExperimentState.PLANNED: 'PLANNED'>

Copies the input DataFrame to guarantee immutability of the source dataset during internal state transitions and potential data transformations (e.g., CUPED alignment or log scaling).

PARAMETER	DESCRIPTION
`data`	The source DataFrame containing unit-level data. TYPE: `DataFrame`
`treatment_col`	Name of the column designating experimental groups/arms. TYPE: `str`
`id_col`	Name of the column containing unique identifiers for each experimental unit. Required for certain operations like sequential analysis and user assignments. TYPE: `Optional[str]` DEFAULT: `None`
`covariates`	List of baseline covariates for balance checking or adjustments. TYPE: `Optional[List[str]]` DEFAULT: `None`

RAISES	DESCRIPTION
`ValueError`	If `treatment_col` or `id_col` is not found in the input DataFrame columns.

METHOD	DESCRIPTION
`transition_to`	Enforces transition logic to guarantee the phase-gated execution flow.
`add_metrics`	Adds statistical metrics to the experiment configuration.
`add_covariates`	Adds baseline covariates to the experiment configuration.
`register_metric`	Conveniently registers a metric and appends it to the configuration.

Source code in src\xpyrment\core\experiment.py

def __init__(
    self,
    data: pd.DataFrame,
    treatment_col: str,
    id_col: Optional[str] = None,
    covariates: Optional[List[str]] = None,
):
    """Initializes a new Experiment orchestration container.

    Copies the input DataFrame to guarantee immutability of the source dataset during internal
    state transitions and potential data transformations (e.g., CUPED alignment or log scaling).

    Args:
        data (pd.DataFrame): The source DataFrame containing unit-level data.
        treatment_col (str): Name of the column designating experimental groups/arms.
        id_col (Optional[str]): Name of the column containing unique identifiers for each experimental unit.
            Required for certain operations like sequential analysis and user assignments.
        covariates (Optional[List[str]]): List of baseline covariates for balance checking or adjustments.

    Raises:
        ValueError: If `treatment_col` or `id_col` is not found in the input DataFrame columns.
    """
    self.data = data.copy()
    self.treatment_col = treatment_col
    self.id_col = id_col
    self.metrics: List[BaseMetric] = []
    self.covariates: List[str] = covariates or []
    self.metric_registry: Optional[Any] = None
    self.state = ExperimentState.CREATED

    if treatment_col not in self.data.columns:
        raise ValueError(f"Treatment column '{treatment_col}' not found in DataFrame.")
    if id_col and id_col not in self.data.columns:
        raise ValueError(f"ID column '{id_col}' not found in DataFrame.")

transition_to

transition_to(target_state: ExperimentState) -> None

Enforces transition logic to guarantee the phase-gated execution flow.

Uses the ordinal indices of ExperimentState members to verify that the transition is monotonically increasing (forward-only).

Mathematical/Logical Representation: Let $S$ be the ordered tuple of states: $$ S = (\text{CREATED}, \text{PLANNED}, \text{DESIGNED}, \text{RUNNING}, \text{ANALYZED}, \text{REPORTED}) $$ A state transition from state $s_1$ to state $s_2$ is valid if and only if: $$ \text{Index}(s_2) \ge \text{Index}(s_1) $$ Because the inequality is non-strict, the self-transition $s_1 = s_2 = \text{ANALYZED}$ is valid, which supports re-running statistical engines on the locked design data. The check constrains direction only: it does not require the target to be the immediately following state.

PARAMETER	DESCRIPTION
`target_state`	The state the experiment is attempting to transition into. TYPE: `ExperimentState`

RAISES	DESCRIPTION
`PhaseOrderError`	If a backwards state transition is attempted, or if transition is otherwise unauthorized.

Source code in src\xpyrment\core\experiment.py

def transition_to(self, target_state: ExperimentState) -> None:
    r"""Enforces transition logic to guarantee the phase-gated execution flow.

    Uses the ordinal indices of `ExperimentState` members to verify that the transition is
    monotonically increasing (forward-only).

    Mathematical/Logical Representation:
        Let $S$ be the ordered tuple of states:
        $$
        S = (\text{CREATED}, \text{PLANNED}, \text{DESIGNED}, \text{RUNNING}, \text{ANALYZED}, \text{REPORTED})
        $$
        A state transition from state $s_1$ to state $s_2$ is valid if and only if:
        $$
        \text{Index}(s_2) \ge \text{Index}(s_1)
        $$
        with a special exemption permitting $s_1 = \text{ANALYZED} \rightarrow s_2 = \text{ANALYZED}$ to support
        re-running statistical engines on the locked design data.

    Args:
        target_state (ExperimentState): The state the experiment is attempting to transition into.

    Raises:
        PhaseOrderError: If a backwards state transition is attempted, or if transition is otherwise unauthorized.
    """
    current_val = list(ExperimentState).index(self.state)
    target_val = list(ExperimentState).index(target_state)

    # Allow transitioning forward, or re-running analysis
    if target_val < current_val and not (
        self.state == ExperimentState.ANALYZED and target_state == ExperimentState.ANALYZED
    ):
        raise PhaseOrderError(
            f"Cannot transition backwards from {self.state} to {target_state}."
        )

    self.state = target_state
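
The forward-only rule can be exercised in isolation with a small stand-in. The names `Phase`, `PhaseError`, and `check_transition` below are hypothetical substitutes for `ExperimentState`, `PhaseOrderError`, and the index comparison in `transition_to`. The string values of the enum members are also an assumption.

```python
from enum import Enum


class Phase(Enum):
    """Hypothetical stand-in for ExperimentState; definition order is lifecycle order."""

    CREATED = "CREATED"
    PLANNED = "PLANNED"
    DESIGNED = "DESIGNED"
    RUNNING = "RUNNING"
    ANALYZED = "ANALYZED"
    REPORTED = "REPORTED"


class PhaseError(Exception):
    """Hypothetical stand-in for PhaseOrderError."""


def check_transition(current: Phase, target: Phase) -> Phase:
    """Return the target phase if the move is not backwards, else raise."""
    order = list(Phase)  # Enum iteration follows definition order.
    if order.index(target) < order.index(current):
        raise PhaseError(f"Cannot transition backwards from {current} to {target}.")
    return target


# Re-running analysis: same index, so the non-strict inequality allows it.
rerun = check_transition(Phase.ANALYZED, Phase.ANALYZED)

# A forward jump that skips phases also passes an index-only check.
skipped = check_transition(Phase.CREATED, Phase.REPORTED)

# A backwards move is rejected.
try:
    check_transition(Phase.RUNNING, Phase.PLANNED)
    backwards_rejected = False
except PhaseError:
    backwards_rejected = True
```

The second call is worth noting: an index comparison on its own admits skipping intermediate phases, so any stricter step-by-step requirement would have to be enforced by the callers that trigger each transition.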

add_metrics

add_metrics(
    metrics: Union[BaseMetric, List[BaseMetric]],
) -> Experiment

Adds statistical metrics to the experiment configuration.

Successfully registering a metric moves the experiment from CREATED to PLANNED state, representing that the evaluation criteria have been defined prior to running designs, validations, or analyses.

PARAMETER	DESCRIPTION
`metrics`	A single metric object or a list of metrics (inheriting from `BaseMetric`) to bind to the experiment lifecycle. TYPE: `Union[BaseMetric, List[BaseMetric]]`

RETURNS	DESCRIPTION
`Experiment`	The experiment instance itself (for fluent API chaining). TYPE: `Experiment`

RAISES	DESCRIPTION
`PhaseOrderError`	If the experiment has already progressed past the `PLANNED` phase. This restriction prevents retrospectively adding metrics to match statistical noise (post-hoc metrics selection/p-hacking).

Source code in src\xpyrment\core\experiment.py

def add_metrics(self, metrics: Union[BaseMetric, List[BaseMetric]]) -> "Experiment":
    """Adds statistical metrics to the experiment configuration.

    Successfully registering a metric moves the experiment from `CREATED` to `PLANNED` state, representing
    that the evaluation criteria have been defined prior to running designs, validations, or analyses.

    Args:
        metrics (Union[BaseMetric, List[BaseMetric]]): A single metric object or a list of metrics
            (inheriting from `BaseMetric`) to bind to the experiment lifecycle.

    Returns:
        Experiment: The experiment instance itself (for fluent API chaining).

    Raises:
        PhaseOrderError: If the experiment has already progressed past the `PLANNED` phase. This restriction
            prevents retrospectively adding metrics to match statistical noise (post-hoc metrics selection/p-hacking).
    """
    if self.state not in [ExperimentState.CREATED, ExperimentState.PLANNED]:
        raise PhaseOrderError(
            f"Cannot add metrics while in state {self.state}. Must be in CREATED or PLANNED."
        )

    if isinstance(metrics, list):
        self.metrics.extend(metrics)
    else:
        self.metrics.append(metrics)

    # Transition the experiment from CREATED to PLANNED if metrics are added
    if self.state == ExperimentState.CREATED:
        self.transition_to(ExperimentState.PLANNED)

    return self

add_covariates

add_covariates(names: Union[str, List[str]]) -> Experiment

Adds baseline covariates to the experiment configuration.

PARAMETER	DESCRIPTION
`names`	A single covariate column name or a list of names. TYPE: `Union[str, List[str]]`

RETURNS	DESCRIPTION
`Experiment`	The experiment instance itself (for fluent API chaining). TYPE: `Experiment`

Source code in src\xpyrment\core\experiment.py

def add_covariates(self, names: Union[str, List[str]]) -> "Experiment":
    """Adds baseline covariates to the experiment configuration.

    Args:
        names (Union[str, List[str]]): A single covariate column name or a list of names.

    Returns:
        Experiment: The experiment instance itself (for fluent API chaining).
    """
    if isinstance(names, list):
        for name in names:
            if name not in self.covariates:
                self.covariates.append(name)
    else:
        if names not in self.covariates:
            self.covariates.append(names)
    return self

register_metric

register_metric(
    name: str,
    metric_type: str = "mean",
    value_col: Optional[str] = None,
    covariate: Optional[str] = None,
    numerator_col: Optional[str] = None,
    denominator_col: Optional[str] = None,
    pre_numerator_col: Optional[str] = None,
    pre_denominator_col: Optional[str] = None,
) -> Experiment

Conveniently registers a metric and appends it to the configuration.

PARAMETER	DESCRIPTION
`name`	Unique descriptive name of the metric. TYPE: `str`
`metric_type`	Type of metric. Options: "mean", "proportion", "ratio". Defaults to "mean". TYPE: `str` DEFAULT: `'mean'`
`value_col`	Column name containing experiment period values. Defaults to the metric name. TYPE: `Optional[str]` DEFAULT: `None`
`covariate`	Pre-period covariate column name for CUPED. Defaults to None. TYPE: `Optional[str]` DEFAULT: `None`
`numerator_col`	Column containing numerator values for RatioMetric. TYPE: `Optional[str]` DEFAULT: `None`
`denominator_col`	Column containing denominator values for RatioMetric. TYPE: `Optional[str]` DEFAULT: `None`
`pre_numerator_col`	Pre-period numerator column for RatioMetric CUPED. TYPE: `Optional[str]` DEFAULT: `None`
`pre_denominator_col`	Pre-period denominator column for RatioMetric CUPED. TYPE: `Optional[str]` DEFAULT: `None`

RETURNS	DESCRIPTION
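`Experiment`	The experiment instance itself (for fluent API chaining). TYPE: `Experiment`

RAISES	DESCRIPTION
`ValueError`	If `metric_type` is not "mean", "proportion", or "ratio", or if `metric_type` is "ratio" and `numerator_col` or `denominator_col` is missing.
`PhaseOrderError`	If the experiment has already progressed past the `PLANNED` phase (raised by `add_metrics`).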

Source code in src\xpyrment\core\experiment.py

def register_metric(
    self,
    name: str,
    metric_type: str = "mean",
    value_col: Optional[str] = None,
    covariate: Optional[str] = None,
    numerator_col: Optional[str] = None,
    denominator_col: Optional[str] = None,
    pre_numerator_col: Optional[str] = None,
    pre_denominator_col: Optional[str] = None,
) -> "Experiment":
    """Conveniently registers a metric and appends it to the configuration.

    Args:
        name (str): Unique descriptive name of the metric.
        metric_type (str): Type of metric. Options: "mean", "proportion", "ratio". Defaults to "mean".
        value_col (str, optional): Column name containing experiment period values. Defaults to the metric name.
        covariate (str, optional): Pre-period covariate column name for CUPED. Defaults to None.
        numerator_col (str, optional): Column containing numerator values for RatioMetric.
        denominator_col (str, optional): Column containing denominator values for RatioMetric.
        pre_numerator_col (str, optional): Pre-period numerator column for RatioMetric CUPED.
        pre_denominator_col (str, optional): Pre-period denominator column for RatioMetric CUPED.

    Returns:
        Experiment: The experiment instance itself (for fluent API chaining).
    """
    # Resolve imports lazily to prevent circular imports
    from xpyrment.metrics.taxonomy import MeanMetric, ProportionMetric, RatioMetric

    m_type = metric_type.lower()
    if m_type == "mean":
        col = value_col if value_col is not None else name
        metric = MeanMetric(name, value_col=col, pre_period_col=covariate)
    elif m_type == "proportion":
        col = value_col if value_col is not None else name
        metric = ProportionMetric(name, value_col=col, pre_period_col=covariate)
    elif m_type == "ratio":
        if numerator_col is None or denominator_col is None:
            raise ValueError("Both 'numerator_col' and 'denominator_col' must be specified for ratio metrics.")
        metric = RatioMetric(
            name,
            numerator_col=numerator_col,
            denominator_col=denominator_col,
            pre_numerator_col=pre_numerator_col,
            pre_denominator_col=pre_denominator_col,
        )
    else:
        raise ValueError(f"Unknown metric_type: '{metric_type}'. Expected 'mean', 'proportion', or 'ratio'.")

    return self.add_metrics(metric)

ExperimentRegistry

ExperimentRegistry()

Manages immutable experiment specifications to prevent post-hoc changes (pre-registration).

By calculating a cryptographic SHA-256 signature of serialized, key-sorted experiment specifications, the registry provides an audit trail. Analysts can verify that the running parameters (such as target sample sizes, significance levels, and chosen metrics) precisely match the registered plan, preventing retrospective optimization of analysis parameters.

ATTRIBUTE	DESCRIPTION
`_registry`	Internal store mapping experiment IDs to their registered specification dictionaries and pre-computed hashes. TYPE: `Dict[str, Dict[str, Any]]`

Examples:

Example

>>> from xpyrment.core.registry import ExperimentRegistry
>>> registry = ExperimentRegistry()
>>> spec = {"primary_metric": "conversion_rate", "alpha": 0.05, "target_n": 10000}
>>> spec_hash = registry.register_spec("EXP-101", spec)
>>> len(spec_hash)
64
>>> registry.verify_spec("EXP-101", spec)
True
>>> modified_spec = {"primary_metric": "conversion_rate", "alpha": 0.10, "target_n": 10000}
>>> registry.verify_spec("EXP-101", modified_spec)
False

METHOD	DESCRIPTION
`register_spec`	Serializes the experiment specification, hashes it, and stores it in the registry.
`verify_spec`	Verifies if the current spec_dict matches the registered hash to prevent p-hacking.

Source code in src\xpyrment\core\registry.py

def __init__(self):
    """Initializes an empty registry store."""
    self._registry: Dict[str, Dict[str, Any]] = {}

register_spec

register_spec(
    experiment_id: str, spec_dict: Dict[str, Any]
) -> str

Serializes the experiment specification, hashes it, and stores it in the registry.

Ensures that dictionaries are serialized with sorted keys to maintain deterministic hashing across systems, irrespective of key-insertion order.

Mathematical Representation

Let $S$ be the key-sorted JSON serialization of spec_dict (as produced by json.dumps with sort_keys=True and default separators), encoded in UTF-8. The registered hash $H$ is: $$ H = \text{SHA256}(S) $$

PARAMETER	DESCRIPTION
`experiment_id`	Unique identifier of the experiment. TYPE: `str`
`spec_dict`	Structural parameters representing the experiment plan, including registered metrics, statistical thresholds ($\alpha, \beta$), and design configurations. TYPE: `Dict[str, Any]`

RETURNS	DESCRIPTION
`str`	The hexadecimal representation of the SHA-256 signature hash. TYPE: `str`

Source code in src\xpyrment\core\registry.py

def register_spec(self, experiment_id: str, spec_dict: Dict[str, Any]) -> str:
    r"""Serializes the experiment specification, hashes it, and stores it in the registry.

    Ensures that dictionaries are serialized with sorted keys to maintain deterministic
    hashing across systems, irrespective of key-insertion order.

    Mathematical Representation:
        Let $S$ be the key-sorted JSON serialization of `spec_dict` (default `json.dumps` separators), encoded in UTF-8.
        The registered hash $H$ is:
        $$
        H = \text{SHA256}(S)
        $$
    Args:
        experiment_id (str): Unique identifier of the experiment.
        spec_dict (Dict[str, Any]): Structural parameters representing the experiment plan,
            including registered metrics, statistical thresholds ($\alpha, \beta$), and design configurations.

    Returns:
        str: The hexadecimal representation of the SHA-256 signature hash.
    """
    serialized = json.dumps(spec_dict, sort_keys=True)
    spec_hash = hashlib.sha256(serialized.encode("utf-8")).hexdigest()

    self._registry[experiment_id] = {
        "spec": spec_dict,
        "hash": spec_hash,
    }
    return spec_hash
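
The effect of key-sorted serialization can be checked directly with the standard library. The helper name `spec_fingerprint` is hypothetical. It mirrors the two calls shown in the listing above (`json.dumps(..., sort_keys=True)` followed by `hashlib.sha256`) without touching the registry.

```python
import hashlib
import json


def spec_fingerprint(spec: dict) -> str:
    """Hash a specification the same way the listing above does (illustrative helper)."""
    serialized = json.dumps(spec, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


# Same content, different key-insertion order.
spec_a = {"alpha": 0.05, "primary_metric": "conversion_rate", "target_n": 10000}
spec_b = {"target_n": 10000, "primary_metric": "conversion_rate", "alpha": 0.05}

# One parameter changed after the fact.
spec_c = {"alpha": 0.10, "primary_metric": "conversion_rate", "target_n": 10000}

hash_a = spec_fingerprint(spec_a)
hash_b = spec_fingerprint(spec_b)
hash_c = spec_fingerprint(spec_c)

same_despite_order = hash_a == hash_b   # key order does not matter
differs_on_change = hash_a != hash_c    # any value change alters the digest
```

Values that `json.dumps` cannot encode (for example a `set`) raise `TypeError`, so a specification has to consist of JSON-compatible types before it is hashed this way.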

verify_spec

verify_spec(
    experiment_id: str, spec_dict: Dict[str, Any]
) -> bool

Verifies if the current spec_dict matches the registered hash to prevent p-hacking.

Re-hashes the incoming specification dictionary using key-sorted serialization and compares the result against the stored hash for the given experiment ID. The comparison is ordinary string equality (==), not a constant-time comparison.

PARAMETER	DESCRIPTION
`experiment_id`	Registered ID of the experiment to verify. TYPE: `str`
`spec_dict`	The active specification dictionary to validate. TYPE: `Dict[str, Any]`

RETURNS	DESCRIPTION
`bool`	True if the current specification matches the pre-registered specification exactly, False if there is a mismatch or if the experiment ID was never registered. TYPE: `bool`

Source code in src\xpyrment\core\registry.py

def verify_spec(self, experiment_id: str, spec_dict: Dict[str, Any]) -> bool:
    """Verifies if the current spec_dict matches the registered hash to prevent p-hacking.

    Re-hashes the incoming specification dictionary using key-sorted serialization and compares
    the result against the stored hash for the given experiment ID (ordinary string equality,
    not a constant-time comparison).

    Args:
        experiment_id (str): Registered ID of the experiment to verify.
        spec_dict (Dict[str, Any]): The active specification dictionary to validate.

    Returns:
        bool: True if the current specification matches the pre-registered specification exactly,
            False if there is a mismatch or if the experiment ID was never registered.
    """
    if experiment_id not in self._registry:
        return False

    serialized = json.dumps(spec_dict, sort_keys=True)
    current_hash = hashlib.sha256(serialized.encode("utf-8")).hexdigest()

    return current_hash == self._registry[experiment_id]["hash"]
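
If a timing-independent comparison is wanted, the standard library provides `hmac.compare_digest`. The sketch below is an assumption about how such a variant could look, not the library's code. The function name `hashes_match` is hypothetical.

```python
import hashlib
import hmac
import json


def hashes_match(stored_hash: str, spec: dict) -> bool:
    """Recompute the spec hash and compare it to the stored one in constant time."""
    serialized = json.dumps(spec, sort_keys=True)
    current_hash = hashlib.sha256(serialized.encode("utf-8")).hexdigest()
    # compare_digest accepts two ASCII strings; hex digests satisfy that.
    return hmac.compare_digest(current_hash, stored_hash)


registered_spec = {"primary_metric": "conversion_rate", "alpha": 0.05, "target_n": 10000}
stored = hashlib.sha256(
    json.dumps(registered_spec, sort_keys=True).encode("utf-8")
).hexdigest()

unchanged_ok = hashes_match(stored, registered_spec)
tampered_ok = hashes_match(stored, {**registered_spec, "alpha": 0.10})
```

For an in-process audit check the timing difference rarely matters. The variant is mainly relevant if the stored hash is treated as a secret.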

ExperimentState

Bases: Enum

Enforces the phase-gated state of the experimental lifecycle.

This enum acts as the source of truth for the state machine within Experiment. State transitions are restricted to a forward-only unidirectional progression, preventing common experimental malpractices such as p-hacking, retrospective hypothesis creation, or design tampering.

State Diagram & Authorized Transitions:

stateDiagram-v2
    [*] --> CREATED : Initialization
    CREATED --> PLANNED : add_metrics()
    PLANNED --> DESIGNED : design() / do_doe()
    DESIGNED --> RUNNING : start_experiment()
    RUNNING --> ANALYZED : run_analysis()
    ANALYZED --> ANALYZED : re-run (with frozen data)
    ANALYZED --> REPORTED : compile_report()
    REPORTED --> [*]

States Description

CREATED: The experiment has been instantiated with raw data and a designated treatment column. No configuration or planning has occurred yet.
PLANNED: Hypotheses have been bound, primary and secondary metrics have been assigned, and power calculation (required sample size and Minimum Detectable Effect) has been performed.
DESIGNED: Randomization scheme, traffic split fractions, ramp-up schedules, or classical factorial designs (DoE matrices) have been generated and locked.
RUNNING: The experiment is actively ingesting live assignment and telemetry data. Sequential monitoring (e.g., mSPRT) or early-stopping boundaries are actively checked.
ANALYZED: Ingested data is locked, and statistical inference engines (frequentist, Bayesian, or sequential) have run. CUPED variance reduction and multi-comparison corrections are finalized.
REPORTED: The full lifecycle audit trail, key metrics, and decision recommendations have been serialized into an immutable Experiment Card or exported (JSON/PDF).

Exceptions & Gating:

- Any attempt to transition backwards in the sequence (e.g., from RUNNING to PLANNED to add a new metric) will raise a PhaseOrderError.
- Re-running analysis in the ANALYZED state to adjust statistical parameters (e.g., alpha, multiple comparison correction method) is authorized without violating state order rules, provided the underlying experimental design remains frozen.

MetricResult

Bases: TypedDict

The canonical data schema representing the output of a statistical metric analysis.

This TypedDict establishes a contract for all inference engines (frequentist, Bayesian, and sequential) and reporting utilities, ensuring that every calculated metric contains both descriptive statistics and rigorous statistical validation metrics.

ATTRIBUTE	DESCRIPTION
`metric_name`	The unique identifier assigned to the analyzed metric. TYPE: `str`
`metric_type`	The standardized type string (e.g., "Mean", "Proportion", "Ratio", "Revenue"). TYPE: `str`
`control_mean`	The sample mean ($\bar{Y}_C$) or proportion ($p_C$) calculated for the control group. TYPE: `float`
`treatment_mean`	The sample mean ($\bar{Y}_T$) or proportion ($p_T$) calculated for the treatment group. TYPE: `float`
`control_var`	The sample variance ($s^2_C$) calculated for the control group. For ratios, this represents the Delta-method approximated variance. TYPE: `float`
`treatment_var`	The sample variance ($s^2_T$) calculated for the treatment group. For ratios, this represents the Delta-method approximated variance. TYPE: `float`
`control_n`	The total count of unique units in the control group ($N_C$). TYPE: `int`
`treatment_n`	The total count of unique units in the treatment group ($N_T$). TYPE: `int`
`absolute_difference`	The point estimate of the absolute treatment effect: $$ \Delta = \bar{Y}_T - \bar{Y}_C $$ TYPE: `float`
`relative_lift`	The relative change of the treatment mean with respect to the control mean, expressed as a fraction (multiply by 100 for a percentage): $$ \text{Lift} = \frac{\bar{Y}_T - \bar{Y}_C}{\bar{Y}_C} $$ TYPE: `float`
`cuped_applied`	True if CUPED (Controlled-experiment Using Pre-Experiment Data) was applied to adjust the variance of this metric. False otherwise. TYPE: `bool`
`variance_reduction`	The fractional reduction in variance achieved by CUPED, bounded in $[0, 1)$: $$ \text{Reduction} = 1 - \frac{\text{Var}(Y_{\text{CUPED}})}{\text{Var}(Y_{\text{original}})} $$ TYPE: `float`
`p_value`	The statistical p-value associated with the hypothesis test. For frequentist, this represents the probability of observing a test statistic at least as extreme as the one computed, under the null hypothesis ($H_0$). TYPE: `float`
`ci_lower`	The lower bound of the absolute confidence/credible interval at the $(1 - \alpha)$ confidence level. TYPE: `float`
`ci_upper`	The upper bound of the absolute confidence/credible interval at the $(1 - \alpha)$ confidence level. TYPE: `float`
`rel_ci_lower`	The lower bound of the relative confidence/credible interval, scaled relative to the control mean: $$ \text{Rel CI Lower} = \frac{\text{CI Lower}}{\bar{Y}_C} $$ TYPE: `float`
`rel_ci_upper`	The upper bound of the relative confidence/credible interval, scaled relative to the control mean: $$ \text{Rel CI Upper} = \frac{\text{CI Upper}}{\bar{Y}_C} $$ TYPE: `float`
`power`	The statistical power ($1 - \beta$) achieved by the sample size, denoting the probability of correctly rejecting the null hypothesis when the true treatment effect equals the observed difference. TYPE: `float`
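
As a rough illustration of how the effect-size fields relate to one another, the sketch below derives `absolute_difference`, `relative_lift`, and the interval bounds from summary statistics. The helper name `summarize_effect`, the normal-approximation interval, and the input numbers are assumptions made for this example, not the library's implementation.

```python
from statistics import NormalDist


def summarize_effect(control_mean, treatment_mean, control_var, treatment_var,
                     control_n, treatment_n, alpha=0.05):
    """Hypothetical helper: derive effect-size fields from summary statistics."""
    # Absolute difference: delta = Ybar_T - Ybar_C
    delta = treatment_mean - control_mean
    # Standard error of the difference of two independent sample means
    se = (control_var / control_n + treatment_var / treatment_n) ** 0.5
    # Two-sided normal-approximation interval (an assumption of this sketch)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    ci_lower, ci_upper = delta - z * se, delta + z * se
    return {
        "absolute_difference": delta,
        "relative_lift": delta / control_mean,
        "ci_lower": ci_lower,
        "ci_upper": ci_upper,
        # Relative bounds are the absolute bounds scaled by the control mean
        "rel_ci_lower": ci_lower / control_mean,
        "rel_ci_upper": ci_upper / control_mean,
    }


result = summarize_effect(
    control_mean=10.0, treatment_mean=10.5,
    control_var=4.0, treatment_var=4.0,
    control_n=1000, treatment_n=1000,
)
```

The scaling by the control mean only makes sense when that mean is nonzero; with a negative control mean the relative bounds would also swap order.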

ExecutionProfiler

ExecutionProfiler(
    stage_name: str, logger: Optional[Logger] = None
)

Context manager and decorator tracking processing stages, execution duration, and peak memory usage.

In production, fine-grained profiling statistics help detect performance bottlenecks, resource hot spots, and memory leaks, especially in bootstrap, MCMC, or large-scale matrix solver stages.

PARAMETER	DESCRIPTION
`stage_name`	Identifier label of the current processing stage (e.g., "bootstrap_resampling"). TYPE: `str`
`logger`	Custom target logger instance. Defaults to None, in which case the shared telemetry logger returned by `get_logger()` is used. TYPE: `Optional[Logger]` DEFAULT: `None`

METHOD	DESCRIPTION
`__enter__`	Enters the context boundary, starting `tracemalloc` memory tracing and a `time.perf_counter` timer.
`__exit__`	Exits the context boundary, stops the timer, records peak memory, and logs structured JSON telemetry metrics.
`__call__`	Allows an instance to be used as a decorator that profiles a regular Python function.

Source code in src\xpyrment\core\telemetry.py

def __init__(self, stage_name: str, logger: Optional[logging.Logger] = None):
    """Initializes the ExecutionProfiler.

    Args:
        stage_name (str): Identifier label of the current processing stage (e.g., "bootstrap_resampling").
        logger (Optional[logging.Logger]): Custom target logger instance. Defaults to None.
    """
    self.stage_name = stage_name
    self.logger = logger or get_logger()
    self.start_time: float = 0.0
    self.end_time: float = 0.0
    self.peak_memory_bytes: int = 0

__enter__

__enter__() -> ExecutionProfiler

Enters the context boundary, starting `tracemalloc` memory tracing and a `time.perf_counter` timer.

RETURNS	DESCRIPTION
`ExecutionProfiler`	Active instance. TYPE: `ExecutionProfiler`

Source code in src\xpyrment\core\telemetry.py

def __enter__(self) -> "ExecutionProfiler":
    """Enters the context boundary, starting tracemalloc memory tracing and a perf_counter timer.

    Returns:
        ExecutionProfiler: Active instance.
    """
    if not tracemalloc.is_tracing():
        tracemalloc.start()
    tracemalloc.clear_traces()

    self.start_time = time.perf_counter()
    return self
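
The memory side of `__enter__` relies on the standard-library `tracemalloc` module. The sketch below shows the same start, clear, and measure sequence in isolation, using only the standard library; the function name `measure_peak` and the workload are illustrative assumptions.

```python
import tracemalloc


def measure_peak(workload):
    """Hypothetical helper: return (result, peak_bytes) for a zero-argument callable."""
    if not tracemalloc.is_tracing():
        tracemalloc.start()
    # Discard earlier traces so the peak reflects only this workload
    tracemalloc.clear_traces()
    result = workload()
    # get_traced_memory() returns (current_bytes, peak_bytes)
    _current, peak = tracemalloc.get_traced_memory()
    return result, peak


total, peak_bytes = measure_peak(lambda: sum(list(range(100_000))))
```

As in the source above, tracing stays enabled after the measurement; calling `tracemalloc.stop()` afterwards is worth considering when the tracing overhead matters.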

__exit__

__exit__(exc_type: Any, exc_val: Any, exc_tb: Any) -> bool

Exits the context boundary, stops the timer, records peak memory, and logs structured JSON telemetry metrics.

PARAMETER	DESCRIPTION
`exc_type`	Exception type raised within the context. TYPE: `Any`
`exc_val`	Exception value. TYPE: `Any`
`exc_tb`	Traceback object. TYPE: `Any`

RETURNS	DESCRIPTION
`bool`	False to propagate any exceptions raised in the block. TYPE: `bool`

Source code in src\xpyrment\core\telemetry.py

def __exit__(self, exc_type: Any, exc_val: Any, exc_tb: Any) -> bool:
    """Exits the context boundary, stops the timer, records peak memory, and logs structured JSON telemetry metrics.

    Args:
        exc_type (Any): Exception type raised within the context.
        exc_val (Any): Exception value.
        exc_tb (Any): Traceback object.

    Returns:
        bool: False to propagate any exceptions raised in the block.
    """
    self.end_time = time.perf_counter()
    elapsed_seconds = self.end_time - self.start_time

    # Fetch peak memory from tracemalloc
    _, peak = tracemalloc.get_traced_memory()
    self.peak_memory_bytes = peak

    status = "SUCCESS" if exc_type is None else "FAILED"
    profile_metrics = {
        "stage": self.stage_name,
        "duration_seconds": elapsed_seconds,
        "peak_memory_kb": self.peak_memory_bytes / 1024.0,
        "status": status,
    }

    log_msg = f"Profile completed for stage '{self.stage_name}' with status {status}"

    if exc_type is not None:
        profile_metrics["error_type"] = exc_type.__name__
        profile_metrics["error_message"] = str(exc_val)
        self.logger.error(log_msg, extra={"extra_fields": profile_metrics})
    else:
        self.logger.info(log_msg, extra={"extra_fields": profile_metrics})

    return False  # Propagate standard exceptions
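
Because `__exit__` returns `False`, a failure inside the block is logged and then re-raised to the caller. A minimal stand-in class illustrates that contract; `RecordingContext` is hypothetical and only mimics the status bookkeeping and the return value.

```python
class RecordingContext:
    """Hypothetical stand-in that records a status and never swallows errors."""

    def __init__(self):
        self.status = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.status = "SUCCESS" if exc_type is None else "FAILED"
        return False  # falsy return value: the exception keeps propagating


ctx = RecordingContext()
propagated = False
try:
    with ctx:
        raise ValueError("bad input")
except ValueError:
    # The caller still sees the original exception
    propagated = True
```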

__call__

__call__(func: Any) -> Any

Allows an instance to be used as a decorator that profiles a regular Python function.

PARAMETER	DESCRIPTION
`func`	Callable target function to wrap. TYPE: `Any`

RETURNS	DESCRIPTION
`Any`	Decorated callable function. TYPE: `Any`

Source code in src\xpyrment\core\telemetry.py

def __call__(self, func: Any) -> Any:
    """Allows an instance to be used as a decorator that profiles a regular Python function.

    Args:
        func (Any): Callable target function to wrap.

    Returns:
        Any: Decorated callable function.
    """
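    import functools  # local import keeps this excerpt self-contained

    @functools.wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        with self:
            return func(*args, **kwargs)
    return wrapper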
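
Decorator support of this kind can be sketched without the library. The `StageTimer` class below is a hypothetical, timing-only stand-in; it applies `functools.wraps` so the wrapped function keeps its name and docstring.

```python
import functools
import time


class StageTimer:
    """Hypothetical timing-only stand-in usable as context manager or decorator."""

    def __init__(self, stage_name):
        self.stage_name = stage_name
        self.elapsed = 0.0

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.elapsed = time.perf_counter() - self._start
        return False

    def __call__(self, func):
        @functools.wraps(func)  # keep func.__name__ and func.__doc__
        def wrapper(*args, **kwargs):
            with self:
                return func(*args, **kwargs)
        return wrapper


timer = StageTimer("sum_squares")


@timer
def sum_squares(n):
    """Return the sum of squares below n."""
    return sum(i * i for i in range(n))


value = sum_squares(10)
```

One design consequence is that a decorated function shares a single instance across calls, so recursive or concurrent calls overwrite each other's timing state.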

make_serializable

make_serializable(obj: Any) -> Any

Recursively converts NumPy and other non-serializable objects to native, JSON-compatible Python types.

PARAMETER	DESCRIPTION
`obj`	The nested object or value to convert. TYPE: `Any`

RETURNS	DESCRIPTION
`Any`	Standard Python dictionary, list, float, int, bool, string, or None. TYPE: `Any`

Source code in src\xpyrment\core\serialization.py

def make_serializable(obj: Any) -> Any:
    """Recursively converts NumPy and other non-serializable objects to native, JSON-compatible types.

    Args:
        obj (Any): The nested object or value to convert.

    Returns:
        Any: Standard Python dictionary, list, float, int, bool, string, or None.
    """
    import numpy as np

    if isinstance(obj, dict):
        return {str(k): make_serializable(v) for k, v in obj.items()}
    elif isinstance(obj, (list, tuple, set)):
        return [make_serializable(x) for x in obj]
    elif isinstance(obj, np.ndarray):
        return make_serializable(obj.tolist())
    elif isinstance(obj, (np.bool_, bool)):
        # Checked before int: bool is a subclass of int, so True would otherwise become 1
        return bool(obj)
    elif isinstance(obj, (np.integer, int)):
        return int(obj)
    elif isinstance(obj, (np.floating, float)):
        val = float(obj)
        if np.isnan(val) or np.isinf(val):
            return str(val)  # JSON has no NaN/Infinity literals; keep them as strings
        return val
    elif obj is None:
        return None
    elif hasattr(obj, "to_dict"):
        try:
            # Convert the result too, since it may still hold NumPy values
            return make_serializable(obj.to_dict())
        except Exception:
            return str(obj)
    else:
        try:
            # Fallback for statsmodels results and other custom objects
            return str(obj)
        except Exception:
            return None
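
Two details of a conversion like this are easy to get wrong and can be checked with the standard library alone. In Python, `bool` is a subclass of `int`, so the boolean test has to run before the integer test if booleans are to stay booleans. In addition, `json.dumps` emits the non-standard tokens `NaN` and `Infinity` unless non-finite floats are replaced first. The sketch below is a NumPy-free illustration; `to_native` is a hypothetical name, not the library function.

```python
import json
import math


def to_native(obj):
    """Hypothetical NumPy-free sketch of the recursive conversion."""
    if isinstance(obj, dict):
        return {str(k): to_native(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [to_native(x) for x in obj]
    if isinstance(obj, bool):  # must precede int: isinstance(True, int) is True
        return obj
    if isinstance(obj, int):
        return obj
    if isinstance(obj, float):
        # Strict JSON has no NaN/Infinity literals, so store them as strings
        return obj if math.isfinite(obj) else str(obj)
    if obj is None or isinstance(obj, str):
        return obj
    return str(obj)


payload = to_native({"ok": True, "n": 3, "ratio": float("nan"), 1: (1.5, None)})
# allow_nan=False makes json.dumps raise ValueError if a NaN/Infinity slipped through
text = json.dumps(payload, allow_nan=False)
```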

serialize_to_json

serialize_to_json(
    obj: Any, indent: Optional[int] = None
) -> str

Converts a nested object recursively to serializable format and dumps it as a JSON string.

PARAMETER	DESCRIPTION
`obj`	The object to convert and serialize. TYPE: `Any`
`indent`	Indentation level for pretty-printing. TYPE: `Optional[int]` DEFAULT: `None`

RETURNS	DESCRIPTION
`str`	Validated JSON string. TYPE: `str`

Source code in src\xpyrment\core\serialization.py

def serialize_to_json(obj: Any, indent: Optional[int] = None) -> str:
    """Converts a nested object recursively to serializable format and dumps it as a JSON string.

    Args:
        obj (Any): The object to convert and serialize.
        indent (Optional[int]): Indentation level for pretty-printing.

    Returns:
        str: Validated JSON string.
    """
    return json.dumps(make_serializable(obj), indent=indent)

configure_telemetry

configure_telemetry(
    level: int = INFO, stream: Any = stdout
) -> Logger

Configures and registers the centralized JSON telemetry logger handlers.

PARAMETER	DESCRIPTION
`level`	Log filter level threshold (e.g. logging.INFO). Defaults to logging.INFO. TYPE: `int` DEFAULT: `INFO`
`stream`	Output stream target. Defaults to sys.stdout. TYPE: `Any` DEFAULT: `stdout`

RETURNS	DESCRIPTION
`Logger`	Configured telemetry Logger instance. TYPE: `Logger`

Source code in src\xpyrment\core\telemetry.py

def configure_telemetry(level: int = logging.INFO, stream: Any = sys.stdout) -> logging.Logger:
    """Configures and registers the centralized JSON telemetry logger handlers.

    Args:
        level (int): Log filter level threshold (e.g. logging.INFO). Defaults to logging.INFO.
        stream (Any): Output stream target. Defaults to sys.stdout.

    Returns:
        logging.Logger: Configured telemetry Logger instance.
    """
    logger = logging.getLogger("xpyrment.telemetry")
    logger.setLevel(level)
    logger.propagate = False

    # Remove duplicates
    for handler in list(logger.handlers):
        logger.removeHandler(handler)

    handler = logging.StreamHandler(stream)
    formatter = JSONFormatter()
    handler.setFormatter(formatter)
    logger.addHandler(handler)

    return logger
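
The `JSONFormatter` class is not shown in this section. As a sketch of how such a formatter could turn a record and its `extra_fields` payload into one JSON line, the example below defines a hypothetical `SketchJSONFormatter` and attaches it in the same way as `configure_telemetry`. The output keys (`level`, `message`) and the logger name `sketch.telemetry` are assumptions, not the library's format.

```python
import io
import json
import logging


class SketchJSONFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per log record."""

    def format(self, record):
        entry = {"level": record.levelname, "message": record.getMessage()}
        # Keys passed via extra={"extra_fields": {...}} become record attributes
        entry.update(getattr(record, "extra_fields", {}))
        return json.dumps(entry)


stream = io.StringIO()
logger = logging.getLogger("sketch.telemetry")
logger.setLevel(logging.INFO)
logger.propagate = False
for old_handler in list(logger.handlers):  # avoid duplicate output on re-run
    logger.removeHandler(old_handler)
handler = logging.StreamHandler(stream)
handler.setFormatter(SketchJSONFormatter())
logger.addHandler(handler)

logger.info(
    "Profile completed",
    extra={"extra_fields": {"stage": "demo", "status": "SUCCESS"}},
)
logged = json.loads(stream.getvalue())
```

Writing to an `io.StringIO` stream instead of `sys.stdout` is a convenient way to inspect the emitted JSON in tests.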

get_logger

get_logger() -> Logger

Returns the centralized JSON telemetry logger instance.

Ensures the logger is fully configured with default handlers if unconfigured.

RETURNS	DESCRIPTION
`Logger`	Active telemetry Logger. TYPE: `Logger`

Source code in src\xpyrment\core\telemetry.py

def get_logger() -> logging.Logger:
    """Returns the centralized JSON telemetry logger instance.

    Ensures the logger is fully configured with default handlers if unconfigured.

    Returns:
        logging.Logger: Active telemetry Logger.
    """
    logger = logging.getLogger("xpyrment.telemetry")
    if not logger.handlers:
        configure_telemetry()
    return logger
