Core Module

The xpyrment.core module groups the core submodules, classes, functions, and type aliases of the xpyrment package.

core

Core engine abstractions, state management, exception classes, and shared types.

This package provides the foundational structural mechanisms for the xpyrment package:
  • Experiment: The central orchestration state container that governs execution.
  • ExperimentState: The rigid phase-gating mechanism (CREATED -> PLANNED -> DESIGNED -> RUNNING -> ANALYZED -> REPORTED).
  • ExperimentRegistry: Cryptographic hashing and pre-registration validator to prevent post-hoc changes.
  • Custom Exceptions: Robust, informative error feedback to protect experimental integrity (PhaseOrderError, SRMError, AliasError).
  • Strict Typing schemas: Standardized TypedDict representation (MetricResult) of calculation outputs.

MODULE DESCRIPTION
exceptions

Custom exception classes for the xpyrment core system.

experiment

Central orchestrator for experiment setup, configuration, and phase management.

registry

Preregistration registry for locking and verifying experiment specifications.

serialization

Robust serialization utilities to guarantee native Python and JSON compatibility (Block 52).

state

State machine and phase-gating representation for experiment lifecycles.

telemetry

Centralized Telemetry, Structured JSON Logging, and Execution Profiling (Block 56).

types

Core type definitions, TypedDicts, and Literals for the xpyrment library.

CLASS DESCRIPTION
PhaseOrderError

Raised when an operation is performed in an invalid state/phase.

SRMError

Raised when a Sample Ratio Mismatch (SRM) is detected during validation.

AliasError

Raised when fractional factorial alias confounding is violated or misconfigured.

Experiment

The central orchestration class for setting up, configuring, and executing experiments.

ExperimentRegistry

Manages immutable experiment specifications to prevent post-hoc changes (pre-registration).

ExperimentState

Enforces the phase-gated state of the experimental lifecycle.

MetricResult

The canonical data schema representing the output of a statistical metric analysis.

ExecutionProfiler

Context manager and decorator tracking processing stages, execution duration, and peak memory usage.

FUNCTION DESCRIPTION
make_serializable

Recursively converts numpy and non-serializable objects to native, standard JSON-compliant types.

serialize_to_json

Converts a nested object recursively to serializable format and dumps it as a JSON string.

configure_telemetry

Configures and registers the centralized JSON telemetry logger handlers.

get_logger

Returns the centralized JSON telemetry logger instance.

ATTRIBUTE DESCRIPTION
MetricType

Literal representing the supported category of metrics.

MetricType module-attribute

MetricType = Literal[
    "mean", "proportion", "ratio", "revenue"
]

Literal representing the supported category of metrics.

Supported Types
  • "mean": A continuous or discrete numeric metric where statistics are calculated on a per-unit basis (e.g., average sessions per user, average page views).
  • "proportion": A binary rate metric representing yes/no outcomes on a per-unit basis, equivalent to a Bernoulli trial (e.g., conversion rate, click-through-rate where the unit of analysis is the user).
  • "ratio": An aggregated metric computed as the sum of a numerator divided by the sum of a denominator across all units (e.g., global Click-Through-Rate = total clicks / total impressions). Requires Delta Method for proper variance approximation.
  • "revenue": A highly skewed continuous monetary metric (e.g., revenue per user, average order value). Often subject to log-transformations or specialized variance reduction.
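
The Delta Method mentioned for "ratio" metrics can be sketched as follows. This is a minimal illustration of the standard first-order approximation for a ratio of means, not the library's implementation; the function name `ratio_delta_variance` and the sample data are hypothetical.

```python
from statistics import mean


def ratio_delta_variance(numerators, denominators):
    """Return (ratio, approximate variance of the ratio) via the Delta Method."""
    n = len(numerators)
    if n != len(denominators) or n < 2:
        raise ValueError("Need two equal-length samples with at least 2 units.")
    mean_num, mean_den = mean(numerators), mean(denominators)
    # Unbiased per-unit sample variances and covariance.
    var_num = sum((x - mean_num) ** 2 for x in numerators) / (n - 1)
    var_den = sum((y - mean_den) ** 2 for y in denominators) / (n - 1)
    cov = sum(
        (x - mean_num) * (y - mean_den) for x, y in zip(numerators, denominators)
    ) / (n - 1)
    ratio = mean_num / mean_den
    # First-order Taylor expansion of mean_num / mean_den around the true means.
    variance = (var_num - 2 * ratio * cov + ratio ** 2 * var_den) / (n * mean_den ** 2)
    return ratio, variance


# Hypothetical per-user clicks and impressions.
clicks = [1, 0, 3, 2, 1]
impressions = [10, 8, 20, 15, 12]
ratio, variance = ratio_delta_variance(clicks, impressions)
```

The covariance term is what distinguishes this from treating the ratio as a simple per-unit mean: when numerator and denominator move together, the approximate variance shrinks accordingly.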

PhaseOrderError

Bases: Exception

Raised when an operation is performed in an invalid state/phase.

This exception is a core mechanism of the phase-gated execution flow. It prevents:
  • Downstream actions (e.g., calling .analyze() or .report()) from being executed before upstream requirements (e.g., .design() or .validate()) are complete.
  • Upstream state reversals (e.g., transitioning back to CREATED or PLANNED once an experiment is already RUNNING or ANALYZED), which could lead to post-hoc configuration tampering or invalid statistical analysis.

Mathematical & Operational Context: The experimental lifecycle is governed by a strict, forward-only sequence of transitions: CREATED -> PLANNED -> DESIGNED -> RUNNING -> ANALYZED -> REPORTED. Any transition where index(target_state) < index(current_state) violates this unidirectional flow and triggers this error. Re-running the analysis on the same or new frozen data keeps the experiment in the ANALYZED state; the indices are equal, so no error is raised.

ATTRIBUTE DESCRIPTION
message

Explains the invalid state transition attempt and the active state.

TYPE: str

Examples:

Example
>>> from xpyrment.core.state import ExperimentState
>>> from xpyrment.core.exceptions import PhaseOrderError
>>> raise PhaseOrderError("Cannot transition backwards from RUNNING to PLANNED.")

SRMError

Bases: Exception

Raised when a Sample Ratio Mismatch (SRM) is detected during validation.

SRM occurs when the observed ratio of sample counts assigned to treatment arms significantly deviates from the pre-specified expected ratio (e.g., 50/50 split). An SRM is a critical indicator of data quality issues, selection bias, or bugs in the randomization/assignment mechanism.

Mathematical Background

A Pearson Chi-Square Goodness-of-Fit test is performed to evaluate the discrepancy: $$ \chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} $$ where \(O_i\) is the observed count in arm \(i\), \(E_i\) is the expected count under the planned split, and \(k\) is the number of arms. The test has \(k - 1\) degrees of freedom. This exception is raised if the resulting p-value is extremely small (typically p < 0.001), indicating that the deviation is highly unlikely to have occurred by chance alone.
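
For the common two-arm case, the test above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's validator: the helper name `srm_p_value_two_arms` and the counts are hypothetical, and the closed-form p-value only holds for one degree of freedom (two arms).

```python
import math


def srm_p_value_two_arms(observed_a, observed_b, expected_share_a=0.5):
    """Pearson chi-square goodness-of-fit statistic and p-value for two arms."""
    total = observed_a + observed_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi2 = (
        (observed_a - expected_a) ** 2 / expected_a
        + (observed_b - expected_b) ** 2 / expected_b
    )
    # With one degree of freedom, P(chi2_1 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts for a planned 50/50 split.
chi2, p_value = srm_p_value_two_arms(5000, 5400)
srm_detected = p_value < 0.001  # threshold quoted in the text above
```

With more than two arms the p-value needs the general chi-square survival function, which the standard library does not provide.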

Remediation Procedures
  1. Halt the experiment or analysis immediately.
  2. Audit the randomization and unit assignment pipeline for bugs.
  3. Check for data loss, telemetry issues, or delay in event ingestion pipelines.
  4. Validate that the assignment tracking log captures all users on first touch.

ATTRIBUTE DESCRIPTION
message

Explains the observed vs. expected sample counts and the p-value.

TYPE: str

AliasError

Bases: Exception

Raised when fractional factorial alias confounding is violated or misconfigured.

In classical fractional factorial Design of Experiments (DoE), only a fraction of all possible factor combinations is run. This results in confounding (aliasing), where the estimate of a specific main effect is mathematically indistinguishable from a multi-factor interaction term.

This exception is raised when:
  • A user tries to estimate an effect that is completely confounded with another active effect of equal or lower order (violating the specified design Resolution).
  • The defined alias structure does not match the actual combinations present in the design matrix.
  • The design resolution (III, IV, or V) is insufficient to support the hypothesis or interaction analysis requested.

Mathematical Context

Let \(X\) be the design matrix of the full model and \(\hat{\beta} = (X^T X)^{-1} X^T Y\) the least-squares parameter estimates. If the design is fractional, some columns of \(X\) are linear combinations of others, so \(X^T X\) is singular and unique solutions for all factors and interactions do not exist. Partition the columns into the fitted terms \(X_1\) (coefficients \(\beta_1\)) and the omitted terms \(X_2\) (coefficients \(\beta_2\)). The alias matrix \(A = (X_1^T X_1)^{-1} X_1^T X_2\) defines which terms are confounded: $$ E[\hat{\beta}_1] = \beta_1 + A \beta_2 $$ An AliasError prevents the system from proceeding with invalid or unresolvable confounding structures.
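
A small worked case may make the aliasing concrete. The sketch below builds a half-fraction \(2^{3-1}\) design from the generator C = A*B (chosen purely for illustration) and checks which columns coincide. It does not use any xpyrment API.

```python
from itertools import product

# Half-fraction 2^(3-1) design: A and B take all +/-1 combinations,
# and C is set by the generator C = A*B (an assumption for this example).
runs = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

col_a = [a for a, _, _ in runs]
col_c = [c for _, _, c in runs]
col_ab = [a * b for a, b, _ in runs]  # two-factor interaction A:B
col_bc = [b * c for _, b, c in runs]  # two-factor interaction B:C

# Identical columns mean the two effects cannot be estimated separately.
c_aliased_with_ab = col_c == col_ab
a_aliased_with_bc = col_a == col_bc
```

Because every main effect here shares a column with a two-factor interaction, this is a Resolution III design: main effects are only interpretable if those interactions are assumed negligible.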

ATTRIBUTE DESCRIPTION
message

Details the confounded factors or resolution constraint violated.

TYPE: str

Experiment

Experiment(
    data: DataFrame,
    treatment_col: str,
    id_col: Optional[str] = None,
    covariates: Optional[List[str]] = None,
)

The central orchestration class for setting up, configuring, and executing experiments.

The Experiment class binds the experimental dataset, defines treatment structures, maps the metric taxonomy, and strictly enforces state transitions across the execution lifecycle. Through the state-machine rules, it ensures that all calculations are performed sequentially and reproducibly, eliminating retrospective tampering or incorrect state usage.

ATTRIBUTE DESCRIPTION
data

A copy of the input DataFrame containing assignments and telemetry.

TYPE: DataFrame

treatment_col

The column in data identifying the treatment arm assignments.

TYPE: str

id_col

The column in data representing unique unit IDs.

TYPE: Optional[str]

metrics

List of metrics registered for statistical calculation.

TYPE: List[BaseMetric]

state

The current lifecycle phase of the experiment.

TYPE: ExperimentState

State Gating Mechanism

Execution functions across downstream submodules verify that the experiment is in the appropriate state before proceeding. For example, registering metrics with add_metrics() (or running power analysis) moves the state from CREATED to PLANNED, running randomization moves it from PLANNED to DESIGNED, and analyzing results requires a transition to ANALYZED.

Examples:

Example
>>> import pandas as pd
>>> from xpyrment import Experiment
>>> from xpyrment.metrics.taxonomy import MeanMetric
>>> df = pd.DataFrame({"user_id": [1, 2, 3], "group": ["control", "treatment", "control"], "revenue": [10.5, 12.0, 9.5]})
>>> exp = Experiment(df, treatment_col="group", id_col="user_id")
>>> exp.state
<ExperimentState.CREATED: 'CREATED'>
>>> metric = MeanMetric("Revenue Metric", value_col="revenue")
>>> _ = exp.add_metrics(metric)  # returns the Experiment; assign it so the doctest prints nothing
>>> exp.state
<ExperimentState.PLANNED: 'PLANNED'>

Copies the input DataFrame to guarantee immutability of the source dataset during internal state transitions and potential data transformations (e.g., CUPED alignment or log scaling).

PARAMETER DESCRIPTION
data

The source DataFrame containing unit-level data.

TYPE: DataFrame

treatment_col

Name of the column designating experimental groups/arms.

TYPE: str

id_col

Name of the column containing unique identifiers for each experimental unit. Required for certain operations like sequential analysis and user assignments.

TYPE: Optional[str] DEFAULT: None

covariates

List of baseline covariates for balance checking or adjustments.

TYPE: Optional[List[str]] DEFAULT: None

RAISES DESCRIPTION
ValueError

If treatment_col or id_col is not found in the input DataFrame columns.

METHOD DESCRIPTION
transition_to

Enforces transition logic to guarantee the phase-gated execution flow.

add_metrics

Adds statistical metrics to the experiment configuration.

add_covariates

Adds baseline covariates to the experiment configuration.

register_metric

Conveniently registers a metric and appends it to the configuration.

Source code in src\xpyrment\core\experiment.py
def __init__(
    self,
    data: pd.DataFrame,
    treatment_col: str,
    id_col: Optional[str] = None,
    covariates: Optional[List[str]] = None,
):
    """Initializes a new Experiment orchestration container.

    Copies the input DataFrame to guarantee immutability of the source dataset during internal
    state transitions and potential data transformations (e.g., CUPED alignment or log scaling).

    Args:
        data (pd.DataFrame): The source DataFrame containing unit-level data.
        treatment_col (str): Name of the column designating experimental groups/arms.
        id_col (Optional[str]): Name of the column containing unique identifiers for each experimental unit.
            Required for certain operations like sequential analysis and user assignments.
        covariates (Optional[List[str]]): List of baseline covariates for balance checking or adjustments.

    Raises:
        ValueError: If `treatment_col` or `id_col` is not found in the input DataFrame columns.
    """
    self.data = data.copy()
    self.treatment_col = treatment_col
    self.id_col = id_col
    self.metrics: List[BaseMetric] = []
    self.covariates: List[str] = covariates or []
    self.metric_registry: Optional[Any] = None
    self.state = ExperimentState.CREATED

    if treatment_col not in self.data.columns:
        raise ValueError(f"Treatment column '{treatment_col}' not found in DataFrame.")
    if id_col and id_col not in self.data.columns:
        raise ValueError(f"ID column '{id_col}' not found in DataFrame.")

transition_to

transition_to(target_state: ExperimentState) -> None

Enforces transition logic to guarantee the phase-gated execution flow.

Uses the ordinal indices of ExperimentState members to verify that the transition is monotonically increasing (forward-only).

Mathematical/Logical Representation: Let \(S\) be the ordered tuple of states: $$ S = (\text{CREATED}, \text{PLANNED}, \text{DESIGNED}, \text{RUNNING}, \text{ANALYZED}, \text{REPORTED}) $$ A state transition from state \(s_1\) to state \(s_2\) is valid if and only if: $$ \text{Index}(s_2) \ge \text{Index}(s_1) $$ Because the inequality is non-strict, self-transitions are valid; in particular, \(s_1 = \text{ANALYZED} \rightarrow s_2 = \text{ANALYZED}\) supports re-running statistical engines on the locked design data.

PARAMETER DESCRIPTION
target_state

The state the experiment is attempting to transition into.

TYPE: ExperimentState

RAISES DESCRIPTION
PhaseOrderError

If a backwards state transition is attempted, or if transition is otherwise unauthorized.
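
The index comparison described above can be sketched in isolation. `Phase`, `PhaseOrderError`, and `check_transition` below are stand-ins defined only for this example, so it runs without xpyrment installed.

```python
from enum import Enum


class Phase(Enum):
    """Stand-in for ExperimentState; definition order gives the ordinal index."""
    CREATED = "CREATED"
    PLANNED = "PLANNED"
    DESIGNED = "DESIGNED"
    RUNNING = "RUNNING"
    ANALYZED = "ANALYZED"
    REPORTED = "REPORTED"


class PhaseOrderError(Exception):
    """Stand-in for xpyrment's exception of the same name."""


def check_transition(current: Phase, target: Phase) -> Phase:
    """Return the target phase if Index(target) >= Index(current), else raise."""
    order = list(Phase)
    if order.index(target) < order.index(current):
        raise PhaseOrderError(
            f"Cannot transition backwards from {current.name} to {target.name}."
        )
    return target


state = check_transition(Phase.CREATED, Phase.PLANNED)          # forward: allowed
state_again = check_transition(Phase.ANALYZED, Phase.ANALYZED)  # equal index: allowed
```

Note that a check of this form also accepts forward jumps that skip phases (for example CREATED to REPORTED); it only rejects moves to a lower index.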

Source code in src\xpyrment\core\experiment.py
def transition_to(self, target_state: ExperimentState) -> None:
    r"""Enforces transition logic to guarantee the phase-gated execution flow.

    Uses the ordinal indices of `ExperimentState` members to verify that the transition is
    monotonically increasing (forward-only).

    Mathematical/Logical Representation:
        Let $S$ be the ordered tuple of states:
        $$
        S = (\text{CREATED}, \text{PLANNED}, \text{DESIGNED}, \text{RUNNING}, \text{ANALYZED}, \text{REPORTED})
        $$
        A state transition from state $s_1$ to state $s_2$ is valid if and only if:
        $$
        \text{Index}(s_2) \ge \text{Index}(s_1)
        $$
        with a special exemption permitting $s_1 = \text{ANALYZED} \rightarrow s_2 = \text{ANALYZED}$ to support
        re-running statistical engines on the locked design data.

    Args:
        target_state (ExperimentState): The state the experiment is attempting to transition into.

    Raises:
        PhaseOrderError: If a backwards state transition is attempted, or if transition is otherwise unauthorized.
    """
    current_val = list(ExperimentState).index(self.state)
    target_val = list(ExperimentState).index(target_state)

    # Allow transitioning forward, or re-running analysis
    if target_val < current_val and not (
        self.state == ExperimentState.ANALYZED and target_state == ExperimentState.ANALYZED
    ):
        raise PhaseOrderError(
            f"Cannot transition backwards from {self.state} to {target_state}."
        )

    self.state = target_state

add_metrics

add_metrics(
    metrics: Union[BaseMetric, List[BaseMetric]],
) -> Experiment

Adds statistical metrics to the experiment configuration.

Successfully registering a metric moves the experiment from CREATED to PLANNED state, representing that the evaluation criteria have been defined prior to running designs, validations, or analyses.

PARAMETER DESCRIPTION
metrics

A single metric object or a list of metrics (inheriting from BaseMetric) to bind to the experiment lifecycle.

TYPE: Union[BaseMetric, List[BaseMetric]]

RETURNS DESCRIPTION
Experiment

The experiment instance itself (for fluent API chaining).

TYPE: Experiment

RAISES DESCRIPTION
PhaseOrderError

If the experiment has already progressed past the PLANNED phase. This restriction prevents retrospectively adding metrics to match statistical noise (post-hoc metrics selection/p-hacking).

Source code in src\xpyrment\core\experiment.py
def add_metrics(self, metrics: Union[BaseMetric, List[BaseMetric]]) -> "Experiment":
    """Adds statistical metrics to the experiment configuration.

    Successfully registering a metric moves the experiment from `CREATED` to `PLANNED` state, representing
    that the evaluation criteria have been defined prior to running designs, validations, or analyses.

    Args:
        metrics (Union[BaseMetric, List[BaseMetric]]): A single metric object or a list of metrics
            (inheriting from `BaseMetric`) to bind to the experiment lifecycle.

    Returns:
        Experiment: The experiment instance itself (for fluent API chaining).

    Raises:
        PhaseOrderError: If the experiment has already progressed past the `PLANNED` phase. This restriction
            prevents retrospectively adding metrics to match statistical noise (post-hoc metrics selection/p-hacking).
    """
    if self.state not in [ExperimentState.CREATED, ExperimentState.PLANNED]:
        raise PhaseOrderError(
            f"Cannot add metrics while in state {self.state}. Must be in CREATED or PLANNED."
        )

    if isinstance(metrics, list):
        self.metrics.extend(metrics)
    else:
        self.metrics.append(metrics)

    # Transition the experiment from CREATED to PLANNED if metrics are added
    if self.state == ExperimentState.CREATED:
        self.transition_to(ExperimentState.PLANNED)

    return self

add_covariates

add_covariates(names: Union[str, List[str]]) -> Experiment

Adds baseline covariates to the experiment configuration.

PARAMETER DESCRIPTION
names

A single covariate column name or a list of names.

TYPE: Union[str, List[str]]

RETURNS DESCRIPTION
Experiment

The experiment instance itself (for fluent API chaining).

TYPE: Experiment

Source code in src\xpyrment\core\experiment.py
def add_covariates(self, names: Union[str, List[str]]) -> "Experiment":
    """Adds baseline covariates to the experiment configuration.

    Args:
        names (Union[str, List[str]]): A single covariate column name or a list of names.

    Returns:
        Experiment: The experiment instance itself (for fluent API chaining).
    """
    if isinstance(names, list):
        for name in names:
            if name not in self.covariates:
                self.covariates.append(name)
    else:
        if names not in self.covariates:
            self.covariates.append(names)
    return self

register_metric

register_metric(
    name: str,
    metric_type: str = "mean",
    value_col: Optional[str] = None,
    covariate: Optional[str] = None,
    numerator_col: Optional[str] = None,
    denominator_col: Optional[str] = None,
    pre_numerator_col: Optional[str] = None,
    pre_denominator_col: Optional[str] = None,
) -> Experiment

Conveniently registers a metric and appends it to the configuration.

PARAMETER DESCRIPTION
name

Unique descriptive name of the metric.

TYPE: str

metric_type

Type of metric. Options: "mean", "proportion", "ratio". Defaults to "mean".

TYPE: str DEFAULT: 'mean'

value_col

Column name containing experiment period values. Defaults to the metric name.

TYPE: Optional[str] DEFAULT: None

covariate

Pre-period covariate column name for CUPED. Defaults to None.

TYPE: Optional[str] DEFAULT: None

numerator_col

Column containing numerator values for RatioMetric.

TYPE: Optional[str] DEFAULT: None

denominator_col

Column containing denominator values for RatioMetric.

TYPE: Optional[str] DEFAULT: None

pre_numerator_col

Pre-period numerator column for RatioMetric CUPED.

TYPE: Optional[str] DEFAULT: None

pre_denominator_col

Pre-period denominator column for RatioMetric CUPED.

TYPE: Optional[str] DEFAULT: None

RETURNS DESCRIPTION
Experiment

The experiment instance itself (for fluent API chaining).

TYPE: Experiment

RAISES DESCRIPTION
ValueError

If metric_type is not "mean", "proportion", or "ratio", or if a ratio metric is requested without both numerator_col and denominator_col.

PhaseOrderError

If the experiment has already progressed past the PLANNED phase (propagated from add_metrics).

Source code in src\xpyrment\core\experiment.py
def register_metric(
    self,
    name: str,
    metric_type: str = "mean",
    value_col: Optional[str] = None,
    covariate: Optional[str] = None,
    numerator_col: Optional[str] = None,
    denominator_col: Optional[str] = None,
    pre_numerator_col: Optional[str] = None,
    pre_denominator_col: Optional[str] = None,
) -> "Experiment":
    """Conveniently registers a metric and appends it to the configuration.

    Args:
        name (str): Unique descriptive name of the metric.
        metric_type (str): Type of metric. Options: "mean", "proportion", "ratio". Defaults to "mean".
        value_col (str, optional): Column name containing experiment period values. Defaults to the metric name.
        covariate (str, optional): Pre-period covariate column name for CUPED. Defaults to None.
        numerator_col (str, optional): Column containing numerator values for RatioMetric.
        denominator_col (str, optional): Column containing denominator values for RatioMetric.
        pre_numerator_col (str, optional): Pre-period numerator column for RatioMetric CUPED.
        pre_denominator_col (str, optional): Pre-period denominator column for RatioMetric CUPED.

    Returns:
        Experiment: The experiment instance itself (for fluent API chaining).
    """
    # Resolve imports lazily to prevent circular imports
    from xpyrment.metrics.taxonomy import MeanMetric, ProportionMetric, RatioMetric

    m_type = metric_type.lower()
    if m_type == "mean":
        col = value_col if value_col is not None else name
        metric = MeanMetric(name, value_col=col, pre_period_col=covariate)
    elif m_type == "proportion":
        col = value_col if value_col is not None else name
        metric = ProportionMetric(name, value_col=col, pre_period_col=covariate)
    elif m_type == "ratio":
        if numerator_col is None or denominator_col is None:
            raise ValueError("Both 'numerator_col' and 'denominator_col' must be specified for ratio metrics.")
        metric = RatioMetric(
            name,
            numerator_col=numerator_col,
            denominator_col=denominator_col,
            pre_numerator_col=pre_numerator_col,
            pre_denominator_col=pre_denominator_col,
        )
    else:
        raise ValueError(f"Unknown metric_type: '{metric_type}'. Expected 'mean', 'proportion', or 'ratio'.")

    return self.add_metrics(metric)

ExperimentRegistry

ExperimentRegistry()

Manages immutable experiment specifications to prevent post-hoc changes (pre-registration).

By calculating a cryptographic SHA-256 signature of serialized, key-sorted experiment specifications, the registry provides an audit trail. Analysts can verify that the running parameters (such as target sample sizes, significance levels, and chosen metrics) precisely match the registered plan, preventing retrospective optimization of analysis parameters.

ATTRIBUTE DESCRIPTION
_registry

Internal store mapping experiment IDs to their registered specification dictionaries and pre-computed hashes.

TYPE: Dict[str, Dict[str, Any]]

Examples:

Example
>>> from xpyrment.core.registry import ExperimentRegistry
>>> registry = ExperimentRegistry()
>>> spec = {"primary_metric": "conversion_rate", "alpha": 0.05, "target_n": 10000}
>>> spec_hash = registry.register_spec("EXP-101", spec)
>>> len(spec_hash)
64
>>> registry.verify_spec("EXP-101", spec)
True
>>> modified_spec = {"primary_metric": "conversion_rate", "alpha": 0.10, "target_n": 10000}
>>> registry.verify_spec("EXP-101", modified_spec)
False

METHOD DESCRIPTION
register_spec

Serializes the experiment specification, hashes it, and stores it in the registry.

verify_spec

Verifies if the current spec_dict matches the registered hash to prevent p-hacking.

Source code in src\xpyrment\core\registry.py
def __init__(self):
    """Initializes an empty registry store."""
    self._registry: Dict[str, Dict[str, Any]] = {}

register_spec

register_spec(
    experiment_id: str, spec_dict: Dict[str, Any]
) -> str

Serializes the experiment specification, hashes it, and stores it in the registry.

Ensures that dictionaries are serialized with sorted keys to maintain deterministic hashing across systems, irrespective of key-insertion order.

Mathematical Representation

Let \(S\) be the key-sorted JSON serialization of spec_dict (json.dumps with sort_keys=True and default separators), encoded in UTF-8. The registered hash \(H\) is: $$ H = \text{SHA256}(S) $$

PARAMETER DESCRIPTION
experiment_id

Unique identifier of the experiment.

TYPE: str

spec_dict

Structural parameters representing the experiment plan, including registered metrics, statistical thresholds (\(\alpha, \beta\)), and design configurations.

TYPE: Dict[str, Any]

RETURNS DESCRIPTION
str

The hexadecimal representation of the SHA-256 signature hash.

TYPE: str

Source code in src\xpyrment\core\registry.py
def register_spec(self, experiment_id: str, spec_dict: Dict[str, Any]) -> str:
    r"""Serializes the experiment specification, hashes it, and stores it in the registry.

    Ensures that dictionaries are serialized with sorted keys to maintain deterministic
    hashing across systems, irrespective of key-insertion order.

    Mathematical Representation:
        Let $S$ be the key-sorted JSON serialization of `spec_dict` (default separators) encoded in UTF-8.
        The registered hash $H$ is:
        $$
        H = \text{SHA256}(S)
        $$

    Args:
        experiment_id (str): Unique identifier of the experiment.
        spec_dict (Dict[str, Any]): Structural parameters representing the experiment plan,
            including registered metrics, statistical thresholds ($\alpha, \beta$), and design configurations.

    Returns:
        str: The hexadecimal representation of the SHA-256 signature hash.
    """
    serialized = json.dumps(spec_dict, sort_keys=True)
    spec_hash = hashlib.sha256(serialized.encode("utf-8")).hexdigest()

    self._registry[experiment_id] = {
        "spec": spec_dict,
        "hash": spec_hash,
    }
    return spec_hash

verify_spec

verify_spec(
    experiment_id: str, spec_dict: Dict[str, Any]
) -> bool

Verifies if the current spec_dict matches the registered hash to prevent p-hacking.

Re-hashes the incoming specification dictionary using key-sorted serialization and compares the result against the stored hash for the given experiment ID.

PARAMETER DESCRIPTION
experiment_id

Registered ID of the experiment to verify.

TYPE: str

spec_dict

The active specification dictionary to validate.

TYPE: Dict[str, Any]

RETURNS DESCRIPTION
bool

True if the current specification matches the pre-registered specification exactly, False if there is a mismatch or if the experiment ID was never registered.

TYPE: bool
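
The hash-and-compare step can be sketched with the standard library alone. `spec_hash` and `hashes_match` are hypothetical helper names, not part of xpyrment. The listed source compares digests with `==`; `hmac.compare_digest` is used here as one standard-library option if a timing-safe comparison is wanted.

```python
import hashlib
import hmac
import json


def spec_hash(spec: dict) -> str:
    """SHA-256 hex digest of the key-sorted JSON serialization of spec."""
    serialized = json.dumps(spec, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


def hashes_match(stored_hash: str, spec: dict) -> bool:
    """Compare a stored digest with the digest of spec in constant time."""
    return hmac.compare_digest(stored_hash, spec_hash(spec))


registered = spec_hash(
    {"alpha": 0.05, "primary_metric": "conversion_rate", "target_n": 10000}
)
# Same content, different key-insertion order: the digest is unchanged.
reordered_spec = {"target_n": 10000, "primary_metric": "conversion_rate", "alpha": 0.05}
# A post-hoc change to alpha produces a different digest.
tampered_spec = {"target_n": 10000, "primary_metric": "conversion_rate", "alpha": 0.10}
```

Sorting keys makes the digest independent of dictionary insertion order, but values must already be JSON-serializable; objects such as NumPy scalars would need converting first.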

Source code in src\xpyrment\core\registry.py
def verify_spec(self, experiment_id: str, spec_dict: Dict[str, Any]) -> bool:
    """Verifies if the current spec_dict matches the registered hash to prevent p-hacking.

    Re-hashes the incoming specification dictionary using key-sorted serialization and compares
    the result against the stored hash for the given experiment ID.

    Args:
        experiment_id (str): Registered ID of the experiment to verify.
        spec_dict (Dict[str, Any]): The active specification dictionary to validate.

    Returns:
        bool: True if the current specification matches the pre-registered specification exactly,
            False if there is a mismatch or if the experiment ID was never registered.
    """
    if experiment_id not in self._registry:
        return False

    serialized = json.dumps(spec_dict, sort_keys=True)
    current_hash = hashlib.sha256(serialized.encode("utf-8")).hexdigest()

    return current_hash == self._registry[experiment_id]["hash"]

ExperimentState

Bases: Enum

Enforces the phase-gated state of the experimental lifecycle.

This enum acts as the source of truth for the state machine within Experiment. State transitions are restricted to a forward-only unidirectional progression, preventing common experimental malpractices such as p-hacking, retrospective hypothesis creation, or design tampering.

State Diagram & Authorized Transitions:

stateDiagram-v2
    [*] --> CREATED : Initialization
    CREATED --> PLANNED : add_metrics()
    PLANNED --> DESIGNED : design() / do_doe()
    DESIGNED --> RUNNING : start_experiment()
    RUNNING --> ANALYZED : run_analysis()
    ANALYZED --> ANALYZED : re-run (with frozen data)
    ANALYZED --> REPORTED : compile_report()
    REPORTED --> [*]

States Description
  • CREATED: The experiment has been instantiated with raw data and a designated treatment column. No configuration or planning has occurred yet.
  • PLANNED: Hypotheses have been bound, primary and secondary metrics have been assigned, and power calculation (required sample size and Minimum Detectable Effect) has been performed.
  • DESIGNED: Randomization scheme, traffic split fractions, ramp-up schedules, or classical factorial designs (DoE matrices) have been generated and locked.
  • RUNNING: The experiment is actively ingesting live assignment and telemetry data. Sequential monitoring (e.g., mSPRT) or early-stopping boundaries are actively checked.
  • ANALYZED: Ingested data is locked, and statistical inference engines (frequentist, Bayesian, or sequential) have run. CUPED variance reduction and multi-comparison corrections are finalized.
  • REPORTED: The full lifecycle audit trail, key metrics, and decision recommendations have been serialized into an immutable Experiment Card or exported (JSON/PDF).

Exceptions & Gating:
  • Any attempt to transition backwards in the sequence (e.g., from RUNNING to PLANNED to add a new metric) will raise a PhaseOrderError.
  • Re-running analysis in the ANALYZED state to adjust statistical parameters (e.g., alpha, multiple comparison correction method) is authorized without violating state order rules, provided the underlying experimental design remains frozen.

MetricResult

Bases: TypedDict

The canonical data schema representing the output of a statistical metric analysis.

This TypedDict establishes a contract for all inference engines (frequentist, Bayesian, and sequential) and reporting utilities, ensuring that every calculated metric contains both descriptive statistics and rigorous statistical validation metrics.

ATTRIBUTE DESCRIPTION
metric_name

The unique identifier assigned to the analyzed metric.

TYPE: str

metric_type

The standardized type string (e.g., "Mean", "Proportion", "Ratio", "Revenue").

TYPE: str

control_mean

The sample mean (\(\bar{Y}_C\)) or proportion (\(p_C\)) calculated for the control group.

TYPE: float

treatment_mean

The sample mean (\(\bar{Y}_T\)) or proportion (\(p_T\)) calculated for the treatment group.

TYPE: float

control_var

The sample variance (\(s^2_C\)) calculated for the control group. For ratios, this represents the Delta-method approximated variance.

TYPE: float

treatment_var

The sample variance (\(s^2_T\)) calculated for the treatment group. For ratios, this represents the Delta-method approximated variance.

TYPE: float

control_n

The total count of unique units in the control group (\(N_C\)).

TYPE: int

treatment_n

The total count of unique units in the treatment group (\(N_T\)).

TYPE: int

absolute_difference

The point estimate of the absolute treatment effect: $$ \Delta = \bar{Y}_T - \bar{Y}_C $$

TYPE: float

relative_lift

The relative increase or decrease of the treatment mean with respect to the control mean, expressed as a fraction (multiply by 100 for a percentage): $$ \text{Lift} = \frac{\bar{Y}_T - \bar{Y}_C}{\bar{Y}_C} $$

TYPE: float

cuped_applied

True if CUPED (Controlled-experiment Using Pre-Experiment Data) was applied to reduce the variance of this metric. False otherwise.

TYPE: bool

variance_reduction

The fraction of variance removed by CUPED, bounded in \([0, 1)\): $$ \text{Reduction} = 1 - \frac{\text{Var}(Y_{\text{CUPED}})}{\text{Var}(Y_{\text{original}})} $$

TYPE: float

p_value

The p-value associated with the hypothesis test. For frequentist engines, this is the probability of observing a test statistic at least as extreme as the one computed, under the null hypothesis (\(H_0\)).

TYPE: float

ci_lower

The lower bound of the absolute confidence/credible interval at the \((1 - \alpha)\) confidence level.

TYPE: float

ci_upper

The upper bound of the absolute confidence/credible interval at the \((1 - \alpha)\) confidence level.

TYPE: float

rel_ci_lower

The lower bound of the relative confidence/credible interval, scaled relative to the control mean: $$ \text{Rel CI Lower} = \frac{\text{CI Lower}}{\bar{Y}_C} $$

TYPE: float

rel_ci_upper

The upper bound of the relative confidence/credible interval, scaled relative to the control mean: $$ \text{Rel CI Upper} = \frac{\text{CI Upper}}{\bar{Y}_C} $$

TYPE: float

power

The statistical power (\(1 - \beta\)) achieved by the sample size, denoting the probability of correctly rejecting the null hypothesis when the true treatment effect equals the observed difference.

TYPE: float
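
One common way to obtain such an observed-effect power figure is a two-sided z-test approximation built from the variance and sample-size fields described above. The sketch below follows that approach. Whether xpyrment uses this exact formula is not stated in this section, so the function `observed_power` and its default `alpha` should be read as assumptions.

```python
from math import sqrt
from statistics import NormalDist


def observed_power(
    diff: float,
    control_var: float,
    treatment_var: float,
    control_n: int,
    treatment_n: int,
    alpha: float = 0.05,
) -> float:
    """Power of a two-sided z-test if the true effect equals `diff`."""
    std_normal = NormalDist()
    # Standard error of the difference between two independent sample means.
    se = sqrt(control_var / control_n + treatment_var / treatment_n)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = abs(diff) / se
    # Probability that the test statistic lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


power_large_effect = observed_power(0.5, 4.0, 4.0, 1000, 1000)
power_no_effect = observed_power(0.0, 4.0, 4.0, 1000, 1000)
```

When the assumed effect is zero, the expression reduces to the significance level \(\alpha\), which is a useful sanity check on the formula.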

ExecutionProfiler

ExecutionProfiler(
    stage_name: str, logger: Optional[Logger] = None
)

Context manager and decorator tracking processing stages, execution duration, and peak memory usage.

Per-stage timing and memory telemetry helps detect performance bottlenecks, resource hot spots, and memory leaks, especially in bootstrap, MCMC, or large-scale matrix solvers.

PARAMETER DESCRIPTION
stage_name

Identifier label of the current processing stage (e.g., "bootstrap_resampling").

TYPE: str

logger

Custom target logger instance. Defaults to None.

TYPE: Optional[Logger] DEFAULT: None

METHOD DESCRIPTION
__enter__

Enters the context boundary, starting tracemalloc memory tracing and a high-resolution timer.

__exit__

Exits the context boundary, records elapsed time and peak traced memory, and logs structured JSON telemetry metrics.

__call__

Allows an instance to be used as a decorator that profiles each call to the wrapped function.

Source code in src\xpyrment\core\telemetry.py
def __init__(self, stage_name: str, logger: Optional[logging.Logger] = None):
    """Initializes the ExecutionProfiler.

    Args:
        stage_name (str): Identifier label of the current processing stage (e.g., "bootstrap_resampling").
        logger (Optional[logging.Logger]): Custom target logger instance. Defaults to None.
    """
    self.stage_name = stage_name
    self.logger = logger or get_logger()
    self.start_time: float = 0.0
    self.end_time: float = 0.0
    self.peak_memory_bytes: int = 0

__enter__

__enter__() -> ExecutionProfiler

Enters the context boundary, starting tracemalloc memory tracing and a high-resolution timer.

RETURNS DESCRIPTION
ExecutionProfiler

Active instance.

TYPE: ExecutionProfiler

Source code in src\xpyrment\core\telemetry.py
def __enter__(self) -> "ExecutionProfiler":
    """Enters the context boundary, starting tracemalloc memory tracing and a high-resolution timer.

    Returns:
        ExecutionProfiler: Active instance.
    """
    if not tracemalloc.is_tracing():
        tracemalloc.start()
    tracemalloc.clear_traces()

    self.start_time = time.perf_counter()
    return self

__exit__

__exit__(exc_type: Any, exc_val: Any, exc_tb: Any) -> bool

Exits the context boundary, records elapsed time and peak traced memory, and logs structured JSON telemetry metrics.

PARAMETER DESCRIPTION
exc_type

Exception type raised within the context.

TYPE: Any

exc_val

Exception value.

TYPE: Any

exc_tb

Traceback object.

TYPE: Any

RETURNS DESCRIPTION
bool

False to propagate any exceptions raised in the block.

TYPE: bool

Source code in src\xpyrment\core\telemetry.py
def __exit__(self, exc_type: Any, exc_val: Any, exc_tb: Any) -> bool:
    """Exits the context boundary, records elapsed time and peak traced memory, and logs structured JSON telemetry metrics.

    Args:
        exc_type (Any): Exception type raised within the context.
        exc_val (Any): Exception value.
        exc_tb (Any): Traceback object.

    Returns:
        bool: False to propagate any exceptions raised in the block.
    """
    self.end_time = time.perf_counter()
    elapsed_seconds = self.end_time - self.start_time

    # Fetch peak memory from tracemalloc
    _, peak = tracemalloc.get_traced_memory()
    self.peak_memory_bytes = peak

    status = "SUCCESS" if exc_type is None else "FAILED"
    profile_metrics = {
        "stage": self.stage_name,
        "duration_seconds": elapsed_seconds,
        "peak_memory_kb": self.peak_memory_bytes / 1024.0,
        "status": status,
    }

    log_msg = f"Profile completed for stage '{self.stage_name}' with status {status}"

    if exc_type is not None:
        profile_metrics["error_type"] = exc_type.__name__
        profile_metrics["error_message"] = str(exc_val)
        self.logger.error(log_msg, extra={"extra_fields": profile_metrics})
    else:
        self.logger.info(log_msg, extra={"extra_fields": profile_metrics})

    return False  # Propagate standard exceptions

__call__

__call__(func: Any) -> Any

Allows an instance to be used as a decorator that profiles each call to the wrapped function.

PARAMETER DESCRIPTION
func

Callable target function to wrap.

TYPE: Any

RETURNS DESCRIPTION
Any

Decorated callable function.

TYPE: Any

Source code in src\xpyrment\core\telemetry.py
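import functools  # module-level import needed by the decorator below

def __call__(self, func: Any) -> Any:
    """Allows an instance to be used as a decorator that profiles each call to the wrapped function.

    Args:
        func (Any): Callable target function to wrap.

    Returns:
        Any: Decorated callable function.
    """
    @functools.wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        # Each call runs inside the profiler's own context manager.
        with self:
            return func(*args, **kwargs)
    return wrapper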
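
To make the two usage styles concrete, here is a self-contained sketch that mirrors the context-manager and decorator pattern of the methods listed above, without the logging step. `SketchProfiler` is a hypothetical stand-in written for this example, not the library class, and the attribute `elapsed_seconds` is an assumption of the sketch.

```python
import functools
import time
import tracemalloc


class SketchProfiler:
    """Hypothetical stand-in for the profiler pattern (timing + peak memory)."""

    def __init__(self, stage_name: str):
        self.stage_name = stage_name
        self.elapsed_seconds = 0.0
        self.peak_memory_bytes = 0

    def __enter__(self) -> "SketchProfiler":
        if not tracemalloc.is_tracing():
            tracemalloc.start()
        tracemalloc.clear_traces()  # drop traces recorded before this block
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> bool:
        self.elapsed_seconds = time.perf_counter() - self._start
        _, self.peak_memory_bytes = tracemalloc.get_traced_memory()
        return False  # let exceptions from the block propagate

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with self:
                return func(*args, **kwargs)
        return wrapper


# Context-manager form: profile an arbitrary block.
with SketchProfiler("allocate_list") as prof:
    data = [0] * 100_000

# Decorator form: profile every call to the function.
squares_profiler = SketchProfiler("sum_squares")


@squares_profiler
def sum_squares(n: int) -> int:
    return sum(i * i for i in range(n))


total = sum_squares(1_000)
```

Note that a single instance stores one set of measurements, so reusing it across nested or concurrent calls overwrites earlier values. The sketch also leaves tracemalloc running after the block exits.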

make_serializable

make_serializable(obj: Any) -> Any

Recursively converts numpy and non-serializable objects to native, standard JSON-compliant types.

PARAMETER DESCRIPTION
obj

The nested object or value to convert.

TYPE: Any

RETURNS DESCRIPTION
Any

Standard Python dictionary, list, float, int, bool, string, or None.

TYPE: Any

Source code in src\xpyrment\core\serialization.py
def make_serializable(obj: Any) -> Any:
    """Recursively converts numpy and non-serializable objects to native, standard JSON-compliant types.

    Args:
        obj (Any): The nested object or value to convert.

    Returns:
        Any: Standard Python dictionary, list, float, int, bool, string, or None.
    """
    import numpy as np

    if isinstance(obj, dict):
        return {str(k): make_serializable(v) for k, v in obj.items()}
    elif isinstance(obj, (list, tuple, set)):
        return [make_serializable(x) for x in obj]
    elif isinstance(obj, np.ndarray):
        return make_serializable(obj.tolist())
    elif isinstance(obj, (np.bool_, bool)):
        # Checked before the integer branch: bool is a subclass of int,
        # so True/False would otherwise be converted to 1/0.
        return bool(obj)
    elif isinstance(obj, (np.integer, int)):
        return int(obj)
    elif isinstance(obj, (np.floating, float)):
        val = float(obj)
        if np.isnan(val) or np.isinf(val):
            return str(val)  # String standard for portable audit trails
        return val
    elif obj is None:
        return None
    elif hasattr(obj, "to_dict"):
        try:
            # Recurse so numpy values inside the returned dict are converted too
            return make_serializable(obj.to_dict())
        except Exception:
            return str(obj)
    else:
        try:
            # Fallback for statsmodels results and other custom objects
            return str(obj)
        except Exception:
            return None
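
Two details of the conversion are easy to overlook: `bool` is a subclass of `int` in Python, so the order of the type checks matters, and non-finite floats have no representation in standard JSON. The standard-library-only sketch below illustrates both points. `to_json_safe` is a hypothetical helper for this example and omits the numpy and `to_dict` handling.

```python
import json
import math


def to_json_safe(value):
    """Stdlib-only sketch of recursive conversion to JSON-safe values."""
    # bool must be tested before int because isinstance(True, int) is True.
    if isinstance(value, bool):
        return value
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        # json.dumps would emit the non-standard tokens NaN / Infinity,
        # so non-finite floats are stored as strings instead.
        return value if math.isfinite(value) else str(value)
    if isinstance(value, dict):
        return {str(k): to_json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [to_json_safe(v) for v in value]
    return None if value is None else str(value)


payload = to_json_safe(
    {"flag": True, "count": 3, "ratio": float("nan"), 1: (1.5, float("inf"))}
)
# allow_nan=False makes json.dumps raise if a non-finite float slipped through.
encoded = json.dumps(payload, allow_nan=False)
```

Passing `allow_nan=False` is a cheap way to confirm that the converted structure contains only values that strict JSON parsers accept.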

serialize_to_json

serialize_to_json(
    obj: Any, indent: Optional[int] = None
) -> str

Converts a nested object recursively to serializable format and dumps it as a JSON string.

PARAMETER DESCRIPTION
obj

The object to convert and serialize.

TYPE: Any

indent

Indentation level for pretty-printing.

TYPE: Optional[int] DEFAULT: None

RETURNS DESCRIPTION
str

Validated JSON string.

TYPE: str

Source code in src\xpyrment\core\serialization.py
def serialize_to_json(obj: Any, indent: Optional[int] = None) -> str:
    """Converts a nested object recursively to serializable format and dumps it as a JSON string.

    Args:
        obj (Any): The object to convert and serialize.
        indent (Optional[int]): Indentation level for pretty-printing.

    Returns:
        str: Validated JSON string.
    """
    return json.dumps(make_serializable(obj), indent=indent)

configure_telemetry

configure_telemetry(
    level: int = INFO, stream: Any = stdout
) -> Logger

Configures and registers the centralized JSON telemetry logger handlers.

PARAMETER DESCRIPTION
level

Log filter level threshold (e.g. logging.INFO). Defaults to logging.INFO.

TYPE: int DEFAULT: INFO

stream

Output stream target. Defaults to sys.stdout.

TYPE: Any DEFAULT: stdout

RETURNS DESCRIPTION
Logger

Configured telemetry Logger instance.

TYPE: Logger

Source code in src\xpyrment\core\telemetry.py
def configure_telemetry(level: int = logging.INFO, stream: Any = sys.stdout) -> logging.Logger:
    """Configures and registers the centralized JSON telemetry logger handlers.

    Args:
        level (int): Log filter level threshold (e.g. logging.INFO). Defaults to logging.INFO.
        stream (Any): Output stream target. Defaults to sys.stdout.

    Returns:
        logging.Logger: Configured telemetry Logger instance.
    """
    logger = logging.getLogger("xpyrment.telemetry")
    logger.setLevel(level)
    logger.propagate = False

    # Remove existing handlers so repeated calls do not emit duplicate log lines
    for handler in list(logger.handlers):
        logger.removeHandler(handler)

    handler = logging.StreamHandler(stream)
    formatter = JSONFormatter()
    handler.setFormatter(formatter)
    logger.addHandler(handler)

    return logger
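
The handler setup above can be exercised end to end with a small standalone sketch. The `JSONFormatter` class is not shown in this section, so the example uses a hypothetical `SketchJSONFormatter` that merges the `extra_fields` dictionary (the key used by the profiler's log calls) into the output. The logger name `sketch.telemetry` is also an assumption.

```python
import io
import json
import logging


class SketchJSONFormatter(logging.Formatter):
    """Hypothetical formatter that renders each record as one JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Keys passed through `extra=` become attributes on the record.
        payload.update(getattr(record, "extra_fields", {}))
        return json.dumps(payload)


buffer = io.StringIO()  # stands in for sys.stdout so the output can be inspected
logger = logging.getLogger("sketch.telemetry")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep records out of the root logger

# Clear existing handlers so re-running the setup does not duplicate output.
for old_handler in list(logger.handlers):
    logger.removeHandler(old_handler)

handler = logging.StreamHandler(buffer)
handler.setFormatter(SketchJSONFormatter())
logger.addHandler(handler)

logger.info(
    "Profile completed",
    extra={"extra_fields": {"stage": "demo", "status": "SUCCESS"}},
)
entry = json.loads(buffer.getvalue())
```

Writing to an in-memory stream, as here, is also a convenient way to assert on telemetry output in unit tests.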

get_logger

get_logger() -> Logger

Returns the centralized JSON telemetry logger instance.

Ensures the logger is fully configured with default handlers if unconfigured.

RETURNS DESCRIPTION
Logger

Active telemetry Logger.

TYPE: Logger

Source code in src\xpyrment\core\telemetry.py
def get_logger() -> logging.Logger:
    """Returns the centralized JSON telemetry logger instance.

    Ensures the logger is fully configured with default handlers if unconfigured.

    Returns:
        logging.Logger: Active telemetry Logger.
    """
    logger = logging.getLogger("xpyrment.telemetry")
    if not logger.handlers:
        configure_telemetry()
    return logger