LHS
lhs
Latin Hypercube Sampling (LHS) matrices for space-filling Design of Experiments (DoE).
This module provides the LatinHypercubeDesign class, which constructs Latin Hypercube Sampling (LHS) matrices.
LHS is a space-filling design technique widely used in computer experiments, high-dimensional simulation
studies (such as Monte Carlo models), and black-box software testing to explore large design spaces with few runs.
| CLASS | DESCRIPTION |
|---|---|
| LatinHypercubeDesign | Generates Latin Hypercube Sampling designs for multidimensional factors. |
LatinHypercubeDesign
Bases: DesignMatrix
Generates Latin Hypercube Sampling designs for multidimensional factors.
LHS is a generalization of a Latin Square to an arbitrary number of dimensions. It ensures that the samples are evenly stratified along every individual factor, which reduces the accidental clustering of points that can occur with simple random sampling.
Mathematical Definition and Properties
Let \(N\) be the number of target samples (runs) and \(k\) be the number of factors. A Latin Hypercube design is represented by an \(N \times k\) matrix where:

1. Projecting the \(N\) sample points onto any single factor dimension yields exactly one sample in each of \(N\) equally probable intervals.
2. Specifically, the coded range \([0, 1)\) of each factor is divided into \(N\) non-overlapping intervals of equal probability:
   $$ I_j = \left[ \frac{j-1}{N}, \frac{j}{N} \right) \quad \text{for } j \in \{1, 2, \dots, N\} $$
3. Within each interval \(I_j\), one point is sampled:
   $$ x_j = \frac{j-1 + U_j}{N} $$
   where \(U_j \sim \text{Uniform}(0, 1)\) for random placement, or \(U_j = 0.5\) for midpoint placement.
4. The sampled values for the \(k\) dimensions are paired using independent, random permutations of the set \(\{1, 2, \dots, N\}\) for each column.
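The properties above can be sketched in a few lines of plain Python. This is a minimal illustration that uses only the standard library and works in coded units on \([0, 1)\); it is not the module's implementation, and the function name `lhs_unit` and its arguments are hypothetical.

```python
import random


def lhs_unit(n_samples, n_factors, seed=42, midpoint=False):
    """Return an n_samples x n_factors Latin Hypercube in [0, 1) as a list of rows.

    Hypothetical helper for illustration only; not part of the documented API.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_factors):
        # Independent random permutation of the interval indices 1..N.
        strata = list(range(1, n_samples + 1))
        rng.shuffle(strata)
        # One point per interval: x = (j - 1 + U) / N, with U = 0.5 at the midpoint.
        columns.append(
            [(j - 1 + (0.5 if midpoint else rng.random())) / n_samples for j in strata]
        )
    # Transpose the columns into rows (one row per run).
    return [list(row) for row in zip(*columns)]


design = lhs_unit(n_samples=5, n_factors=2)
for row in design:
    print([round(x, 3) for x in row])
```

Sorting any single column of `design` shows one value in each of the five intervals of width 0.2, which is the defining Latin property.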
Maximin LHS Space-Filling Variant
Standard random LHS can still yield sample points clustered close together in multidimensional space. To prevent this, Maximin LHS optimizes the column permutations \(\Pi\) to maximize the minimum Euclidean distance between any two sample points \(x_a\) and \(x_b\) (rows of the design matrix):

$$ \max_{\Pi} \min_{a \neq b} \lVert x_a - x_b \rVert_2 $$

This pushes the points as far apart as possible, so they fill the multidimensional space more evenly.
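The maximin criterion can be illustrated with a small sketch. The source does not say how the permutations are optimized, so a simple best-of-many random search stands in for the optimizer here; all function names and the `n_candidates` parameter are hypothetical.

```python
import itertools
import math
import random


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two rows."""
    return min(math.dist(a, b) for a, b in itertools.combinations(points, 2))


def random_lhs(n_samples, n_factors, rng):
    """Midpoint LHS in [0, 1): one shuffled column of stratum centres per factor."""
    columns = []
    for _ in range(n_factors):
        centres = [(j + 0.5) / n_samples for j in range(n_samples)]
        rng.shuffle(centres)
        columns.append(centres)
    return [list(row) for row in zip(*columns)]


def maximin_lhs(n_samples, n_factors, n_candidates=200, seed=42):
    """Keep the candidate whose closest pair of points is farthest apart.

    Random restarts are an assumed, simple search strategy, not the module's.
    """
    rng = random.Random(seed)
    candidates = (random_lhs(n_samples, n_factors, rng) for _ in range(n_candidates))
    return max(candidates, key=min_pairwise_distance)


single = random_lhs(8, 2, random.Random(0))
best = maximin_lhs(8, 2)
print(min_pairwise_distance(single), min_pairwise_distance(best))
```

Every candidate is still a valid Latin Hypercube, so the search only changes how the columns are paired, never the per-factor stratification.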
Pseudocode for the Algorithm
function generate_lhs(factors, N):
    1. Determine k = number of factors.
    2. Initialize N x k matrix LHS_coded.
    3. For each column j from 1 to k:
        a. Create array of interval indices: [1, 2, ..., N].
        b. Shuffle the interval indices randomly (permutation).
        c. For each row i:
            Draw random uniform noise U ~ Uniform(0, 1).
            Compute coded value: cell = (shuffled_index[i] - 1 + U) / N.
            LHS_coded[i, j] = cell.
    4. If Maximin optimization is enabled:
        Iterate permutations to maximize min_distance(row_a, row_b).
    5. Map the coded values in [0, 1] to the physical bounds specified in factors.
    6. Return DataFrame.
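Step 5 of the pseudocode is a linear rescaling, \(x_{\text{phys}} = \text{lower} + u \cdot (\text{upper} - \text{lower})\) for a coded value \(u\). A minimal sketch follows, assuming `factors` maps each label to a `(lower, upper)` pair; that layout, the helper name `scale_to_bounds`, and the example factors are all assumptions made for illustration.

```python
def scale_to_bounds(coded_rows, factors):
    """Map coded values in [0, 1] to physical units, column by column.

    `factors` maps each label to a (lower, upper) pair; this layout is assumed.
    """
    labels = list(factors)
    scaled = []
    for row in coded_rows:
        scaled.append(
            {
                label: factors[label][0] + u * (factors[label][1] - factors[label][0])
                for label, u in zip(labels, row)
            }
        )
    return scaled


factors = {"temperature": (20.0, 80.0), "pressure": (1.0, 5.0)}  # hypothetical factors
coded = [[0.125, 0.875], [0.625, 0.375], [0.375, 0.125], [0.875, 0.625]]
for run in scale_to_bounds(coded, factors):
    print(run)
```

Because the mapping is monotonic in each column, the physical design keeps one point per stratum of each factor's range.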
| ATTRIBUTE | DESCRIPTION |
|---|---|
| num_samples | The exact number of samples (runs) to draw. |
Examples:
Example
| PARAMETER | DESCRIPTION |
|---|---|
| factors | Mapping of factor labels to their lower and upper physical boundaries. |
| num_samples | The target number of samples. |
| seed | Random seed for reproducibility. Defaults to 42. |
| METHOD | DESCRIPTION |
|---|---|
| generate | Generates the Latin Hypercube design matrix. |
Source code in src\xpyrment\design\doe\lhs.py
generate
Generates the Latin Hypercube design matrix.
Partitions interval ranges, draws independent uniform perturbations, shuffles columns, optionally optimizes minimum spacing, and maps results to physical bounds.
| RETURNS | DESCRIPTION |
|---|---|
| DataFrame | pd.DataFrame: A pandas DataFrame containing the LHS matrix. |
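A design in physical units can be sanity-checked against the Latin property with a short helper. This is a sketch under stated assumptions: the name `is_latin_hypercube` is hypothetical, rows are plain sequences rather than a DataFrame, and every value is assumed to lie within its factor's bounds.

```python
def is_latin_hypercube(rows, bounds):
    """Return True if every column has exactly one value per equal-width stratum.

    `rows` is a list of equal-length sequences (one per run) and `bounds` is a
    list of (lower, upper) pairs, one per column. Both layouts are assumptions.
    """
    n = len(rows)
    for col, (lower, upper) in enumerate(bounds):
        strata = set()
        for row in rows:
            u = (row[col] - lower) / (upper - lower)  # back to coded units
            # Clamp so a value exactly on the upper bound lands in the last stratum.
            strata.add(min(int(u * n), n - 1))
        if len(strata) != n:
            return False
    return True


good = [[25.0, 4.5], [55.0, 2.0], [40.0, 1.5], [75.0, 3.5]]
bad = [[25.0, 4.5], [26.0, 2.0], [40.0, 1.5], [75.0, 3.5]]  # two runs share a stratum
bounds = [(20.0, 80.0), (1.0, 5.0)]
print(is_latin_hypercube(good, bounds), is_latin_hypercube(bad, bounds))  # True False
```

For a pandas DataFrame `df`, `df.to_numpy().tolist()` yields rows in the layout this helper expects.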