Methodology & Advanced Topics
This document provides a technical overview of the advanced features and mathematical concepts underlying GeoLift, focusing on how SparseSC is implemented and operationalised.
For a more rigorous econometric exposition intended for Data Scientists and Statisticians, see the Mathematical Formalism of Synthetic Controls.
SparseSC Mathematical Foundation
The standard Synthetic Control Method minimises pre-treatment prediction error by constructing a convex combination of donor units.
GeoLift uses the SparseSC enhancement, which improves upon traditional synthetic control by jointly optimising:
- Feature weights (V-matrix): A diagonal matrix determining which pre-treatment features (or time periods) matter most.
- Unit weights (W-matrix): A vector of weights for combining control units.
The optimisation problem is:
$$\min_{V,W} \|X_1 - X_0 W\|_V^2 + \lambda \|W\|_1$$

where:
- $X_1 \in \mathbb{R}^{K \times 1}$ represents $K$ pre-treatment features for the treated unit
- $X_0 \in \mathbb{R}^{K \times N}$ represents $K$ pre-treatment features for $N$ control units
- $W \in \mathbb{R}^{N \times 1}$ is the vector of unit weights (subject to simplex constraints: $w_i \geq 0$, $\sum w_i = 1$)
- $V \in \mathbb{R}^{K \times K}$ is a diagonal matrix of feature weights
- $\lambda > 0$ is the regularisation parameter controlling sparsity
- $\|x\|_V^2 = x^T V x$ is the V-weighted squared norm
- $\|W\|_1 = \sum_{i=1}^{N} |w_i|$ is the L1 norm promoting sparsity
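As a minimal sketch (not the SparseSC library's actual API), the objective above can be evaluated for fixed $V$ and $W$ with NumPy; the real solver optimises both jointly:

```python
import numpy as np

def sparse_sc_objective(X1, X0, W, v_diag, lam):
    """Evaluate the SparseSC objective for fixed V and W (illustrative).

    X1: (K,) treated-unit features; X0: (K, N) donor features;
    W: (N,) unit weights; v_diag: (K,) diagonal of V; lam: L1 strength.
    """
    resid = X1 - X0 @ W                  # pre-treatment prediction error
    fit = resid @ (v_diag * resid)       # ||X1 - X0 W||_V^2 with diagonal V
    return fit + lam * np.abs(W).sum()   # sparsity-promoting L1 penalty

# Toy check: with all weight on a donor identical to the treated unit,
# the fit term vanishes and only the L1 penalty remains.
X1 = np.array([1.0, 2.0, 3.0])
X0 = np.array([[1.0, 5.0], [2.0, 4.0], [3.0, 9.0]])
W = np.array([1.0, 0.0])
print(sparse_sc_objective(X1, X0, W, np.ones(3), lam=0.1))  # → 0.1
```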
By explicitly penalising non-zero weights through the L1 penalty, SparseSC typically selects a very small, interpretable number of donor units. This prevents overfitting to pre-treatment noise and avoids the “interpolation bias” that occurs when relying on a large number of weakly correlated donors.
Statistical Inference in GeoLift
Because regional marketing experiments typically involve a very small number of treated units (often just one), asymptotic statistical inference is invalid. GeoLift uses non-parametric methods to compute p-values and confidence intervals.
Placebo Inference (Permutation Tests)
The primary method for significance testing is in-space placebo permutations.
- The algorithm iteratively reassigns the “treatment” status to every untreated unit in the donor pool.
- It fits a completely new SparseSC model for each of these placebo units.
- It calculates the pseudo-treatment effect (the prediction error in the post-treatment period) for each placebo.
- The exact p-value is calculated as the proportion of placebo effects that are at least as extreme as the actual estimated treatment effect.
Because the reference distribution is built from the data itself, this approach yields valid finite-sample Type I error control under the sharp null hypothesis of no effect, without relying on distributional assumptions.
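The final step of the procedure can be sketched as follows, assuming the pseudo-treatment effects from the placebo fits have already been computed:

```python
import numpy as np

def placebo_p_value(actual_effect, placebo_effects):
    """Share of effects (placebos plus the real one) at least as extreme,
    in absolute value, as the observed effect -- an exact permutation
    p-value under the sharp null of no treatment effect."""
    effects = np.abs(np.append(placebo_effects, actual_effect))
    return float(np.mean(effects >= abs(actual_effect)))

# Hypothetical pseudo-effects from 9 placebo SparseSC fits
placebos = [0.4, -1.1, 0.7, 2.0, -0.3, 1.5, -2.2, 0.9, 0.1]
print(placebo_p_value(5.0, placebos))  # → 0.1 (1 of 10 effects as extreme)
```

Note that the smallest attainable p-value is 1/(N+1), so a small donor pool mechanically limits how significant any result can appear.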
Advanced Configuration & Tuning
Regularisation Selection (Cross-Validation)
GeoLift selects the optimal $\lambda$ regularisation parameter using out-of-sample cross-validation over the pre-treatment period.
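The idea can be illustrated with a small holdout search. This is not GeoLift's internal routine; to keep the sketch closed-form, a ridge (L2) penalty stands in for SparseSC's L1 term, but the selection logic (fit on early pre-treatment periods, score on the held-out tail) is the same:

```python
import numpy as np

def fit_weights(X, y, lam):
    """Closed-form penalised least squares (ridge stand-in for L1)."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def select_lambda(y_treated, Y_donors, lambdas, holdout=4):
    """Fit on the early pre-treatment periods, score each candidate
    lambda on the held-out tail, keep the lowest-error lambda."""
    T = len(y_treated)
    tr, te = slice(0, T - holdout), slice(T - holdout, T)
    errors = {}
    for lam in lambdas:
        w = fit_weights(Y_donors[tr], y_treated[tr], lam)
        errors[lam] = np.mean((y_treated[te] - Y_donors[te] @ w) ** 2)
    return min(errors, key=errors.get)

# Toy data: 12 pre-treatment periods, 3 donors; treated tracks donor 0
rng = np.random.default_rng(0)
Y_donors = rng.normal(size=(12, 3))
y_treated = Y_donors[:, 0] + 0.05 * rng.normal(size=12)
best = select_lambda(y_treated, Y_donors, [0.01, 0.1, 1.0])
print(best in [0.01, 0.1, 1.0])  # → True
```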
If the default automatic search does not yield a good fit, the configuration allows for direct intervention:
- `sparse_sc_fast_estimation`: When set to `true`, the algorithm uses an approximation for the V-matrix (feature weights) rather than jointly optimising V and W. This speeds up computation significantly with minimal loss of accuracy in datasets where time-series lags are highly collinear.
- `sparse_sc_model_type`: Setting this to `retrospective` ensures the model adheres strictly to the pre/post treatment boundary defined in the configuration.
Donor Pool Constraints
By default, GeoLift includes all non-treated units in the donor pool. However, if business logic dictates that certain regions should not be used (e.g., they experienced a supply chain shock), they should be excluded from the input CSV entirely prior to analysis.
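A simple pandas filter handles this pre-processing step. The column name `location` and the region names are assumptions; adjust them to match your data:

```python
import pandas as pd

def exclude_regions(df, excluded, location_col="location"):
    """Drop rows for regions that business logic rules out of the donor
    pool (case-insensitive match on the location column)."""
    mask = df[location_col].str.lower().isin({r.lower() for r in excluded})
    return df[~mask].copy()

# Hypothetical usage: filter the data before handing the CSV to GeoLift
df = pd.DataFrame({"location": ["chicago", "austin", "denver"],
                   "Y": [100, 90, 80]})
clean = exclude_regions(df, {"Austin"})   # region hit by the supply shock
print(clean["location"].tolist())  # → ['chicago', 'denver']
```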
The recipes/donor_evaluator.py tool helps systematically identify the highest quality donors prior to running inference.
Performance Considerations
GeoLift relies heavily on matrix operations during the cross-validation and optimisation phases. The following options are supported to improve execution speed:
- Power Analysis Parallelism: Set `jobs: -1` (or pass `--jobs -1` via CLI) to use all available CPU cores when simulating hundreds of treatment effects.
- GPU Acceleration: The power calculator supports GPU acceleration. Pass `--use-gpu` to the CLI. Note that this requires `cupy` to be installed and matched to your system's CUDA version (e.g., `pip install cupy-cuda12x`).
- Inference Computation: The main inference step runs on CPU. To avoid thread oversubscription when running on multi-core servers, set environment variables such as `OMP_NUM_THREADS` or `MKL_NUM_THREADS` to a sensible value relative to your total core count.
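If you drive GeoLift from a Python script rather than the shell, the same caps can be set in-process. They must be applied before NumPy (or any BLAS-backed library) is first imported, because the thread pools are sized at initialisation; the value 8 is a hypothetical starting point for a 32-core server running several workers:

```python
import os

# Cap BLAS/OpenMP thread pools BEFORE importing numpy/scipy -- the pools
# are sized when those libraries initialise, so later changes may be
# ignored. 8 threads per pool is an illustrative value, not a default.
os.environ["OMP_NUM_THREADS"] = "8"
os.environ["MKL_NUM_THREADS"] = "8"

import numpy as np  # now inherits the capped thread settings
```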
Need More Help?
- Implementation Details: See Mathematical Formalism
- Step-by-step guidance: Check How-To Guides
- Technical specifications: Review the Configuration Reference