API Reference: Conformal Prediction¶
Distribution-free prediction intervals with coverage guarantees.
⚠️ Time Series Caveat: Conformal prediction provides finite-sample coverage guarantees only under exchangeability (i.i.d. data is the usual special case). For time series, coverage is only approximate because temporal dependence violates exchangeability. Use AdaptiveConformalPredictor when distribution shift is expected, and expect empirical coverage to fall within about ±5 percentage points of nominal for well-behaved series.
When to Use¶
graph TD
A[Need prediction intervals?] --> B{Data stationarity?}
B -->|Stationary| C[SplitConformalPredictor]
B -->|Non-stationary/drift| D[AdaptiveConformalPredictor]
B -->|Unknown| E[Start with Adaptive]
C --> F{Coverage sufficient?}
D --> F
F -->|Yes| G[Deploy]
F -->|No, undercoverage| H[Decrease alpha or use Adaptive]
F -->|Regime-dependent| I[Check per-regime coverage]
Method Comparison¶
| Method | Best For | Tradeoff |
|---|---|---|
| SplitConformalPredictor | Stationary data | Tighter intervals, coverage guarantee |
| AdaptiveConformalPredictor | Regime shifts, drift | Adapts to changes, no finite-sample guarantee |
| BootstrapUncertainty | Model uncertainty | Computationally expensive |
Common Mistakes¶
1. Using Split conformal during regime changes
   - Fixed quantile fails when volatility spikes
   - Example 15 shows Adaptive maintaining coverage during crashes
2. Evaluating coverage on calibration set
   - Coverage must be evaluated on holdout only
   - walk_forward_conformal() handles this correctly
3. Ignoring undercoverage_warning
   - Check CoverageDiagnostics.undercoverage_warning
   - Persistent undercoverage indicates distribution shift
See Also: Uncertainty Tutorial, Example 05, Example 15
Data Classes¶
PredictionInterval¶
Container for prediction intervals.
@dataclass
class PredictionInterval:
    point: np.ndarray   # Point predictions
    lower: np.ndarray   # Lower bounds
    upper: np.ndarray   # Upper bounds
    confidence: float   # Nominal confidence (1 - alpha)
    method: str         # Method used
Properties:
| Property | Type | Description |
|---|---|---|
| | | Interval width at each point |
| mean_width | float | Mean interval width |
Methods:
coverage(actuals) -> float: Empirical coverage
to_dict() -> dict: Convert to dictionary
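As a rough illustration of what empirical coverage and mean width measure, here is a minimal sketch in plain Python. The helper names empirical_coverage and mean_interval_width are hypothetical and are not part of temporalcv; treating the bounds as inclusive is also an assumption.

```python
def empirical_coverage(lower, upper, actuals):
    """Fraction of actuals inside [lower, upper] (bounds assumed inclusive)."""
    hits = [lo <= y <= hi for lo, hi, y in zip(lower, upper, actuals)]
    return sum(hits) / len(hits)


def mean_interval_width(lower, upper):
    """Average of (upper - lower) across all points."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


lower = [0.0, 1.0, 2.0, 3.0]
upper = [1.0, 2.0, 3.0, 4.0]
actuals = [0.5, 2.5, 2.0, 3.9]  # the second value misses its interval

coverage = empirical_coverage(lower, upper, actuals)
width = mean_interval_width(lower, upper)
print(f"Coverage: {coverage:.1%}, mean width: {width:.2f}")
```

Coverage and width should be read together: wider intervals raise coverage at the cost of sharpness.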
Classes¶
SplitConformalPredictor¶
Split Conformal Prediction for regression.
class SplitConformalPredictor:
    def __init__(self, alpha: float = 0.05)
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| alpha | float | 0.05 | Miscoverage rate (0.05 = 95% intervals) |
Attributes:
quantile_: Calibrated quantile (after calibrate())
Warning: Assumes exchangeability. For time series, consider AdaptiveConformalPredictor.
Methods¶
calibrate(predictions, actuals)¶
Calibrate on held-out data.
def calibrate(
    self,
    predictions: np.ndarray,
    actuals: np.ndarray,
) -> SplitConformalPredictor
Requires: At least 10 calibration samples
Returns: self (for chaining)
predict_interval(predictions)¶
Construct prediction intervals.
def predict_interval(
    self,
    predictions: np.ndarray,
) -> PredictionInterval
Returns: PredictionInterval with coverage guarantee
Example:
from temporalcv import SplitConformalPredictor
conformal = SplitConformalPredictor(alpha=0.10) # 90% intervals
conformal.calibrate(cal_preds, cal_actuals)
intervals = conformal.predict_interval(test_preds)
print(f"Coverage: {intervals.coverage(test_actuals):.1%}")
print(f"Mean width: {intervals.mean_width:.4f}")
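The calibrate and predict_interval steps can be sketched as follows. This assumes absolute residuals as the nonconformity score and the standard finite-sample rank ceil((n + 1)(1 - alpha)); the helper names are hypothetical and temporalcv's internals may differ. It is written in plain Python so it runs without dependencies.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of nonconformity scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return sorted(scores)[rank - 1]


def split_conformal_interval(cal_preds, cal_actuals, test_preds, alpha):
    """Symmetric intervals: prediction +/- calibrated quantile of |residual|."""
    scores = [abs(y - p) for p, y in zip(cal_preds, cal_actuals)]
    q = conformal_quantile(scores, alpha)
    return [(p - q, p + q) for p in test_preds]


cal_preds = [0.0] * 10
cal_actuals = [1.0, -2.0, 3.0, -4.0, 5.0, -6.0, 7.0, -8.0, 9.0, -10.0]
intervals = split_conformal_interval(cal_preds, cal_actuals, [100.0], alpha=0.2)
print(intervals)
```

Under this particular correction, a small calibration set combined with a small alpha yields an unbounded interval (for example n = 10 with alpha = 0.05), which is one reason to calibrate on more than the minimum number of samples.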
AdaptiveConformalPredictor¶
Adaptive Conformal Inference for time series with distribution shift.
class AdaptiveConformalPredictor:
    def __init__(
        self,
        alpha: float = 0.05,
        gamma: float = 0.1,
    )
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| alpha | float | 0.05 | Target miscoverage rate |
| gamma | float | 0.1 | Adaptation rate (higher = faster) |
Attributes:
quantile_history: List of adaptive quantiles
current_quantile: Current quantile
Methods¶
initialize(initial_predictions, initial_actuals)¶
Initialize with calibration data.
update(prediction, actual)¶
Update quantile based on coverage feedback.
def update(self, prediction: float, actual: float) -> float
Returns: Updated quantile
predict_interval(prediction)¶
Construct interval for single prediction.
def predict_interval(self, prediction: float) -> Tuple[float, float]
Returns: (lower, upper) tuple
Example:
from temporalcv import AdaptiveConformalPredictor
adaptive = AdaptiveConformalPredictor(alpha=0.10, gamma=0.05)
adaptive.initialize(cal_preds, cal_actuals)
for pred, actual in zip(test_preds, test_actuals):
    lower, upper = adaptive.predict_interval(pred)
    adaptive.update(pred, actual)
    print(f"Interval: [{lower:.3f}, {upper:.3f}]")
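To make the coverage-feedback idea concrete, here is a minimal sketch of the textbook adaptive conformal inference update, in which the working miscoverage level moves by gamma times (target alpha minus the miss indicator). This is an assumption about the general technique, not a description of how temporalcv implements update(); aci_step, empirical_quantile, and run_aci are hypothetical names.

```python
import math


def aci_step(alpha_t, target_alpha, gamma, covered):
    """One update: a miss lowers alpha_t (wider intervals), a hit raises it."""
    err = 0.0 if covered else 1.0
    return alpha_t + gamma * (target_alpha - err)


def empirical_quantile(scores, level):
    """Smallest score with at least `level` of the scores at or below it."""
    if level >= 1.0:
        return math.inf
    if level <= 0.0:
        return 0.0
    ordered = sorted(scores)
    rank = max(1, math.ceil(level * len(ordered)))
    return ordered[rank - 1]


def run_aci(cal_scores, preds, actuals, alpha, gamma):
    """Predict each interval first, then update alpha_t with the realised outcome."""
    alpha_t = alpha
    intervals = []
    for p, y in zip(preds, actuals):
        q = empirical_quantile(cal_scores, 1.0 - alpha_t)
        lo, hi = p - q, p + q
        intervals.append((lo, hi))
        alpha_t = aci_step(alpha_t, alpha, gamma, lo <= y <= hi)
    return intervals, alpha_t


cal_scores = [float(i) for i in range(1, 11)]
# The first actual is far outside the interval, so the second interval widens.
intervals, final_alpha = run_aci(cal_scores, [0.0, 0.0], [100.0, 0.0], alpha=0.25, gamma=0.1)
print(intervals, final_alpha)
```

A larger gamma reacts faster to misses but makes interval widths more volatile, which matches the "higher = faster" description of the adaptation rate.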
BootstrapUncertainty¶
Bootstrap-based prediction intervals.
class BootstrapUncertainty:
    def __init__(
        self,
        n_bootstrap: int = 100,
        alpha: float = 0.05,
        random_state: int = 42,
    )
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| n_bootstrap | int | 100 | Bootstrap samples |
| alpha | float | 0.05 | Miscoverage rate |
| random_state | int | 42 | Random seed |
Methods¶
fit(predictions, actuals)¶
Fit bootstrap estimator on residuals.
predict_interval(predictions)¶
Construct bootstrap prediction intervals.
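One common way to turn residuals into bootstrap intervals is sketched below: resample the residuals with replacement, take the alpha/2 and 1 - alpha/2 quantiles of each resample, and average them into lower and upper offsets. This is an illustrative assumption only; the source does not describe the exact resampling scheme BootstrapUncertainty uses, and the function names here are hypothetical.

```python
import random


def quantile(values, level):
    """Simple order-statistic quantile (no interpolation)."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, int(level * len(ordered))))
    return ordered[idx]


def bootstrap_offsets(residuals, n_bootstrap=100, alpha=0.05, random_state=42):
    """Average lower/upper residual quantiles over bootstrap resamples."""
    rng = random.Random(random_state)  # seeded for reproducibility
    lows, highs = [], []
    for _ in range(n_bootstrap):
        sample = rng.choices(residuals, k=len(residuals))
        lows.append(quantile(sample, alpha / 2))
        highs.append(quantile(sample, 1 - alpha / 2))
    return sum(lows) / n_bootstrap, sum(highs) / n_bootstrap


# Residuals are assumed to be actual minus prediction on held-out data.
residuals = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
low, high = bootstrap_offsets(residuals)
prediction = 10.0
print(f"Interval: [{prediction + low:.3f}, {prediction + high:.3f}]")
```

The cost grows linearly with n_bootstrap, which is the computational tradeoff noted in the comparison tables.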
Functions¶
evaluate_interval_quality¶
Evaluate prediction interval quality.
def evaluate_interval_quality(
    intervals: PredictionInterval,
    actuals: np.ndarray,
) -> dict[str, object]
Returns dict with:
| Key | Description |
|---|---|
| coverage | Empirical coverage |
| | Nominal coverage (1 - α) |
| | coverage - target |
| | Average interval width |
| interval_score | Proper scoring rule (lower = better) |
| | Coverage difference by prediction magnitude |
Interval Score:
IS = width + (2/α) × (lower - y) × I(y < lower) + (2/α) × (y - upper) × I(y > upper)
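The formula above can be sketched in plain Python as follows. Averaging the per-point scores into a single number is an assumption about how the metric is aggregated, and the function name is hypothetical.

```python
def interval_score(lower, upper, actuals, alpha):
    """Mean interval score: width plus a (2/alpha)-scaled penalty for misses."""
    total = 0.0
    for lo, hi, y in zip(lower, upper, actuals):
        score = hi - lo  # width term
        if y < lo:
            score += (2.0 / alpha) * (lo - y)  # actual fell below the interval
        elif y > hi:
            score += (2.0 / alpha) * (y - hi)  # actual fell above the interval
        total += score
    return total / len(actuals)


lower = [0.0, 0.0, 0.0]
upper = [2.0, 2.0, 2.0]
actuals = [1.0, -1.0, 3.0]  # one hit, one miss below, one miss above
score = interval_score(lower, upper, actuals, alpha=0.5)
print(f"Interval score: {score:.4f}")
```

Because the penalty scales with 2/α, a miss costs more at higher confidence levels, so the score rewards narrow intervals only when they still cover.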
walk_forward_conformal¶
Apply conformal prediction to walk-forward results.
def walk_forward_conformal(
    predictions: np.ndarray,
    actuals: np.ndarray,
    calibration_fraction: float = 0.3,
    alpha: float = 0.05,
) -> Tuple[PredictionInterval, dict[str, object]]
Parameters:
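| Parameter | Type | Default | Description |
|---|---|---|---|
| predictions | np.ndarray | required | Walk-forward predictions |
| actuals | np.ndarray | required | Corresponding actuals |
| calibration_fraction | float | 0.3 | Fraction for calibration |
| alpha | float | 0.05 | Miscoverage rate |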
Returns: (intervals, quality_metrics)
CRITICAL: Coverage computed ONLY on post-calibration holdout.
Example:
from temporalcv import walk_forward_conformal
intervals, quality = walk_forward_conformal(
    predictions=all_preds,
    actuals=all_actuals,
    calibration_fraction=0.3,
    alpha=0.10,
)
print(f"Calibration samples: {quality['calibration_size']}")
print(f"Holdout coverage: {quality['coverage']:.1%}")
print(f"Interval score: {quality['interval_score']:.4f}")
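The holdout-only rule can be illustrated with a minimal sketch: split chronologically, calibrate on the first block, and measure coverage on the remainder only. This assumes a simple leading-fraction split and absolute-residual scores; holdout_conformal and the holdout_size and quantile keys are hypothetical, and the library's own procedure may differ.

```python
import math


def holdout_conformal(predictions, actuals, calibration_fraction=0.3, alpha=0.05):
    """Calibrate on the earliest fraction; report coverage on the rest only."""
    n_cal = int(len(predictions) * calibration_fraction)
    cal_scores = sorted(
        abs(y - p) for p, y in zip(predictions[:n_cal], actuals[:n_cal])
    )
    rank = math.ceil((n_cal + 1) * (1 - alpha))
    q = cal_scores[rank - 1] if rank <= n_cal else math.inf

    # Calibration points never enter the coverage estimate.
    holdout = list(zip(predictions[n_cal:], actuals[n_cal:]))
    covered = sum(p - q <= y <= p + q for p, y in holdout)
    return {
        "calibration_size": n_cal,
        "holdout_size": len(holdout),
        "quantile": q,
        "coverage": covered / len(holdout),
    }


predictions = [0.0] * 20
actuals = [1.0, -2.0, 3.0, -4.0, 5.0, -6.0, 7.0, -8.0, 9.0, -10.0,
           0.0, 5.0, -9.0, 9.5, 20.0, 1.0, 2.0, 3.0, 4.0, -30.0]
result = holdout_conformal(predictions, actuals, calibration_fraction=0.5, alpha=0.2)
print(result)
```

Scoring coverage on the calibration block would be optimistic by construction, since the quantile was chosen to cover those very points.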
Method Comparison¶
| Method | Pros | Cons |
|---|---|---|
| Split Conformal | Coverage guarantee, simple | Needs separate calibration set |
| Adaptive Conformal | Handles drift | No finite-sample guarantee |
| Bootstrap | No parametric distributional assumptions | Computationally expensive |
Coverage Diagnostics¶
CoverageDiagnostics¶
Detailed coverage diagnostics for conformal prediction intervals.
@dataclass
class CoverageDiagnostics:
    overall_coverage: float                          # Empirical coverage
    target_coverage: float                           # Nominal coverage (1 - α)
    coverage_gap: float                              # target - empirical
    undercoverage_warning: bool                      # True if gap > threshold
    coverage_by_window: Dict[str, float]             # Window-based coverage
    coverage_by_regime: Optional[Dict[str, float]]   # Per-regime coverage
    n_observations: int                              # Total observations
compute_coverage_diagnostics¶
Compute detailed coverage diagnostics for prediction intervals.
def compute_coverage_diagnostics(
    intervals: PredictionInterval,
    actuals: np.ndarray,
    *,
    target_coverage: Optional[float] = None,
    window_size: int = 50,
    regimes: Optional[np.ndarray] = None,
    undercoverage_threshold: float = 0.05,
) -> CoverageDiagnostics
Parameters:
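| Parameter | Type | Default | Description |
|---|---|---|---|
| intervals | PredictionInterval | required | Intervals to evaluate |
| actuals | np.ndarray | required | Actual values |
| target_coverage | Optional[float] | None | Target level (uses intervals.confidence if None) |
| window_size | int | 50 | Rolling window size for time-based analysis |
| regimes | Optional[np.ndarray] | None | Regime labels for stratified coverage |
| undercoverage_threshold | float | 0.05 | Warning threshold for undercoverage |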
Returns: CoverageDiagnostics with detailed coverage analysis
Example:
from temporalcv import (
    SplitConformalPredictor,
    compute_coverage_diagnostics,
)
conformal = SplitConformalPredictor(alpha=0.05)
conformal.calibrate(cal_preds, cal_actuals)
intervals = conformal.predict_interval(test_preds)
diag = compute_coverage_diagnostics(
    intervals,
    test_actuals,
    regimes=volatility_regime,  # Optional regime stratification
)
print(f"Coverage: {diag.overall_coverage:.1%}")
print(f"Target: {diag.target_coverage:.1%}")
print(f"Gap: {diag.coverage_gap:+.1%}")
if diag.undercoverage_warning:
    print("WARNING: Coverage significantly below target!")
if diag.coverage_by_regime:
    for regime, cov in diag.coverage_by_regime.items():
        print(f" {regime}: {cov:.1%}")
Use Cases:
Production monitoring for coverage degradation
Identifying time periods with poor coverage
Regime-specific performance analysis
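For intuition about what these diagnostics compute, here is a minimal sketch in plain Python. It assumes non-overlapping windows labelled by index range and a gap defined as target minus empirical coverage; the function name and the window label format are hypothetical, and the library's windowing may differ.

```python
from collections import defaultdict


def coverage_diagnostics(lower, upper, actuals, target, window_size=50,
                         regimes=None, undercoverage_threshold=0.05):
    """Overall, per-window, and per-regime coverage with an undercoverage flag."""
    hits = [lo <= y <= hi for lo, hi, y in zip(lower, upper, actuals)]
    overall = sum(hits) / len(hits)

    # Coverage per consecutive block of `window_size` observations.
    by_window = {}
    for start in range(0, len(hits), window_size):
        chunk = hits[start:start + window_size]
        by_window[f"{start}-{start + len(chunk) - 1}"] = sum(chunk) / len(chunk)

    # Coverage stratified by regime label, if labels are supplied.
    by_regime = None
    if regimes is not None:
        grouped = defaultdict(list)
        for label, hit in zip(regimes, hits):
            grouped[str(label)].append(hit)
        by_regime = {label: sum(h) / len(h) for label, h in grouped.items()}

    gap = target - overall
    return {
        "overall_coverage": overall,
        "coverage_gap": gap,
        "undercoverage_warning": gap > undercoverage_threshold,
        "coverage_by_window": by_window,
        "coverage_by_regime": by_regime,
    }


diag = coverage_diagnostics(
    lower=[0.0] * 4, upper=[1.0] * 4, actuals=[0.5, 0.5, 2.0, 2.0],
    target=0.9, window_size=2, regimes=["calm", "calm", "volatile", "volatile"],
)
print(diag)
```

In this toy case overall coverage hides the problem: the calm regime is fully covered while the volatile regime is not covered at all, which is the pattern per-regime checks are meant to expose.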