Financial Cross-Validation
Cross-validation with purging and embargo for financial data.
Overview
Implements CV techniques for financial ML, where labels often overlap (e.g., 5-day forward returns computed on consecutive days share 4 days of data). Standard CV leaks information between training and test sets through this overlap.
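The overlap can be checked by counting shared days. The following is a minimal sketch in plain Python, not part of the library; `label_window` is a hypothetical helper that assumes the label at day t is built from the daily returns on days t+1 through t+horizon.

```python
def label_window(t, horizon):
    """Days whose returns feed the forward-return label at day t (assumed t+1 .. t+horizon)."""
    return set(range(t + 1, t + horizon + 1))


horizon = 5
# Labels for two consecutive days, e.g. day 10 and day 11
shared = label_window(10, horizon) & label_window(11, horizon)
n_shared = len(shared)
print(n_shared)  # 4
```

If day 10 lands in the training set and day 11 in the test set, four of the five days behind the test label have already been seen during training.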
Key Concepts
Purging: Remove training samples within purge_gap of any test sample
Embargo: Additional percentage of samples removed after test set
Label overlap: When labels use future data (e.g., forward returns)
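The two removal steps can be sketched on integer sample positions. This is an illustration under stated assumptions, not the library's implementation: `purge_and_embargo` is a hypothetical function, "within purge_gap" is treated as inclusive, the embargo is counted from the end of the test set, and its length is taken as `embargo_pct` of the total sample count.

```python
def purge_and_embargo(n_samples, test_idx, purge_gap, embargo_pct):
    """Return training positions left after purging and embargo (illustrative only)."""
    test = set(test_idx)
    test_end = max(test)
    n_embargo = int(n_samples * embargo_pct)  # assumed: fraction of all samples
    train = []
    for i in range(n_samples):
        if i in test:
            continue
        # Purging: drop samples within purge_gap positions of any test sample
        if any(abs(i - t) <= purge_gap for t in test):
            continue
        # Embargo: drop a further block immediately after the test set
        if test_end < i <= test_end + n_embargo:
            continue
        train.append(i)
    return train


train = purge_and_embargo(
    n_samples=1000, test_idx=range(400, 600), purge_gap=5, embargo_pct=0.01
)
print(len(train))  # 785
```

With a test block at positions 400 to 599, purging removes 395 to 399 and 600 to 604, and the 10-sample embargo extends the removal after the test set to 609, so training keeps 0 to 394 and 610 to 999.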
Classes
PurgedKFold
K-fold with purging and embargo:
from temporalcv.cv_financial import PurgedKFold

cv = PurgedKFold(
    n_splits=5,
    purge_gap=5,  # Days to purge around test set
    embargo_pct=0.01,  # 1% embargo after test
)

for train_idx, test_idx in cv.split(X, y, times=timestamps):
    # train_idx has samples purged that overlap with test_idx
    pass
CombinatorialPurgedCV
All (n choose k) combinations of k test groups drawn from n groups, with purging:
from temporalcv.cv_financial import CombinatorialPurgedCV

cv = CombinatorialPurgedCV(
    n_splits=5,
    purge_gap=5,
)
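To get a feel for how many splits a combinatorial scheme produces, the test-group combinations can be enumerated with the standard library. This sketch does not call the library; `n_groups` and `n_test_groups` are illustrative names and values (n = 5, k = 2), not confirmed parameters of CombinatorialPurgedCV.

```python
from itertools import combinations
from math import comb

n_groups = 5       # n: contiguous groups the samples are divided into
n_test_groups = 2  # k: groups held out per split (hypothetical value)

# Each tuple lists the groups used as the test set in one split
test_group_sets = list(combinations(range(n_groups), n_test_groups))

print(len(test_group_sets))  # 10
print(test_group_sets[0])    # (0, 1)
```

The split count grows quickly with n and k, so the number of model fits is worth checking before running a combinatorial scheme on a large dataset.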
PurgedWalkForward
Walk-forward with purging:
from temporalcv.cv_financial import PurgedWalkForward

cv = PurgedWalkForward(
    n_splits=5,
    purge_gap=5,
    embargo_pct=0.01,
)
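The idea behind walk-forward splitting with a purge gap can be sketched as follows. This is an assumption-laden illustration rather than the library's algorithm: `purged_walk_forward` is a hypothetical generator that uses an expanding training window and equal-sized test blocks, works on integer positions, and leaves out the embargo.

```python
def purged_walk_forward(n_samples, n_splits, purge_gap):
    """Yield (train, test) position lists with a purge gap before each test block."""
    fold = n_samples // (n_splits + 1)  # assumed: equal-sized blocks
    for k in range(1, n_splits + 1):
        test_start = k * fold
        test_end = min(test_start + fold, n_samples)
        # Training stops early so no kept sample is within purge_gap of the test block
        train_end = max(test_start - purge_gap, 0)
        yield list(range(train_end)), list(range(test_start, test_end))


splits = list(purged_walk_forward(n_samples=60, n_splits=5, purge_gap=5))
first_train, first_test = splits[0]
print(first_train)  # [0, 1, 2, 3, 4]
print(first_test)   # [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
```

Training data always precedes the test block here, so the only leakage to guard against is at the boundary just before each test block.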
Data Classes
PurgedSplit
@dataclass(frozen=True)
class PurgedSplit:
    train_indices: np.ndarray  # Indices after purging
    test_indices: np.ndarray  # Test indices
    n_purged: int  # Samples removed by purging
    n_embargoed: int  # Samples removed by embargo
Usage Example
import numpy as np
import pandas as pd

from temporalcv.cv_financial import PurgedKFold

# Financial data with timestamps
X, y = ...
timestamps = pd.date_range(...)

# Label uses 5-day forward returns → purge_gap=5
cv = PurgedKFold(n_splits=5, purge_gap=5, embargo_pct=0.01)

# model and compute_score are placeholders for your estimator and metric
scores = []
for train_idx, test_idx in cv.split(X, y, times=timestamps):
    model.fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    scores.append(compute_score(y[test_idx], pred))
References
López de Prado (2018). “Advances in Financial Machine Learning.” Wiley. Chapter 7.
López de Prado & Lewis (2019). “Detection of False Investment Strategies.”