skill · by derekcrosslu
QuantConnect Validation
QuantConnect walk-forward validation and Phase 5 robustness testing (project)
Installs: 0
Used in: 1 repo
Updated: 1d ago
$ npx ai-builder add skill derekcrosslu/quantconnect-validation
Installs to `.claude/skills/quantconnect-validation/`
# QuantConnect Validation Skill (Phase 5)

**Purpose**: Walk-forward validation for Phase 5 robustness testing before deployment.

**Progressive Disclosure**: This primer contains essentials only. Full details are available via `python SCRIPTS/qc_validate.py --help`.

---

## When to Use This Skill

Load when:

- Running the `/qc-validate` command
- Testing out-of-sample performance
- Evaluating strategy robustness
- Making deployment decisions

**Tool**: Use `python SCRIPTS/qc_validate.py` for walk-forward validation.

---

## Walk-Forward Validation Overview

**Purpose**: Detect overfitting and ensure the strategy generalizes to new data.

**Approach**:

1. **Training (in-sample)**: Develop/optimize on 80% of the data
2. **Testing (out-of-sample)**: Validate on the remaining 20%
3. **Compare**: Measure performance degradation

**Example** (5-year backtest, 2019-2023):

- In-sample: 2019-2022 (4 years), the training period
- Out-of-sample: 2023 (1 year), the testing period

---

## Key Metrics

### 1. Performance Degradation

**Formula**: `(IS Sharpe - OOS Sharpe) / IS Sharpe`

| Degradation | Quality | Decision |
|-------------|---------|----------|
| < 15% | Excellent | Deploy with confidence |
| 15-30% | Acceptable | Deploy but monitor |
| 30-40% | Concerning | Escalate to human |
| > 40% | Severe | Abandon (overfit) |

**Key Insight**: Degradation below 15% indicates a robust strategy that generalizes well.

---

### 2. Robustness Score

**Formula**: `OOS Sharpe / IS Sharpe`

| Score | Quality | Interpretation |
|-------|---------|----------------|
| > 0.75 | High | Strategy robust across periods |
| 0.60-0.75 | Moderate | Acceptable but monitor |
| < 0.60 | Low | Strategy unstable |

**Key Insight**: A score above 0.75 indicates the strategy maintains performance out-of-sample.

---

## Quick Usage

### Run Walk-Forward Validation

```bash
# From hypothesis directory with iteration_state.json
python SCRIPTS/qc_validate.py run --strategy strategy.py

# Custom split ratio (default 80/20)
python SCRIPTS/qc_validate.py run --strategy strategy.py --split 0.70
```

**What it does**:

1. Reads project_id from `iteration_state.json`
2. Splits the date range (80/20 by default)
3. Runs the in-sample backtest
4. Runs the out-of-sample backtest
5. Calculates degradation and robustness
6. Saves results to `PROJECT_LOGS/validation_result.json`

---

### Analyze Results

```bash
python SCRIPTS/qc_validate.py analyze --results PROJECT_LOGS/validation_result.json
```

**Output**:

- Performance comparison table
- Degradation percentage
- Robustness assessment
- Deployment recommendation

---

## Decision Integration

After validation, the decision framework evaluates the results against the following criteria (a code sketch follows):

**DEPLOY_STRATEGY** (Deploy with confidence):
- Degradation < 15% AND
- Robustness > 0.75 AND
- OOS Sharpe > 0.7

**PROCEED_WITH_CAUTION** (Deploy but monitor):
- Degradation < 30% AND
- Robustness > 0.60 AND
- OOS Sharpe > 0.5

**ABANDON_HYPOTHESIS** (Too unstable):
- Degradation > 40% OR
- Robustness < 0.5 OR
- OOS Sharpe < 0

**ESCALATE_TO_HUMAN** (Borderline):
- Results don't clearly fit the above criteria
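Taken together, the two formulas and these thresholds reduce to a small amount of arithmetic. Below is a minimal Python sketch of that logic; `evaluate_validation` is a hypothetical helper shown for illustration, not part of `qc_validate.py`'s actual interface, and it assumes IS Sharpe is positive.

```python
# Hypothetical helper illustrating the degradation/robustness math and the
# decision thresholds above; not part of qc_validate.py's actual interface.

def evaluate_validation(is_sharpe: float, oos_sharpe: float) -> dict:
    """Map in-sample vs. out-of-sample Sharpe to a Phase 5 decision.

    Assumes is_sharpe > 0 (both formulas divide by it).
    """
    degradation = (is_sharpe - oos_sharpe) / is_sharpe  # e.g. (0.97 - 0.89) / 0.97 ~= 0.082
    robustness = oos_sharpe / is_sharpe                 # e.g. 0.89 / 0.97 ~= 0.92

    if degradation < 0.15 and robustness > 0.75 and oos_sharpe > 0.7:
        decision = "DEPLOY_STRATEGY"
    elif degradation < 0.30 and robustness > 0.60 and oos_sharpe > 0.5:
        decision = "PROCEED_WITH_CAUTION"
    elif degradation > 0.40 or robustness < 0.5 or oos_sharpe < 0:
        decision = "ABANDON_HYPOTHESIS"
    else:
        decision = "ESCALATE_TO_HUMAN"

    return {"degradation": degradation, "robustness": robustness, "decision": decision}


# The worked numbers from "Example Decision" below:
print(evaluate_validation(0.97, 0.89)["decision"])  # DEPLOY_STRATEGY
```

Note the ordering: the deploy tiers are checked before the abandon tier, so a result that satisfies a stricter tier never falls through to a weaker one.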
---

## Best Practices

### 1. Time Splits

- **Standard**: 80/20 (4 years training, 1 year testing)
- **Conservative**: 70/30 (more OOS testing)
- **Very Conservative**: 60/40 (extensive testing)

**Minimum OOS period**: 6 months (1 year preferred)
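The date arithmetic behind a split is simple. A minimal sketch using only the standard library (no QuantConnect dependency; `split_dates` is an illustrative name, not `qc_validate.py`'s API):

```python
# Minimal sketch of a ratio-based time split. The boundary day ends the
# in-sample period; out-of-sample starts the following day.
from datetime import date, timedelta

def split_dates(
    start: date, end: date, split: float = 0.80
) -> tuple[tuple[date, date], tuple[date, date]]:
    """Return ((is_start, is_end), (oos_start, oos_end)) for a given ratio."""
    total_days = (end - start).days
    boundary = start + timedelta(days=int(total_days * split))
    return (start, boundary), (boundary + timedelta(days=1), end)

# 5-year backtest at 80/20: IS = 2019-01-01..2022-12-31, OOS = 2023-01-01..2023-12-31
insample, oos = split_dates(date(2019, 1, 1), date(2023, 12, 31))
```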
---

### 2. Never Peek at Out-of-Sample

**CRITICAL RULE**: Never adjust the strategy based on OOS results.

- OOS is for **testing only**
- Adjusting based on OOS defeats the purpose of validation
- If you adjust, OOS becomes in-sample

---

### 3. Check Trade Count

Both periods need sufficient trades:

- **In-sample**: Minimum 30 trades (50+ preferred)
- **Out-of-sample**: Minimum 10 trades (20+ preferred)

Too few trades = unreliable validation.

---

### 4. Compare Multiple Metrics

Don't just look at Sharpe:

- Sharpe Ratio degradation
- Max Drawdown increase
- Win Rate change
- Profit Factor degradation
- Trade Count consistency

All metrics should degrade similarly for a robust strategy.

---

## Common Issues

### Severe Degradation (> 40%)

**Cause**: Strategy overfit to the in-sample period

**Example**:
- IS Sharpe: 1.5 → OOS Sharpe: 0.6
- Degradation: 60%

**Decision**: ABANDON_HYPOTHESIS

**Fix for next hypothesis**: Simplify (fewer parameters), longer training period

---

### Different Market Regimes

**Cause**: IS was a bull market, OOS was a bear market

**Example**:
- 2019-2022 (bull): Sharpe 1.2
- 2023 (bear): Sharpe -0.3

**Decision**: Not necessarily overfit, but not robust across regimes

**Fix**: Test across multiple regimes, add regime detection

---

### Low Trade Count in OOS

**Cause**: Strategy stops trading in the OOS period

**Example**:
- IS: 120 trades → OOS: 3 trades

**Decision**: ESCALATE_TO_HUMAN (insufficient OOS data)

---

## Integration with /qc-validate

The `/qc-validate` command workflow:

1. Read `iteration_state.json` for project_id and parameters
2. Load this skill for the validation approach
3. Modify the strategy for time splits (80/20)
4. Run in-sample and OOS backtests
5. Calculate degradation and robustness
6. Evaluate using the decision framework
7. Update `iteration_state.json` with results
8. Git commit with a validation summary

---

## Reference Documentation

**Need implementation details?** All reference documentation is accessible via `--help`:

```bash
python SCRIPTS/qc_validate.py --help
```

**That's the only way to access the complete reference documentation.**

Topics covered in `--help`:

- Walk-forward validation methodology
- Performance degradation thresholds
- Monte Carlo validation techniques
- PSR/DSR statistical metrics
- Common errors and fixes
- Phase 5 decision criteria

The primer above covers 90% of use cases. Use `--help` for edge cases and detailed analysis.

---

## Related Skills

- **quantconnect** - Core strategy development
- **quantconnect-backtest** - Phase 3 backtesting (`qc_backtest.py`)
- **quantconnect-optimization** - Phase 4 optimization (`qc_optimize.py`)
- **decision-framework** - Decision thresholds
- **backtesting-analysis** - Metric interpretation

---

## Key Principles

1. **OOS is sacred** - Never adjust the strategy based on OOS results
2. **Degradation < 15% is excellent** - The strategy generalizes well
3. **Robustness > 0.75 is the target** - Maintains performance OOS
4. **Trade count matters** - Need sufficient trades in both periods
5. **Multiple metrics** - All should degrade similarly for robustness

---

## Example Decision

```
In-Sample (2019-2022):   Sharpe: 0.97, Drawdown: 18%, Trades: 142
Out-of-Sample (2023):    Sharpe: 0.89, Drawdown: 22%, Trades: 38

Degradation: 8.2% (< 15%)
Robustness:  0.92 (> 0.75)

→ DEPLOY_STRATEGY (minimal degradation, high robustness)
```

---

**Version**: 2.0.0 (Progressive Disclosure)
**Last Updated**: November 13, 2025
**Lines**: ~190 (was 463)
**Context Reduction**: 59%

Details
- Type: skill
- Author: derekcrosslu
- Slug: derekcrosslu/quantconnect-validation
- Created: 4d ago