Add research autolearn, counterfactual twin, and deep backfill tooling #13

Merged
ArielB1980 merged 8 commits into main from chore/split-research-automation
Mar 19, 2026

ArielB1980 (Owner) commented Mar 18, 2026

Summary

  • add continuous research autolearn/campaign-gate/filter/funnel tooling and wire richer replay/live gate instrumentation to make blocker diagnosis deterministic
  • add counterfactual twin utilities and tests for decision-tape uplift comparison and batch ranking of candidates
  • add CoinAPI-backed deep backfill path plus related data acquisition support to improve research candle completeness
  • extend targeted unit coverage for research evaluator eligibility/timeout behavior and per-symbol config resolver behavior

Test plan

  • uv run pytest tests/unit/test_research_filter_report.py tests/unit/test_research_autolearn.py tests/unit/test_research_campaign_gate.py tests/unit/test_research_evaluator_eligibility.py tests/unit/test_research_evaluator_replay_timeout.py tests/unit/test_counterfactual_twin.py
  • reviewed git diff --stat main...HEAD for scoped changes
  • run wider integration suite before merge (recommended)

Made with Cursor


Note

Medium Risk
Medium risk because it touches core trading/replay pathways (LiveTrading, ExecutionGateway, replay exchange sim) with new replay-mode bypasses and logging/instrumentation that could affect behavior if accidentally enabled in non-replay environments.

Overview
Adds a new continuous research toolchain: scripts to run campaign-level stop gating, auto-generate next-cycle env overrides from prior runs, summarize ablation batches, and produce blocker/funnel reports from run logs and recorded decision events.

Introduces counterfactual decision-tape instrumentation by emitting COUNTERFACTUAL_DECISION/COUNTERFACTUAL_ACTION events with stable decision_ids, plus a new counterfactual_twin module to stitch these events into a deterministic tape and score parameter overrides via utility-uplift ranking.

Expands the replay/research harness for higher determinism and throughput: symbol resolution across spot/futures naming, explicit order-update polling, forced flatten-at-end, optional DB mocking, relaxed replay gates/limits via env flags, and additional replay metrics (e.g., paused ticks, trades closed).

Adds CoinAPI-backed deep backfill support: a new CoinAPIClient, DataAcquisition.fetch_spot_historical(..., source=...) routing, and a deep_backfill_daily.py utility for large lookback candle ingestion; also updates research control/nightly scripts to default to replay-oriented settings and add continuous-daemon commands.

Written by Cursor Bugbot for commit 31003e9.

Automate post-cycle learning overrides, campaign stop gating, and blocker/funnel diagnostics while wiring replay research and live decision paths with richer telemetry and tests so non-baseline discovery can be debugged deterministically.

Made-with: Cursor
Introduce deterministic decision-tape scoring utilities for candidate comparisons and add focused tests so promotion decisions can be grounded in measured utility uplift.

Made-with: Cursor
Provide a CoinAPI OHLCV client and deep backfill driver so missing higher-timeframe history can be filled reliably when Kraken coverage is incomplete.

Made-with: Cursor
Validate that symbol override resolution applies only to matching symbols and preserves base config values for non-target symbols.

Made-with: Cursor
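
The behavior this commit's tests describe (overrides apply only to the matching symbol, and other symbols keep base values) can be illustrated with a minimal sketch. The function name `resolve_symbol_config`, the config keys, and the symbol strings are hypothetical and are not taken from the repository's resolver.

```python
from typing import Any, Dict, Mapping


def resolve_symbol_config(
    base: Mapping[str, Any],
    overrides: Mapping[str, Mapping[str, Any]],
    symbol: str,
) -> Dict[str, Any]:
    """Return base config with overrides applied only for the exact symbol."""
    resolved = dict(base)  # copy so the shared base config is never mutated
    resolved.update(overrides.get(symbol, {}))  # no entry means no change
    return resolved


# Hypothetical values, for illustration only.
BASE = {"risk_pct": 0.5, "cooldown_minutes": 30}
OVERRIDES = {"BTC/USD": {"risk_pct": 0.25}}

btc_config = resolve_symbol_config(BASE, OVERRIDES, "BTC/USD")
eth_config = resolve_symbol_config(BASE, OVERRIDES, "ETH/USD")
```

Here `btc_config` picks up the overridden `risk_pct` while keeping the base `cooldown_minutes`, and `eth_config` equals the base values unchanged.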

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 4 potential issues.

Comment thread src/live/live_trading.py Outdated
# Stop processing (no new entries while kill switch is active)
return

Replay bypass skips kill switch return, allowing continued tick processing

High Severity

When replay_fail_open is True (DATA_FAILURE kill switch in replay mode), the code logs a warning but does not return. Execution falls through past the entire kill switch block into normal tick processing (order timeouts, data health checks, analysis, etc.). The original code always returned after handling an active kill switch. For non-DATA_FAILURE kill switch reasons in replay mode, the else branch handles them and returns — but the DATA_FAILURE path silently continues. This means replay ticks proceed with an active kill switch, which contradicts the safety invariant that no new entries occur while the kill switch is active.
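
The control flow this comment asks for can be sketched as follows. This is an illustration under assumptions, not the code in src/live/live_trading.py: `KillSwitch`, `handle_tick`, and the returned status strings are hypothetical names; only `replay_fail_open` and the DATA_FAILURE reason come from the comment.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class KillSwitch:
    """Hypothetical stand-in for the real kill switch state."""
    active: bool = False
    reason: str = ""


def handle_tick(kill_switch: KillSwitch, replay_mode: bool) -> str:
    """Return "halted" when the kill switch blocks the tick, else "processed"."""
    if kill_switch.active:
        replay_fail_open = replay_mode and kill_switch.reason == "DATA_FAILURE"
        if replay_fail_open:
            # Replay may log more leniently, but it must still stop the tick.
            logger.warning("Replay: kill switch active (%s)", kill_switch.reason)
        else:
            logger.error("Kill switch active (%s)", kill_switch.reason)
        # Single return shared by both branches: no new entries while active.
        return "halted"
    # Normal tick processing (order timeouts, data health, analysis) goes here.
    return "processed"
```

Placing the return after the if/else, rather than inside one branch, makes it harder for a newly added branch to fall through into normal processing.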

requested_leverage=requested_leverage,
spec_summary=None,
)
continue

NO_SPEC bypass proceeds without spec causing downstream failures

Medium Severity

When _replay_relaxed_signal_gates is True and no instrument spec is found, the code logs and continues instead of skipping. However, spec remains None and is used downstream for step size calculation, min-size enforcement, and position sizing. Code later references spec.size_step, spec.tick_size, etc., which will raise AttributeError on None. The bypass only logs — it doesn't provide a fallback spec object.
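
One way to avoid the None dereference is to guard every use of the spec, as in the sketch below. `InstrumentSpec` and `round_size_to_step` are hypothetical names used for illustration; only the `size_step` and `tick_size` attributes are mentioned in the comment, and the real sizing code may differ.

```python
from dataclasses import dataclass
from decimal import ROUND_DOWN, Decimal
from typing import Optional


@dataclass(frozen=True)
class InstrumentSpec:
    """Hypothetical spec carrying only the attributes named in the review."""
    size_step: Decimal
    tick_size: Decimal


def round_size_to_step(
    raw_size: Decimal, spec: Optional[InstrumentSpec]
) -> Optional[Decimal]:
    """Round a size down to the spec's step, or return None without a spec."""
    if spec is None:
        # Without a spec there is no step to round to, so the caller must
        # skip the symbol instead of reaching spec.size_step.
        return None
    steps = (raw_size / spec.size_step).to_integral_value(rounding=ROUND_DOWN)
    return steps * spec.size_step
```

A caller in relaxed replay mode could log the NO_SPEC event for diagnostics and still skip sizing whenever this returns None.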

return raw
cached = self._symbol_resolution_cache.get(raw)
if cached:
return cached

Symbol resolution cache treats empty string as cache miss

Low Severity

_resolve_market_symbol checks if cached:, which treats any falsy cached value as a cache miss. For strings the only falsy value is the empty string, and the early return for empty raw keeps empty inputs out of the lookup, so the practical impact is small. If a resolution ever stored an empty string, however, the lookup would miss and re-resolve on every call, causing a performance issue.
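
A lookup that distinguishes "not cached" from "cached but falsy" can be sketched as below. The `SymbolResolver` class, its constructor, and the `resolve_calls` counter are hypothetical scaffolding for the example; only `_symbol_resolution_cache` and the shape of the lookup come from the snippet under review.

```python
from typing import Callable, Dict


class SymbolResolver:
    """Hypothetical wrapper around a symbol-resolution function."""

    def __init__(self, resolve: Callable[[str], str]) -> None:
        self._resolve = resolve
        self._symbol_resolution_cache: Dict[str, str] = {}
        self.resolve_calls = 0  # counts cache misses, for illustration

    def resolve_market_symbol(self, raw: str) -> str:
        if not raw:
            return raw
        cached = self._symbol_resolution_cache.get(raw)
        if cached is not None:  # a cached "" now counts as a hit
            return cached
        self.resolve_calls += 1
        resolved = self._resolve(raw)
        self._symbol_resolution_cache[raw] = resolved
        return resolved
```

With `is not None`, a resolver that returns an empty string is called once per symbol rather than on every lookup.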

-int(r["candidate_open_count"]),
),
reverse=True,
)

Counterfactual twin sort order inverts candidate ranking

Medium Severity

evaluate_candidate_batch sorts by a tuple including -int(r["candidate_open_count"]) as the third key with reverse=True. The negative sign combined with reverse sort means candidates with more opens rank lower (double negation). If the intent is to rank higher-open-count candidates higher when uplift and delta are tied, the negation produces the opposite order.
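
The double negation is easy to reproduce in isolation. In the sketch below, only `candidate_open_count` is taken from the reviewed snippet; the `name`, `utility_uplift`, and `delta` fields and both helper functions are hypothetical stand-ins for the first two sort keys.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Two candidates tied on the first two keys, differing only in open count.
ROWS: List[Row] = [
    {"name": "few_opens", "utility_uplift": 1.0, "delta": 0.5, "candidate_open_count": 3},
    {"name": "many_opens", "utility_uplift": 1.0, "delta": 0.5, "candidate_open_count": 10},
]


def rank_with_negation(rows: List[Row]) -> List[str]:
    """Negated third key plus reverse=True: fewer opens win the tie."""
    ranked = sorted(
        rows,
        key=lambda r: (
            float(r["utility_uplift"]),
            float(r["delta"]),
            -int(r["candidate_open_count"]),
        ),
        reverse=True,
    )
    return [r["name"] for r in ranked]


def rank_without_negation(rows: List[Row]) -> List[str]:
    """Plain third key plus reverse=True: more opens win the tie."""
    ranked = sorted(
        rows,
        key=lambda r: (
            float(r["utility_uplift"]),
            float(r["delta"]),
            int(r["candidate_open_count"]),
        ),
        reverse=True,
    )
    return [r["name"] for r in ranked]
```

Which variant is correct depends on the intended tie-break; the point is that the sign and `reverse=True` must be chosen together.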

Restore kill-switch early return semantics in replay fail-open mode, prevent NO_SPEC ablation from dereferencing missing specs, correct counterfactual candidate tie-break sorting, harden symbol-resolution cache lookup, and ensure GitHub Actions installs pytest before running unit tests.

Made-with: Cursor
Install pytest-asyncio in replay gate workflow so async unit tests collect successfully under GitHub Actions.

Made-with: Cursor
Use safe decimal coercion for strategy config values, restore stop order type used by missing-stop placement, normalize precision drift in SMC golden reasoning comparisons, and complete TP backfill test fixtures with explicit signal cooldown config.

Made-with: Cursor
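
As an illustration of what safe decimal coercion for config values might look like, the sketch below falls back to a default instead of raising. The helper name `to_decimal`, its default, and its handling of booleans and non-finite values are assumptions, not the repository's implementation.

```python
from decimal import Decimal, InvalidOperation
from typing import Any


def to_decimal(value: Any, default: Decimal = Decimal("0")) -> Decimal:
    """Coerce a config value to Decimal, returning default when it is unusable."""
    if isinstance(value, Decimal):
        return value if value.is_finite() else default
    if value is None or isinstance(value, bool):
        return default
    try:
        # Going through str() avoids binary float artifacts such as
        # Decimal(0.1) == Decimal("0.1000000000000000055511151231...").
        coerced = Decimal(str(value))
    except (InvalidOperation, ValueError):
        return default
    return coerced if coerced.is_finite() else default
```

Rejecting NaN and infinity keeps a malformed config entry from propagating into later arithmetic.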
Normalize numeric literals in SMC reasoning to 9 decimal places so snapshot comparisons are stable across minor platform floating-point representation differences.

Made-with: Cursor
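
The normalization described in this commit can be approximated with a small regular-expression pass over the reasoning text. This is a sketch under assumptions: the function name, the regex, and the sample strings are hypothetical, and the repository may normalize differently.

```python
import re

# Matches plain decimal literals such as 0.1 or -12.345 (no exponent form).
_NUMERIC_LITERAL = re.compile(r"-?\d+\.\d+")


def normalize_numeric_literals(text: str, places: int = 9) -> str:
    """Rewrite each decimal literal in text with a fixed number of places."""

    def _format(match: "re.Match[str]") -> str:
        return f"{float(match.group(0)):.{places}f}"

    return _NUMERIC_LITERAL.sub(_format, text)


# Hypothetical reasoning strings that differ only by float representation.
drifted = normalize_numeric_literals("entry zone at 0.30000000000000004")
clean = normalize_numeric_literals("entry zone at 0.3")
```

After normalization both strings read `entry zone at 0.300000000`, so a snapshot comparison no longer depends on how the platform printed the float.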
@ArielB1980 ArielB1980 merged commit ede5d7d into main Mar 19, 2026
4 checks passed