fix(research): unblock optimizer, fix phantom trades, parallelize by ArielB1980 · Pull Request #16 · ArielB1980/Kbot

ArielB1980 · 2026-03-31T06:18:16Z

Summary

Fix 5 compounding root causes killing all signal generation — config section mismatch (strategy.* → risk.*), holdout sentinel (-999%), partial data sentinel, score thresholds too high for neutral bias, 4H structure gate with no fallback
Fix phantom trade DB leak — research replay was writing to production trades table when DATABASE_URL was set (disable_db_mock=True on prod). Changed to always mock. Cleaned 4,517+ phantom trades from production DB.
Add guided parameter mutation — ParameterMemory tracks per-parameter score deltas, uses softmax-weighted selection and directional bias. Adaptive step decay (annealing) from 100% → 30%. Cross-run warm-start persistence.
Quiet Telegram — only notify on meaningful results (new best candidate, run complete, convergence, promotion)
Parallelize research across CPUs — splits symbols across N-1 workers (3 on 4-CPU droplet), merges results for post-run hooks. ~3x speedup.
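
The guided-mutation item in the summary names three mechanisms: per-parameter score deltas, softmax-weighted selection, and step annealing from 100% to 30%. A minimal sketch of how they could fit together follows. Only the `ParameterMemory` name comes from the PR. The method names, the linear decay schedule, and the temperature parameter are assumptions made for illustration, not the repository's implementation.

```python
import math
import random
from collections import defaultdict


class ParameterMemory:
    """Hypothetical stand-in: remembers (direction, score_delta) per parameter."""

    def __init__(self):
        self.deltas = defaultdict(list)

    def record(self, name, direction, score_delta):
        # direction is +1 (value was increased) or -1 (value was decreased).
        self.deltas[name].append((direction, score_delta))

    def mean_delta(self, name):
        history = self.deltas[name]
        return sum(d for _, d in history) / len(history) if history else 0.0

    def preferred_direction(self, name):
        # Sign of the direction-weighted delta sum; 0 means no evidence yet.
        s = sum(direction * d for direction, d in self.deltas[name])
        return (s > 0) - (s < 0)


def softmax_weights(values, temperature=1.0):
    # Subtract the max before exponentiating for numerical stability.
    m = max(values)
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def step_scale(iteration, total_iterations, start=1.0, end=0.3):
    # Assumed linear anneal from 100% to 30% of the base step size.
    if total_iterations <= 1:
        return start
    frac = min(max(iteration / (total_iterations - 1), 0.0), 1.0)
    return start + (end - start) * frac


def pick_parameter(memory, names, rng):
    # Parameters whose past mutations improved the score are picked more often.
    weights = softmax_weights([memory.mean_delta(n) for n in names])
    return rng.choices(names, weights=weights, k=1)[0]
```

Softmax selection keeps some probability on every parameter, so a parameter with a poor history can still be revisited later in the run.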

Commits

fix(research): unblock optimizer — strip env overrides, widen exploration, add bounds
chore: add ruff as dev dependency
fix: eliminate phantom trades and unblock research signal generation
feat(research): expand optimizer to full entry+exit+risk parameter space
feat(research): quiet Telegram — only notify on meaningful results
fix(research): unblock optimizer — 5 root causes killing all signal generation
fix(research): stop replay phantom trades polluting production DB
feat(research): parallelize research across available CPUs

Test plan

Research daemon running with 3 parallel workers on production (BTC/ETH, SOL/XRP, ADA/LINK)
Verified 0 phantom trades written since DB mock fix deployed
Verified optimizer is actively generating trades in replay (13 BTC/USD baseline trades, 353 positions across iterations)
Cleaned production DB: 232 verified real trades remain, -$15.05 total PnL
Live bot stopped pending research results (no capital at risk)

🤖 Generated with Claude Code

…tion, add bounds

Root cause: REPLAY_OVERRIDE_* env vars exported by the continuous daemon were overriding the very config parameters research was trying to mutate, making all candidates behave identically to baseline ("uninformative surface" after 4 probes, every symbol, every cycle: 1,503 cycles with 0 results).

Fix 1: Strip REPLAY_OVERRIDE_* env vars during replay evaluation so that config_overrides from the harness actually reach the strategy engine.

Fix 2: Raise uninformative_surface_probe_count from 4 to 12 so the harness tries more mutations before giving up on a symbol.

Fix 3: When baseline produces 0 trades, use aggressive gate-lowering exploration for the first 6 iterations to find the signal region.

Fix 5: Add PARAMETER_BOUNDS with min/max per parameter to prevent degenerate values. _mutate_params now clamps to bounds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
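
Fix 1 and Fix 5 above can be sketched as follows. This is an illustration under assumptions, not the code from this PR: `stripped_env` and `clamp_to_bounds` are hypothetical helper names, and the bounds shown are placeholder values rather than the real `PARAMETER_BOUNDS` contents.

```python
import os
from contextlib import contextmanager

# Placeholder bounds for illustration; the real table is not shown in this PR text.
PARAMETER_BOUNDS = {"strategy.fib_proximity_bps": (20.0, 160.0)}


@contextmanager
def stripped_env(prefix="REPLAY_OVERRIDE_"):
    """Temporarily remove env vars with the given prefix, restoring them on exit."""
    removed = {k: os.environ.pop(k) for k in list(os.environ) if k.startswith(prefix)}
    try:
        yield removed
    finally:
        os.environ.update(removed)


def clamp_to_bounds(params, bounds=PARAMETER_BOUNDS):
    """Clamp each known parameter into [min, max]; unknown parameters pass through."""
    clamped = {}
    for name, value in params.items():
        if name in bounds:
            lo, hi = bounds[name]
            value = min(max(value, lo), hi)
        clamped[name] = value
    return clamped
```

Wrapping only the replay evaluation in `stripped_env()` leaves the daemon's own environment intact outside that block.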

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Phantom trade fixes (6):
- Add 30-min dedup guard in phantom position import to prevent restart-loop duplicates
- Add 6-hour age gate on trade recording to skip stale historical artifacts
- Re-classify phantom positions after Case D purge in production takeover
- Mark stale zero-qty positions as trade_recorded on persistence load
- Isolate replay harness position DB via POSITION_PERSISTENCE_PATH env var
- Harden systemd restart policy (on-failure, 30s delay, burst limits)

Research pipeline fixes (4):
- Widen fib_proximity_bps 60→120 bps (max 80→160), the singular signal chokepoint
- Lower neutral score thresholds (tight_smc 65→55, wide_structure 60→50) for environments without 200-day EMA data
- Fix instrument spec registry clobbering: merge duplicate _index() calls into single call combining both BTC and XBT-aliased spec formats
- Replace synthetic min_size=1 (whole unit) with realistic per-asset minimums (BTC: 0.0001, ETH: 0.001, etc.) so position sizing works with small equity

Research allowlist additions:
- strategy.fib_proximity_bps, strategy.min_score_tight_smc_neutral, strategy.min_score_wide_structure_neutral now optimizable

Control test script added for March 12 validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
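
The first two phantom trade fixes are both time-window checks. A rough sketch, assuming timezone-aware timestamps and a per-symbol record of the last import time, is below. The function names and the dictionary layout are hypothetical; only the 30-minute and 6-hour windows come from the commit message.

```python
from datetime import datetime, timedelta, timezone

DEDUP_WINDOW = timedelta(minutes=30)  # from the commit message
MAX_TRADE_AGE = timedelta(hours=6)    # from the commit message


def is_duplicate_import(symbol, now, last_import_at):
    """True if this symbol was already imported within the dedup window."""
    previous = last_import_at.get(symbol)
    return previous is not None and now - previous < DEDUP_WINDOW


def should_record_trade(closed_at, now):
    """Skip trades whose close time is older than the age gate."""
    return now - closed_at <= MAX_TRADE_AGE


now = datetime(2026, 3, 31, 6, 0, tzinfo=timezone.utc)
last_import_at = {"BTC/USD": now - timedelta(minutes=5)}
```

With the sample values above, a second `BTC/USD` import five minutes after the first is treated as a restart-loop duplicate.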

The research optimizer was only tuning 18 entry gate parameters while the core problem was fundamentally bad exits: 6.5% win rate, 100% stop-loss exits, 36-second average hold times. Stops were 0.15-0.30 ATR (hit by noise instantly), and TP/trailing/risk params were completely locked.

Changes:
- Expand allowlist from 18 → 33 optimizable parameters
- Add exit mechanics: TP1/TP2 R-multiples, close percentages, runner %, trailing stop ATR multiplier
- Add risk sizing: risk_per_trade_pct, target_leverage
- Add cost constraints: tight_smc_cost_cap_bps, min_rr_multiple, fee_edge_multiple_k
- Widen stop loss bounds significantly (tight_smc max from 1.5 → 3.0 ATR, wide_structure max from 2.5 → 4.0 ATR)
- Handle Optional config sections (multi_tp) in param reader with bound midpoint fallback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
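
The last change, reading a parameter from an Optional config section and falling back to the bound midpoint, might look roughly like this. The `Config` and `MultiTP` dataclasses, the `read_param` name, and the field `tp1_r_multiple` are assumptions for illustration; the PR text only names the `multi_tp` section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MultiTP:
    tp1_r_multiple: float = 1.0  # hypothetical field name


@dataclass
class Config:
    multi_tp: Optional[MultiTP] = None  # the section may be absent


def read_param(config, dotted_path, bounds):
    """Walk a dotted path; if any section is None, return the bound midpoint."""
    node = config
    for part in dotted_path.split("."):
        node = getattr(node, part, None)
        if node is None:
            lo, hi = bounds[dotted_path]
            return (lo + hi) / 2.0
    return node


bounds = {"multi_tp.tp1_r_multiple": (1.0, 3.0)}  # illustrative bounds
```

The midpoint gives the optimizer a neutral starting value when the baseline config never set the section at all.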

Add important flag to _notify(). Default is quiet (log only). Only these events send to Telegram: new best candidate, convergence, run completed, replay gate results, and promotion queued. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…eneration

1. Wrong config section in allowlist: tight_smc_cost_cap_bps, tight_smc_min_rr_multiple, fee_edge_multiple_k live on RiskConfig not StrategyConfig. Every replay crashed with "StrategyConfig has no field" → -999% sentinel → all symbols skipped. Fixed prefix from strategy.* to risk.*.

2. Holdout sentinel (-999%) caused immediate symbol skip: when holdout window had 0 trades, evaluator returned -999% sentinel which triggered _is_non_informative_baseline → optimizer never tried any mutations. Changed to return 0% with penalty score so optimizer can explore looser parameters.

3. partial_data_non_comparable produced -999% sentinel: symbols with incomplete 15m coverage (30 days vs 120 requested) hit this path. Changed to return 0-trade metrics instead of hard failure.

4. Score thresholds too high for neutral bias: without EMA200 (needs 200 candles, only 120 available), all signals get neutral bias → HTF alignment capped at 10, EMA slope = 0 → max realistic score ~60. Lowered neutral thresholds: tight_smc_neutral 55→35, wide_structure_neutral 50→30.

5. 4H_STRUCTURE_REQUIRED hard gate blocked ~25% of signals: 1H structure fallback existed in code but was disabled. Enabled structure_fallback_enabled=true with reduced penalty (15→5 points).

Result: research now generates 9+ trades per baseline eval (was 0).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
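
Root causes 2 and 3 share a pattern: a hard -999% sentinel is replaced by ordinary zero-trade metrics with a penalty, so the optimizer keeps exploring. A sketch under assumptions follows. `EvalResult`, `evaluate_holdout`, `is_non_informative`, and the penalty value are hypothetical and do not reflect the repository's evaluator.

```python
from dataclasses import dataclass

SENTINEL_RETURN_PCT = -999.0  # the old hard-failure marker
ZERO_TRADE_PENALTY = -1.0     # assumed penalty value, for illustration only


@dataclass
class EvalResult:
    return_pct: float
    trades: int
    score: float


def evaluate_holdout(trade_returns_pct):
    """Return 0% with a penalty score for an empty holdout instead of a sentinel."""
    if not trade_returns_pct:
        return EvalResult(return_pct=0.0, trades=0, score=ZERO_TRADE_PENALTY)
    total = sum(trade_returns_pct)
    return EvalResult(return_pct=total, trades=len(trade_returns_pct), score=total)


def is_non_informative(result):
    # Only a genuine sentinel short-circuits the symbol; 0-trade results do not.
    return result.return_pct <= SENTINEL_RETURN_PCT


empty = evaluate_holdout([])
```

A penalized zero still ranks below any candidate that trades with a better score, but it no longer causes the symbol to be skipped.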

The evaluator passed disable_db_mock=True when DATABASE_URL was set, causing the replay harness to write simulated trades directly into the production trades table. This produced 4,517 phantom trades with <60s duration and -$54K fake PnL. Fix: always use the DB mock during research replay. Replay reads candle data from CSV files, not the DB — it never needs real DB writes. Also cleaned 4,517 phantom trade records from production DB. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
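
The shape of the fix is that replay gets an in-memory store regardless of environment. The sketch below is illustrative only: `InMemoryTradeStore` and `make_replay_trade_store` are hypothetical names, not the repository's mock.

```python
class InMemoryTradeStore:
    """Hypothetical mock: keeps simulated trades in a list, never touches a DB."""

    def __init__(self):
        self.trades = []

    def record_trade(self, trade):
        self.trades.append(trade)


def make_replay_trade_store(database_url=None):
    # database_url is accepted but deliberately ignored: research replay
    # must never write to a real database, whatever the environment says.
    return InMemoryTradeStore()


store = make_replay_trade_store(database_url="postgresql://prod.example/trades")
store.record_trade({"symbol": "BTC/USD", "pnl": -1.25})
```

Ignoring the URL inside the factory means a stray `DATABASE_URL` on the production host cannot re-enable real writes.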

Split symbol optimization across N-1 workers (one per CPU minus OS/collector). Each worker gets isolated state, logs, and output dirs. Results are merged after all workers complete so post-run hooks work unchanged. On 4-CPU droplet: 3 workers × 2 symbols = ~3x faster than sequential. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
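
The split-and-merge step can be sketched as below, assuming contiguous chunks and per-symbol result dictionaries. The function names are hypothetical; the symbol grouping matches the three worker pairs listed in the test plan.

```python
import os


def worker_count(cpu_count=None):
    """N-1 workers, leaving one CPU for the OS and collector; never fewer than 1."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, cpus - 1)


def split_symbols(symbols, n_workers):
    """Split symbols into contiguous chunks, at most one chunk per worker."""
    if not symbols:
        return []
    chunk = -(-len(symbols) // n_workers)  # ceiling division
    return [symbols[i:i + chunk] for i in range(0, len(symbols), chunk)]


def merge_results(per_worker):
    """Combine per-worker {symbol: result} dicts so post-run hooks see one dict."""
    merged = {}
    for result in per_worker:
        merged.update(result)
    return merged


symbols = ["BTC", "ETH", "SOL", "XRP", "ADA", "LINK"]
groups = split_symbols(symbols, worker_count(4))
```

Because each symbol belongs to exactly one worker, the merge has no key conflicts to resolve.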

Move StartLimitIntervalSec/StartLimitBurst to [Unit] section (correct placement per systemd docs). Add one-shot trading_review.py diagnostic. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

uv sync --dev uses [dependency-groups] not [project.optional-dependencies]. pytest was only in the latter, causing CI to fail with "Failed to spawn: pytest". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…dels shim These types were referenced by position_manager_v2.py and position_evaluator.py but never defined in the shim module, causing ImportError in CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The parallel worker monitoring loop was dying every ~10 minutes because `[[ -n "" ]] && VAR="val"` returns exit 1 when the test fails, which under `set -e` kills the entire daemon. Changed && chains to if/then, wrapped loop in set +e, and added error tolerance to state merge step. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Three changes to fix replay workers using only 30% CPU:

1. Disable OHLCV fetcher rate limiting in replay (semaphore 8→1000, min_delay 200ms→0, retries 3→1). The exchange sim is in-memory, so there is no real API to throttle against. This was the main bottleneck (95% of time spent in epoll_wait).

2. Set LOG_LEVEL=WARNING for research workers. INFO logging was producing 280MB/worker of I/O, choking disk and CPU.

3. Support ${VAR:-default} syntax in config.yaml env var expansion.

Result: CPU utilization 16%→71%, per-worker 30%→75%, load 1.3→3.3.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
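
Change 3 follows shell parameter expansion, where `${VAR:-default}` yields the default when the variable is unset or empty. A small regex-based sketch is below. The `expand_env` name is hypothetical, and leaving an unset `${VAR}` with no default untouched is an assumption, not confirmed behavior of the config loader.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}; the default may not contain "}".
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env(text, env=None):
    """Expand ${VAR} and ${VAR:-default} placeholders in a config string."""
    env = os.environ if env is None else env

    def repl(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if value:  # ":-" treats unset and empty the same way
            return value
        if default is not None:
            return default
        # No default given: keep an empty value, leave unset placeholders as-is.
        return value if value is not None else match.group(0)

    return _PATTERN.sub(repl, text)
```

For example, `expand_env("level: ${LOG_LEVEL:-INFO}", {})` falls back to `INFO`, while a worker environment with `LOG_LEVEL=WARNING` overrides it.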

Two changes to improve research result quality:

1. Default objective mode net_pnl_only → risk_adjusted. The composite score includes drawdown penalty (-0.8), Sharpe/Sortino (0.35), and win rate (0.1), which is critical for avoiding high-variance curve fits that would fail live (19% win rate baseline).

2. Promotion minimum trades 10 → 20. With 120-day windows and 30% holdout, 10 trades is too few for statistical confidence.

Also fixed .env on production server:
- RESEARCH_CONT_WINDOW_OFFSETS: 0 → 0,90,180 (3 walk-forward windows)
- RESEARCH_CONT_REPLAY_TIMEFRAMES: added 1m back

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
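
The commit message gives the weights but not how the terms are combined. One plausible reading is a weighted sum added to net return, sketched below. The additive form, the `Metrics` fields, and the function names are all assumptions; only the three weights and the 20-trade minimum come from the text.

```python
from dataclasses import dataclass

# Weights quoted in the commit message; the way they combine is assumed.
WEIGHTS = {"drawdown": -0.8, "risk_ratio": 0.35, "win_rate": 0.1}


@dataclass
class Metrics:
    net_return: float    # e.g. 0.12 for +12%
    max_drawdown: float  # positive fraction, e.g. 0.05 for a 5% drawdown
    risk_ratio: float    # Sharpe or Sortino
    win_rate: float      # fraction of winning trades
    trades: int


def risk_adjusted_score(m):
    """Assumed additive composite: return plus weighted risk terms."""
    return (
        m.net_return
        + WEIGHTS["drawdown"] * m.max_drawdown
        + WEIGHTS["risk_ratio"] * m.risk_ratio
        + WEIGHTS["win_rate"] * m.win_rate
    )


def eligible_for_promotion(m, min_trades=20):
    return m.trades >= min_trades


steady = Metrics(net_return=0.10, max_drawdown=0.05, risk_ratio=1.0, win_rate=0.4, trades=25)
spiky = Metrics(net_return=0.10, max_drawdown=0.30, risk_ratio=0.2, win_rate=0.19, trades=12)
```

Under this reading, two candidates with the same net return are separated by their drawdown and risk ratio, which is the behavior the commit message asks for.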

Two new post-cycle diagnostics that test whether SMC signal logic has genuine edge, independent of parameter optimization:

1. Signal Directional Accuracy: intercepts every signal during replay and measures if price moves in the predicted direction at 1h/4h/24h. Reports hit rate, p-value (binomial test vs 50%), and breakdown by setup type (OB/FVG/BOS/TREND) and direction.

2. Random Entry Baseline: runs N replay trials with random entries at the same frequency as the real strategy but using the same risk management stack. If random entries produce similar returns, the signals have no edge.

Both run automatically after each research cycle (alongside the existing counterfactual twin) with timeout protection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
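
The binomial test against a 50% hit rate can be computed exactly with the standard library. The sketch below assumes a two-sided test; the PR text does not say whether the diagnostic is one-sided or two-sided, and `binomial_two_sided_p` is a hypothetical name.

```python
from math import comb


def binomial_two_sided_p(hits, n, p=0.5):
    """Exact two-sided p-value: total probability of outcomes that are
    no more likely than the observed hit count under the null rate p.

    Suitable for moderate n; very large n would overflow the float conversion.
    """
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[hits]
    # Small relative tolerance so float ties on the opposite tail are counted.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)


p_coin_flip = binomial_two_sided_p(5, 10)    # hit rate exactly 50%
p_all_right = binomial_two_sided_p(10, 10)   # every signal correct
```

A hit rate of exactly 50% gives a p-value of 1, and ten correct calls out of ten gives 2/1024, which shows how quickly evidence accumulates even on small samples.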

ArielB1980 and others added 8 commits March 30, 2026 23:18

chore: add ruff as dev dependency

1d30954

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ArielB1980 changed the title ~~fix(research): unblock optimizer and harden parameter exploration~~ fix(research): unblock optimizer, fix phantom trades, parallelize Apr 2, 2026

ArielB1980 and others added 7 commits April 2, 2026 16:26

chore: fix systemd unit section and add trading review script

51d1ea8

fix(ci): add pytest and dev tools to dependency-groups

5341d2f

fix(ci): add missing ActionType, ManagementAction, DecisionTick to mo…

9011b42

ArielB1980 merged commit 3b9efd9 into main Apr 5, 2026
1 of 2 checks passed

ArielB1980 merged 15 commits into main from ArielB1980/audit-research-value
