
autonomous trader agent

#python #fastapi #postgresql #docker #github actions #zerodha kite connect

autonomous trading system for indian equity markets using cross-sectional reversal scoring on 96 nifty stocks. backtested at 8.6% cagr with a 60% win rate over 5.4 years.

/ the strategy

  • cross-sectional reversal — ranks 96 nifty stocks by magnitude of decline over a 5-21 day lookback, buys the most oversold, holds for 5 trading days
  • the edge is behavioral: panic selling pushes stocks below fair value, creating a mean-reversion opportunity that algorithms can't easily arbitrage away
  • information coefficient: +0.020 (large-cap), +0.025 (midcap) — a small but consistent edge compounded over thousands of trades
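
the ranking and the information coefficient described above can be sketched as follows. this is a minimal illustration, not the project's code: the function names, the single 5-day lookback, and the choice of spearman rank correlation as the ic definition are all assumptions.

```python
from statistics import mean

def reversal_scores(closes, lookback=5):
    """Score each stock by its decline over `lookback` days (higher = more oversold)."""
    scores = {}
    for symbol, prices in closes.items():
        if len(prices) <= lookback:
            continue  # not enough history for this lookback
        past, last = prices[-lookback - 1], prices[-1]
        scores[symbol] = -(last / past - 1.0)  # a negative return becomes a positive score
    return scores

def pick_most_oversold(scores, top_n=3):
    """Return the top_n symbols with the largest decline."""
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

def _ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def information_coefficient(scores, forward_returns):
    """Assumed definition: spearman rank correlation of score vs. forward return."""
    symbols = [s for s in scores if s in forward_returns]
    if len(symbols) < 2:
        return 0.0
    rx = _ranks([scores[s] for s in symbols])
    ry = _ranks([forward_returns[s] for s in symbols])
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5
```

a positive ic means the most oversold names tended to have the best subsequent returns. values near +0.02 are small per trade, which is why the edge only shows up across many trades.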

/ research journey

  • tested 6 strategies systematically before finding the edge
  • 5 failed: intraday ml prediction, breakout detection, 5-min mean reversion, 30-min trend following, cross-sectional ml — indian large-cap stocks are too efficient at intraday resolution
  • daily reversal was the only signal that survived — driven by human psychology, not technical patterns
  • evolved through 4 versions of allocation logic, each improving capital efficiency — the underlying signal never changed

/ how it works

  • 3-state regime classifier (bull/neutral/weak) using nifty vs 50-dma, momentum, and market breadth with a 2-day persistence filter
  • adaptive confidence scoring: continuous 0-1 score combining ic, rolling win rate, momentum, and breadth for smooth capital allocation
  • risk controls: regime-based exposure gates, soft drawdown dampening, recovery boost, kill switches on declining win rates or negative ic, panic filters
  • a/b pipeline testing with independent scan intervals, capital pools, and paper broker instances for isolated comparison
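
a rough sketch of the 3-state classifier with its 2-day persistence filter. the voting rule and every threshold here are assumptions chosen for illustration; the source only names the inputs (nifty vs 50-dma, momentum, breadth) and the persistence length.

```python
def raw_regime(nifty_close, dma_50, momentum, breadth):
    """Unfiltered daily regime from three binary votes (thresholds are assumed)."""
    votes = 0
    votes += 1 if nifty_close > dma_50 else -1  # index above its 50-day average
    votes += 1 if momentum > 0 else -1          # recent index return is positive
    votes += 1 if breadth > 0.5 else -1         # more than half the stocks advancing
    if votes == 3:
        return "bull"
    if votes == -3:
        return "weak"
    return "neutral"

class PersistentRegime:
    """Switch regime only after the new state repeats for `persistence` days."""

    def __init__(self, initial="neutral", persistence=2):
        self.current = initial
        self.persistence = persistence
        self._candidate = None
        self._streak = 0

    def update(self, raw):
        if raw == self.current:
            # the signal agrees with the active regime, so drop any pending switch
            self._candidate, self._streak = None, 0
        elif raw == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = raw, 1
        if self._candidate is not None and self._streak >= self.persistence:
            self.current = self._candidate
            self._candidate, self._streak = None, 0
        return self.current
```

with this filter a single-day flip (for example one "weak" reading inside a bull run) never changes the active regime, which is the whipsaw protection the persistence rule is meant to provide.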

/ results

  • backtested over 5.4 years (oct 2020 – jan 2025): 8.6% cagr, 42% total return, 60% win rate
  • survived the 2025-26 bear market with 6.5% cagr and 9-16% max drawdown
  • large-cap returns: +38% | midcap returns: +108% (2.8x higher)
  • ~52% average capital deployment — the rest held as a protective cash buffer

/ what's next

  • this trading system is the target policy for rl training; the stock-trader-env project provides the verifiable reward environment
  • goal: use grpo to train the agent's decision-making on thousands of simulated rollouts, optimizing for sharpe ratio and risk discipline
  • replacing rule-based scoring with a learned policy that adapts to market conditions
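
for context, grpo scores each rollout relative to the other rollouts in its group: the group mean reward is subtracted and the result is divided by the group standard deviation. the sketch below pairs that with a sharpe-style reward. the reward definition and function names are assumptions; the actual reward used by stock-trader-env is not described here.

```python
from statistics import mean, pstdev

def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns (risk-free rate assumed zero)."""
    sd = pstdev(returns)
    if sd == 0:
        return 0.0
    return mean(returns) / sd * periods_per_year ** 0.5

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize each reward within its rollout group."""
    mu, sd = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sd + eps) for r in rewards]

# hypothetical group of three simulated rollouts, each a list of daily returns
rollouts = [
    [0.01, -0.005, 0.002],
    [0.0, 0.001, -0.02],
    [0.004, 0.003, 0.005],
]
rewards = [sharpe_ratio(r) for r in rollouts]
advantages = group_relative_advantages(rewards)
```

rollouts with above-average sharpe in their group get positive advantages and are reinforced; below-average ones are pushed down.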

/ how it works

01 regime classifier evaluates market conditions (bull/neutral/weak)
02 confidence scorer computes allocation weight from ic, win rate, momentum, breadth
03 reversal scanner ranks stocks by decline magnitude across lookback windows
04 risk guardian validates exposure limits, drawdown gates, and kill switches
05 trade executor places orders via zerodha kite connect (cnc for swing holding)
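
step 02 can be pictured as a weighted blend of normalized inputs. this is only a sketch under stated assumptions: the weights, the normalization ranges, and the function names are invented for illustration, and the real scorer may combine the four inputs differently.

```python
def clamp01(x):
    """Clip a value into the 0-1 range."""
    return max(0.0, min(1.0, x))

def confidence_score(ic, win_rate, momentum, breadth,
                     weights=(0.3, 0.3, 0.2, 0.2)):
    """Continuous 0-1 confidence from four signals (all scalings are assumed)."""
    components = (
        clamp01(ic / 0.05),               # assumed: ic of +0.05 or more maps to 1
        clamp01((win_rate - 0.4) / 0.3),  # assumed: 40% win rate -> 0, 70% -> 1
        clamp01(0.5 + momentum / 0.1),    # assumed: -5% momentum -> 0, +5% -> 1
        clamp01(breadth),                 # breadth given as a fraction of stocks
    )
    return sum(w * c for w, c in zip(weights, components)) / sum(weights)
```

because every component moves smoothly between 0 and 1, a small change in win rate or ic nudges the allocation weight instead of flipping it on or off the way a hard threshold would.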

/ features

cross-sectional reversal scoring
ranks 96 nifty stocks by decline magnitude. information coefficient: +0.020 (large-cap), +0.025 (midcap). exploits behavioral overreaction: a structural edge driven by psychology, not patterns algorithms can arbitrage away.

3-state regime classifier
classifies market as bull (65-85% exposure), neutral (50-75%), or weak (8-40%) using nifty vs 50-dma, momentum, and breadth. 2-day persistence filter prevents whipsawing.

adaptive confidence scoring
continuous 0-1 scoring combining information coefficient, rolling win rate, momentum, and market breadth. replaces hard thresholds for smoother capital allocation.

a/b pipeline testing
two independent pipelines with separate scan intervals and capital pools. each pipeline runs its own paper broker instance for isolated comparison.

risk management layers
regime-based exposure gates, soft drawdown dampening (gentle in bull, aggressive in weak), recovery boost when signal improves during drawdown recovery, and kill switches that pause trading on declining win rates or negative ic.

research-driven development
tested 6 strategies systematically before finding the edge. 5 failed (ml prediction, breakouts, intraday mean reversion, trend following, cross-sectional ml). every version improvement came from better capital allocation; the signal never changed.
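
the exposure bands and risk layers listed above could fit together roughly like this. the band values come from the regime classifier description; the linear interpolation by confidence, the dampening factors, and the fixed win-rate floor are assumptions (the source describes the kill switch as reacting to declining win rates, which this sketch simplifies to a floor). the recovery boost is left out.

```python
# exposure bands per regime, taken from the feature description above
EXPOSURE_BANDS = {"bull": (0.65, 0.85), "neutral": (0.50, 0.75), "weak": (0.08, 0.40)}

# assumed dampening strength: gentle in bull, aggressive in weak
DAMPENING = {"bull": 1.0, "neutral": 2.0, "weak": 4.0}

def kill_switch(ic, win_rate, min_win_rate=0.45):
    """Pause trading on negative ic or a win rate below an assumed floor."""
    return ic < 0 or win_rate < min_win_rate

def target_exposure(regime, confidence, drawdown, ic, win_rate):
    """Fraction of capital to deploy; drawdown is a fraction (0.10 = 10%)."""
    if kill_switch(ic, win_rate):
        return 0.0
    low, high = EXPOSURE_BANDS[regime]
    # confidence picks a point inside the regime's band (assumed linear)
    base = low + (high - low) * max(0.0, min(1.0, confidence))
    # soft dampening: scale exposure down as drawdown grows
    damp = max(0.0, 1.0 - DAMPENING[regime] * drawdown)
    return base * damp
```

for example, under these assumed factors a 10% drawdown trims bull-regime exposure by a tenth but cuts weak-regime exposure by 40%.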