Weather Bot — Episode 14: When 163 Trades Rewrote the Strategy — The v2.9 Overhaul
One month of running the bot. 163 closed trades. Starting capital: $198. Current balance: $199.
One dollar. That's what a month of 24/7 automated weather trading on Polymarket got me. On paper the bot was profitable: 38 wins, 125 losses, +$6.93 in net profit on closed trades. But most of what the winners earned was being eaten by trades that should never have been placed in the first place.
So I sat down with the full dataset and let it tell me what to change.
What This Post Covers
The analysis session that rewrote the bot's strategy from the ground up. Two filters that turned a barely-surviving 23% win rate into a 32% winner with +44% ROI. The eight changes I shipped at once, the deployment that immediately went wrong, and the uncomfortable lesson about the gap between backtesting and real operation.
The Number That Started Everything
v2.8's full scorecard after one month:
| Metric | Value |
|---|---|
| Total trades | 163 |
| Wins / Losses | 38 / 125 |
| Win rate | 23.3% |
| Net profit | +$6.93 |
| Avg win | $3.10 |
| Avg loss | $0.89 |
| Breakeven win rate | 22.2% |
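Where does that breakeven figure come from? With asymmetric payoffs, the breakeven win rate is just avg_loss / (avg_win + avg_loss). A quick sanity check against the table (the rounded averages give 22.3%; the table's 22.2% presumably comes from the unrounded values):

```python
avg_win, avg_loss = 3.10, 0.89
# Breakeven: p * avg_win = (1 - p) * avg_loss  =>  p = loss / (win + loss)
breakeven = avg_loss / (avg_win + avg_loss)
print(f"{breakeven:.1%}")  # 22.3%
```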
I was beating breakeven by 1.1 percentage points. That's not a strategy — that's a coin landing on its edge. One bad week and the whole thing flips negative.
But 163 trades is enough data to start asking real questions. Not "is the bot profitable?" but "which trades are profitable?"
The Golden Zone
I split the trades by two dimensions: when I bought (hours before peak temperature) and how much I paid (entry price).
The timing data was stark. Trades bought 36-48 hours before peak had a 32% win rate. Trades bought within 18 hours of peak? 31 out of 35 lost, an 11% win rate. I'd known same-day was bad since Episode 9, but the data showed it was worse than I thought: anything under 24 hours was bleeding money.
The price data told the same story from a different angle. Entry prices under $0.15 returned +121% ROI. Under $0.20, still +45%. Under $0.25, still +13%. But the moment I crossed $0.25, returns went negative. $0.25-$0.30 entry: -34% ROI. Above $0.30: -52%.
When I combined both filters — 36-48 hours before peak AND entry under $0.25 — the numbers jumped out:
13 wins out of 40 trades. 32% win rate. +44% ROI.
Same data. Same trades. Just filtered. The edge wasn't missing — it was hiding inside the noise of all the trades that shouldn't have been placed.
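The slicing itself was nothing fancy. A minimal sketch of the two-filter cut, assuming a trade log with entry_price, hours_to_peak, cost, and pnl columns (the column names are mine, not the bot's actual schema):

```python
import pandas as pd

trades = pd.read_json("trades.json")  # hypothetical export of the trade log

golden = trades[
    trades["hours_to_peak"].between(36, 48)
    & (trades["entry_price"] < 0.25)
]

win_rate = (golden["pnl"] > 0).mean()
roi = golden["pnl"].sum() / golden["cost"].sum()
print(f"{len(golden)} trades, {win_rate:.0%} win rate, {roi:+.0%} ROI")
```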
What I Changed
Eight modifications shipped in one update. That sounds reckless, but when 163 data points all point the same direction, it's not a gamble — it's overdue.
Entry price capped at $0.25. Eleven cities had their entry threshold lowered. The data was clear: above $0.25, you're paying too much for shares that don't resolve in your favor often enough. One exception — Miami stayed at $0.30 because it had a 62% win rate at higher prices (5 out of 8 trades), the only city where expensive entries actually worked.
Minimum buy hours set to 36. Nothing bought closer than 36 hours to peak. The 36-48h window was the golden zone. Everything below it was losing money.
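Both filters compress into a couple of constants and one gate. A minimal sketch of the v2.9 entry check; the cap values are from the data above, but the function and names are mine:

```python
MAX_ENTRY_PRICE = 0.25
ENTRY_PRICE_OVERRIDES = {"miami": 0.30}  # the one city where pricier entries won
MIN_BUY_HOURS = 36  # later relaxed to 30 (see the hotfix below)

def passes_entry_filters(city: str, price: float, hours_to_peak: float) -> bool:
    # Reject anything too expensive or too close to peak temperature
    cap = ENTRY_PRICE_OVERRIDES.get(city, MAX_ENTRY_PRICE)
    return price < cap and hours_to_peak >= MIN_BUY_HOURS
```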
Dynamic bias removed. This one surprised me. The dynamic bias system from Episode 12 — adjusting today's forecast based on yesterday's error — was supposed to make the bot learn. Instead, it was amplifying noise.
```python
# v2.8: "learning" from yesterday
adjusted = model_max + (fixed_bias * 0.7 + yesterday_error * 0.3)

# v2.9: just use the fixed correction
adjusted = model_max + fixed_bias
```
The numbers: fixed bias averaged 1.69° error with 62% accuracy. Dynamic bias averaged 2.03° error with 55% accuracy. The "learning" was making predictions worse, not better. Yesterday's temperature error turned out to be a terrible predictor of today's error.
Models swapped for 13 cities. With 20+ data points per city now, I recalculated which model actually performed best using raw bias (actual minus model prediction). Six cities changed in Episode 11. Now thirteen more flipped. Ankara went from GFS to AIFS. Tokyo from GFS to JMA. Miami from ICON to AIFS. Wellington from MEDIAN to ECMWF at 100% accuracy.
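The recalculation itself is straightforward once each city has 20+ data points. Roughly the selection logic, assuming a per-city history of (model, predicted max, actual max) records; the helper name and data shape are my own:

```python
from collections import defaultdict
from statistics import mean

def best_model_for_city(history):
    """history: iterable of (model_name, predicted_max, actual_max).
    Picks the model with the smallest mean absolute raw bias
    (actual minus model prediction)."""
    abs_bias = defaultdict(list)
    for model, predicted, actual in history:
        abs_bias[model].append(abs(actual - predicted))
    return min(abs_bias, key=lambda m: mean(abs_bias[m]))
```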
The other four changes were smaller: three cities deactivated (Denver 33% accuracy, Seattle 28%, Toronto 43%), all positions capped at one bucket per city-date (two-bucket trades had -21.9% ROI from Episode 10), the forecast-change entry logic removed entirely (15 triggers, 0 fills — dead code), and fixed bias values recalculated for every active city.
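Of those, the one-bucket cap is the only one that needed actual logic. A sketch of the guard, with illustrative field names:

```python
def already_holding(positions, city, market_date):
    # v2.9: at most one bucket per city-date; skip any second entry
    return any(
        p["city"] == city and p["date"] == market_date
        for p in positions
    )
```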
The Deployment That Went Wrong
I pushed v2.9 on April 8 and waited for the Telegram reports. First scan: zero opportunities. Second scan: zero. Third: zero. An entire day passed with exactly one trade.
The 36-hour minimum was killing the bot.
Here's what I hadn't considered: the 36-48h window is only 12 hours wide. And it opens at a different time for each city depending on when peak temperature hits. For Asian cities with UTC 5-6 peaks, the window opened during my bot's active scanning hours. For European cities (UTC 13-14 peak), the 48-hour mark fell outside the scan window. For US cities (UTC 19-22), most of the golden zone happened when the previous day's markets hadn't opened yet.
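To make that concrete: the buy window for a given market is pinned entirely to that city's peak time. A sketch of the arithmetic (the date is illustrative):

```python
from datetime import datetime, timedelta, timezone

def buy_window_utc(peak_dt, min_h=36, max_h=48):
    """UTC interval during which a market sits inside the buy window."""
    return peak_dt - timedelta(hours=max_h), peak_dt - timedelta(hours=min_h)

# A US city peaking at 20:00 UTC: the 36-48h window runs from 20:00
# two days before peak to 08:00 the day before peak, hours when the
# market for that date may not even be listed yet.
peak = datetime(2025, 4, 10, 20, 0, tzinfo=timezone.utc)
print(buy_window_utc(peak))
```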
One trade per day across 21 cities. The math was right but the strategy was operationally dead.
The Hotfix
I pulled back to 30 hours minimum. That widened the window from 12 hours to 18 hours, catching more cities during their optimal buy periods. Not as clean as the 36-48h golden zone, but it brought daily trades back to a workable number.
To track whether this compromise costs me, I added htp (hours to peak) to every position in positions.json. Now I can separately analyze 30-36h trades versus 36-48h trades and see if the wider window is actually hurting or just adding volume.
```python
# Track when each trade was placed: hours-to-peak at entry,
# stored on every position in positions.json
position["htp"] = round((peak_dt - now_utc).total_seconds() / 3600, 1)
```
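That makes the follow-up analysis trivial once enough trades accumulate, e.g. splitting the log by entry band (just a sketch over the positions list):

```python
# Compare the compromise band against the original golden zone
band_30_36 = [p for p in positions if 30 <= p["htp"] < 36]
band_36_48 = [p for p in positions if 36 <= p["htp"] <= 48]
```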
The lesson was one I keep relearning: backtested optimal values and operational reality aren't the same thing. Data analysis tells you what to do. Actually running the bot tells you what's possible to do. The gap between the two is where most strategies die.
What's Coming
v2.9 is running. The filters are tighter, the models are updated, the dead code is gone. Whether the 30-36h compromise holds up or costs me — that's what the next few weeks of data will answer.
But there's a bigger change on the horizon. Polymarket announced a V2 infrastructure overhaul on April 6 — new order structure, new token format (USDC.e → Polymarket USD), and a full orderbook reset. Every GTC order gets cancelled during migration. The Python SDK will need an update. My bot will need to pause and adapt.
That's Episode 15.
Key Takeaways
- 163 trades, $1 profit. The bot was technically profitable but barely surviving at 23.3% win rate vs 22.2% breakeven.
- The golden zone: 36-48h before peak + entry under $0.25 = 32% win rate, +44% ROI. Same data, just filtered.
- Dynamic bias made predictions worse. Fixed bias: 62% accuracy. Dynamic: 55%. Yesterday's error is noise, not signal.
- Backtested optimum (36h minimum) cut daily trade volume to a single trade. Pulled back to 30h. The gap between theory and operation is real.
← Previous: Episode 13: The Fee Change, the Ghost Trade, and Where Things Stand Now
Next: Episode 15 (coming soon) →
More updates on the way. If you're working on something similar or found a smarter way to do it, drop it in the comments — the more we share, the faster we all move.
Disclaimer: This blog documents my personal learning journey. Nothing here is financial advice.