Weather Bot — Episode 3: I Lost 7 Straight Weather Trades on Polymarket — Every Mistake Exposed
I almost didn't write this one.
After building what I thought was a solid system — three weather models, a clean forecasting pipeline, confidence filters — I ran my first week of DRY RUN. Paper trading, no real money at risk. Seven verified trades.
The result: 0-7. Every single trade lost.
And it wasn't random. Every actual temperature came in higher than my forecast, and not by random amounts — by a consistent, painful margin. Something was systematically wrong, and I had no idea what.
What This Post Covers
The full damage report from my first week of weather trading on Polymarket. Every trade, every number, every wrong assumption. If you're thinking about building a prediction market bot, this is the episode that might save you from making the same mistakes I did — or at least convince you to paper trade first.
The First Three Trades: March 5
I remember refreshing the Telegram bot report that morning, feeling pretty confident. Three cities, three forecasts, all based on the average of GFS, ECMWF, and ICON.
Ankara: My forecast said 10.2°C. The actual temperature at Esenboğa Airport came in at 11.1°C. Off by 0.9°C. Not terrible by weather forecasting standards, honestly. But I'd bet on the 10°C bucket, and the answer was 11°C. Wrong bucket. Zero payout.
Seoul: Forecast 7.0°C, actual 9.4°C. Off by 2.4°C. That's not even close. I'd bet on 7°C and the real answer was 9°C. Two whole buckets away.
Seattle: Forecast 49.4°F, actual 51°F. Another miss, another wrong bucket.
Three trades, three losses. I told myself it was a bad day. Weather is variable. Tomorrow would be better.
March 7: Seoul Again
Forecast 3.3°C for Seoul. Actual: 5.0°C. Off by 1.7°C, same direction as before — actual higher than forecast.
Four for four now, all misses, all in the same direction. I was starting to notice the pattern but still wasn't sure if it was real or just a small sample.
March 10: The Full Picture
Three more trades. Ankara: forecast 9.4°C, actual 10.0°C, off by 0.6°C — my closest miss yet, but still the wrong bucket. Seoul: forecast 5.0°C, actual 7.2°C, off by 2.2°C. Same story, same direction.
And then Wellington: forecast 15.6°C, actual 16.1°C. Technically this one hit — the 16°C bucket resolved correctly because the model maximum happened to be 16.0°C. One win out of seven.
But 1/7 is a 14% win rate. At that rate, I'd burn through my $198 in a couple of weeks.
What Was Actually Going Wrong
I spent two days staring at the numbers after that week. A few things became clear, and none of them were what I expected.
Averaging the models was pulling me down. My approach was simple: take GFS, ECMWF, and ICON, average them, use that as my forecast. It felt like the responsible thing to do. More data, less noise, right?
Look at Seoul on March 10. GFS predicted 4.1°C. ECMWF predicted 5.8°C. ICON predicted 4.9°C. My average came out to 4.9°C, so I bet on the 5°C bucket. The actual temperature was 7.2°C.
ECMWF was the closest at 5.8°C — still wrong, but in the right neighborhood. GFS and ICON dragged the average down by almost a full degree. And this pattern repeated across nearly every trade. ECMWF ran high, the other two ran low, and the average split the difference in the wrong direction.
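The Seoul numbers make the problem easy to see in a few lines. This is just a sketch replaying that one trade — the variable names are mine, not the bot's:

```python
# Seoul, March 10 — per-model forecasts (°C) as reported in this post.
forecasts = {"GFS": 4.1, "ECMWF": 5.8, "ICON": 4.9}
actual = 7.2

avg = sum(forecasts.values()) / len(forecasts)  # 4.9 → I bet the 5°C bucket
mx = max(forecasts.values())                    # 5.8 → would have pointed at 6°C

print(f"average: {avg:.1f}°C, error {actual - avg:.1f}°C")  # average: 4.9°C, error 2.3°C
print(f"maximum: {mx:.1f}°C, error {actual - mx:.1f}°C")    # maximum: 5.8°C, error 1.4°C
```

The maximum still misses this particular trade, but it cuts the error from 2.3°C to 1.4°C — the two low models drag the average a full degree away from reality.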
Being "close" means absolutely nothing. This was the hardest one to accept. Ankara on March 5: my forecast was 10.2°C, the actual was 11.1°C. In weather forecasting, 0.9°C error is genuinely good. Meteorologists would be happy with that.
But on Polymarket, 10.2°C goes in the 10°C bucket and 11.1°C goes in the 11°C bucket. Different bucket, total loss. The market doesn't care how close you were. You're either in the right bucket or you're not. There's no "almost" payout.
This is the thing I wish someone had told me before I started. Forecast accuracy and prediction market win rate are completely different metrics. You can have excellent forecasts and still lose every trade.
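The gap between those two metrics fits in a few lines. Here I'm assuming the market resolves a temperature to the nearest whole degree — that matches every trade in this post, but check each market's actual resolution rule before trusting it:

```python
def bucket(temp: float) -> int:
    """Map a temperature to its market bucket (assumed: nearest whole degree).
    Note: Python's round() uses banker's rounding at exact .5 values."""
    return round(temp)

# Ankara, March 5: a 0.9°C error — great meteorology, total loss.
forecast, actual = 10.2, 11.1
error = abs(actual - forecast)            # ~0.9°C, excellent by forecast standards
win = bucket(forecast) == bucket(actual)  # bucket 10 vs bucket 11: zero payout

print(f"{error:.1f}°C off, win={win}")  # 0.9°C off, win=False
```

Forecast error is continuous; the payoff is binary. Optimizing one tells you almost nothing about the other.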
The EV formula looked great on paper. In my earlier bot versions (v1.0 through v1.3), I was using expected value calculations to pick trades. The math was clean: EV = P(win) × net payout − P(lose) × cost. If EV was positive, buy.
Problem was, the formula kept flagging cheap tail bets. A bucket priced at $0.04 might show a calculated EV of +15%, but it was priced at $0.04 for a reason — it wasn't going to hit. My probability estimates in the tails were garbage. I was buying lottery tickets and calling it a "positive expected value strategy."
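Here's roughly how that trap works, with made-up probabilities for illustration. The "true" probability below is hypothetical — not knowing it is exactly the problem:

```python
def ev(p_win: float, price: float) -> float:
    """EV per $1 share: winning pays (1 - price), losing forfeits the price.
    Algebraically this simplifies to p_win - price."""
    return p_win * (1 - price) - (1 - p_win) * price

price = 0.04    # a cheap tail bucket
p_model = 0.08  # what my overconfident tail estimate claimed
p_true = 0.02   # hypothetical reality: the market priced the tail better than I did

print(f"EV I computed: {ev(p_model, price):+.2f} per share")  # positive: looks like a buy
print(f"actual EV:     {ev(p_true, price):+.2f} per share")   # negative: a lottery ticket
```

Since EV reduces to p_win − price, the whole edge lives in the probability estimate. A few points of overconfidence in the tails flips the sign, and the tails were exactly where my estimates were worst.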
Every error went the same direction. All seven actual temperatures were higher than my forecast. Seven out of seven. That's not bad luck, that's systematic bias. Something about either the models, the API, or the specific airport stations was consistently producing readings higher than predicted.
I didn't have enough data to pin down exactly why. Maybe Open-Meteo's post-processing differs from raw model output. Maybe airport tarmac creates micro-climate heating effects the models don't capture. Whatever the cause, the data was screaming: your forecasts are too low.
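One thing the data does support, whatever the cause, is a crude bias estimate: average the forecast-vs-actual residuals over the logged trades. A sketch using the six Celsius trades from this post (Seattle excluded to keep the units consistent; whether the bias stays stable going forward is an open question at this point):

```python
# (forecast, actual) pairs in °C from the week's trade log above.
trades = [(10.2, 11.1), (7.0, 9.4), (3.3, 5.0),
          (9.4, 10.0), (5.0, 7.2), (15.6, 16.1)]

residuals = [actual - forecast for forecast, actual in trades]
bias = sum(residuals) / len(residuals)  # every residual positive: systematic, not noise

print(f"mean bias: {bias:+.2f}°C")  # about +1.38°C

def corrected(forecast: float) -> float:
    """Naively shift a new forecast by the observed mean bias."""
    return forecast + bias
```

Six data points is nowhere near enough to trust the exact offset, but the sign is unambiguous: seven out of seven in the same direction doesn't happen by luck.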
The One Win That Pointed Forward
Wellington on March 10 was the only trade that worked. And looking at it closely, the reason was obvious.
The three models predicted 16.0, 15.7, and 15.1°C. The spread was only 0.9°C — all three basically agreed. And the highest model value (16.0°C) was almost exactly right. Actual: 16.1°C, which resolves as 16°C.
If I'd used the model maximum instead of the average for every trade, I'd have been closer on most of them. Not perfect, but closer. Wellington was the proof of concept sitting right there in my own failure data.
Where I Stood After Week One
Seven trades, one accidental win, $0 of real money lost (thank god for DRY RUN), and a growing spreadsheet of numbers that all said the same thing: the averaging approach doesn't work, the EV formula doesn't work, and whatever I do next needs to account for the fact that actual temperatures consistently run higher than what these models predict.
I had two options. Give up, or change the approach entirely. Not tweak it — throw out the core assumptions and start over.
I started looking at what the traders who were actually making money — gopfan2, meropi, 1pixel — were doing differently. Not what they said in interviews or Twitter threads. What their on-chain trading patterns actually showed.
Key Takeaways
- 0-7 in the first week. Every actual temperature came in higher than the forecast. Systematic, not random.
- Model averaging pulls your prediction down when one model (ECMWF) is consistently more accurate than the others. The model maximum was closer in almost every case.
- Forecast accuracy ≠ win rate. Being off by 0.9°C is great meteorology but it's a total loss on Polymarket if it lands you in the wrong bucket.
- EV calculations are only as good as your probability estimates. In the tails, my estimates were basically random.
What's Next
In Episode 4, I'll get into what I found when I studied the profitable traders' actual patterns — and the embarrassingly simple set of rules that replaced all my complex math.
← Previous: Episode 2: GFS, ECMWF, ICON Trading Signals Next: Episode 4: Why Simple Price Rules Beat →
More updates on the way. If you're working on something similar or found a smarter way to do it, drop it in the comments — the more we share, the faster we all move.
Disclaimer: This blog documents my personal learning journey. Nothing here is financial advice.
