We Asked 7 AI Agents to Predict the 2026 World Cup: Here's What They Said

Can AI help predict the winner of the 2026 World Cup? We put seven of the best models to the test.

Jun 8, 2026

The 2026 World Cup kicks off in days, which means half the planet is about to pretend it can predict the future.

Everybody's got a take. Your group chat has one. Your fútbol-obsessed coworker has one. And this year, so does the smartest software ever built.

AI has, perhaps not so quietly, turned into our go-to oracle. We let these models write our emails, debug our code, plan our holidays and diagnose the 3 a.m. rash—so of course we also ask them who lifts the trophy. They'll crunch the squads, weigh the form, and hand you a champion with a certainty the rest of us can only fake.

I've pulled this party trick before—an AI dream team on my March Madness bracket (which sucked), a homemade HorseGPT on the Kentucky Derby (which was actually kind of good). Equal parts genuinely useful and deeply humbling.

So with the biggest tournament on Earth almost here, we ran it back—bigger than ever.

We created Hermes agents, configured them with access to statistics sites (the free ones, not the ones that cost one kidney per month to use), set them up with custom skills and handed seven of the world's most advanced AI models the same job: forecast the 2026 World Cup, champion down to the also-rans, and show their work. Each got the real draw—48 teams, 12 groups, the full bracket—and total freedom on how to crack it.

Then we sat back and let them argue.

Four picked Spain. Three picked Argentina. And the line between them turned out to be less about football than about which numbers each machine chose to trust.

Here's what all seven said—pick your side.

Opus 4.8 Max — The Meteorologist

Pick Spain 20% / Dixon-Coles Poisson + Monte Carlo bracket · final: Spain def. France

Anthropic's Opus 4.8 Max treated the World Cup like a physics problem. It took each team's Elo rating, turned the gaps into expected goals with a Dixon-Coles model—the kind bookmakers actually use—and simulated the bracket thousands of times. Spain came out the champion at 20%, past France in the final, with Portugal and England beaten in the semis.
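For the curious, the Elo-to-goals-to-simulation pipeline can be sketched in a few lines of Python. This is a toy version, not Opus's actual code: the team names, ratings, BASE_GOALS and ELO_SCALE below are all made-up assumptions, and it uses plain independent Poisson goal counts, leaving out the low-score correction that a full Dixon-Coles model adds.

```python
import math
import random

BASE_GOALS = 1.35  # assumed average goals per team per match (illustrative)
ELO_SCALE = 0.4    # assumed log-goal shift per 400 Elo points (illustrative)


def expected_goals(elo_a, elo_b):
    """Map an Elo gap to a pair of Poisson scoring rates (hypothetical mapping)."""
    shift = (elo_a - elo_b) / 400.0 * ELO_SCALE
    return BASE_GOALS * math.exp(shift), BASE_GOALS * math.exp(-shift)


def sample_poisson(rate, rng):
    """Knuth's method: multiply uniform draws until the product drops below e^-rate."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def play_knockout(team_a, team_b, ratings, rng):
    """Simulate one knockout tie; a level score goes to a 50/50 shootout."""
    rate_a, rate_b = expected_goals(ratings[team_a], ratings[team_b])
    goals_a = sample_poisson(rate_a, rng)
    goals_b = sample_poisson(rate_b, rng)
    if goals_a == goals_b:
        return rng.choice([team_a, team_b])
    return team_a if goals_a > goals_b else team_b


def simulate_titles(bracket, ratings, runs=10000, seed=0):
    """Play the bracket `runs` times and return each team's share of titles.

    `bracket` lists teams in draw order; its length must be a power of two.
    """
    rng = random.Random(seed)
    titles = {team: 0 for team in bracket}
    for _ in range(runs):
        alive = list(bracket)
        while len(alive) > 1:
            alive = [
                play_knockout(alive[i], alive[i + 1], ratings, rng)
                for i in range(0, len(alive), 2)
            ]
        titles[alive[0]] += 1
    return {team: wins / runs for team, wins in titles.items()}


# Placeholder teams and ratings, invented purely for illustration.
toy_ratings = {"Team A": 2100, "Team B": 1900, "Team C": 1800, "Team D": 1700}
toy_odds = simulate_titles(["Team A", "Team D", "Team B", "Team C"], toy_ratings)
```

Even in a four-team toy bracket the top seed does not win every run, which is the whole point of simulating: the output is a share of titles, not a verdict.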

Its real obsession, though, was everything happening off the ball. Opus was the only model in the field to price in the conditions a spreadsheet usually ignores—heat, thin mountain air, and continent-sized travel.

It flagged that roughly five matches fall in heat severe enough that players' performances may be affected, and that visiting teams climbing to 2,200 meters at the Azteca tend to wilt in the final 20 minutes. It treated all of it as a quiet tax on the fitter, deeper European sides.

Then it did the coldest thing on the board and gutted Brazil. With Rodrygo's knee gone, Estêvão hurt and a 34-year-old Neymar dragged back for one last dance, Opus cut the five-time champions’ odds to 8%—half what the Argentina-leaning models gave them.

Its sharpest call was the quarterfinal it billed as "the real final, a round early": Spain over Argentina, a 39-year-old Messi pressed into the turf. For the Golden Boot it took Mbappé and barely blinked.

GPT 5.5 — The Careful Scout

Pick Spain 15–18% / Five weighted buckets, no simulation · final: Spain 2-1 France

OpenAI's GPT 5.5 didn't trust a single big number, so it built a scorecard instead. Every team got graded across five weighted columns—squad quality counted most at 35%, then tactical control, finishing, availability and the kindness of the draw. It kept the weights deliberately blunt to avoid kidding itself that football is more predictable than it is.
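A scorecard like that is simple enough to sketch in Python. Only the 35% weight on squad quality comes from GPT 5.5's report; the other four weights, the 0-to-10 grading scale and the team grades below are assumptions added for illustration.

```python
# Five weighted buckets. Only squad_quality's 0.35 is reported in the article;
# the remaining weights are assumed so that the total comes to 1.0.
WEIGHTS = {
    "squad_quality": 0.35,
    "tactical_control": 0.20,  # assumed
    "finishing": 0.20,         # assumed
    "availability": 0.15,      # assumed
    "draw": 0.10,              # assumed
}


def scorecard(grades):
    """Return the weighted total for one team's grades (each on a 0-10 scale)."""
    missing = WEIGHTS.keys() - grades.keys()
    if missing:
        raise ValueError(f"missing grades: {sorted(missing)}")
    return sum(WEIGHTS[bucket] * grades[bucket] for bucket in WEIGHTS)


def rank_teams(all_grades):
    """Sort team names from highest weighted score to lowest."""
    return sorted(all_grades, key=lambda team: scorecard(all_grades[team]), reverse=True)


# Hypothetical grades for two placeholder teams.
example_grades = {
    "Team A": {"squad_quality": 9, "tactical_control": 8, "finishing": 7,
               "availability": 6, "draw": 7},
    "Team B": {"squad_quality": 8, "tactical_control": 7, "finishing": 8,
               "availability": 8, "draw": 6},
}
ranking = rank_teams(example_grades)
```

Blunt weights like these cannot separate two close teams with any confidence, which fits the model's insistence on ranges over point estimates.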

Spain came out on top, but only at 15–18% odds of winning, and it would not pretend to be more precise than that. "Ranges rather than fake precision," it wrote, projecting Spain to beat France 2-1 in a final it expected to be decided by a single goal or extra time.

What made it the scout was the legwork. GPT 5.5 cross-checked itself against Opta's 25,000-run supercomputer—which landed in nearly the same spot, Spain first at 16.1%—then went reading the Spanish sports press for things a model can't see.

It surfaced a training-ground scare in the Spain camp, a stray Gavi challenge that left Rodri on the floor, and the careful reintegration of Yamal and Nico Williams after muscle trouble. None of it moved the pick, but it lowered the confidence—exactly what a good scout does.

Its semifinal four were Spain, France, Brazil, and Argentina, and it was blunt about England: loaded, genuinely dangerous, and most likely stopped by France before the last four.

DeepSeek v4 Pro — The Maximalist

Pick Argentina 18% / Qualitative tiers · final: Argentina vs France

DeepSeek v4 Pro answered a simple question with a 5,000-word epic. It didn't just name winners; it built the entire Round of 32, annotated all 48 squads, and weighed travel down to the 4,500 kilometers between Vancouver and Miami. If the others wrote previews, DeepSeek wrote the operating manual.

All that detail led somewhere contrarian: Argentina, at a tournament-best 18%, edging out France for the trophy in a Messi-versus-Mbappé final in Miami. That venue is a hallucination: the final will take place at MetLife Stadium in New Jersey.

The case was old-fashioned—the champions have the calmest spine, the softest group, and a coach who has won tournaments knowing exactly how to ration a 39-year-old Messi.

Then it bet the entire forecast on one calf muscle. DeepSeek decided the title hinged on France's goalkeeper Mike Maignan and his March injury: "If Maignan plays, France are co-favorites; if not, the gap widens," it argued.

The wrinkle is that DeepSeek was reading an old map. It still had Gareth Southgate in the England dugout and Dorival Júnior managing Brazil, though Southgate left in 2024 and Dorival in 2025, and it leaned on outdated rankings throughout.

It was the most thorough analyst in the building, working from a slightly out-of-date dossier. Impressive and faintly haunted, like a detective who cracks the case using last year's phone book.

Stepfun 3.7 — The True Believer

Pick Spain 33% / Pure-Elo Monte Carlo, 50,000 sims · final: Spain vs Argentina

No model believed harder. Stepfun 3.7 ran 50,000 simulated tournaments and crowned Spain at a wild 33%—nearly double the conviction of anyone else, with Argentina a distant second at 15%.

But the best thing Stepfun did was fail in public. Its first attempt was a fancier model that tried to invent expected-goal numbers for national teams, and it produced nonsense—Mexico, South Africa, and South Korea came out as top-three favorites to win the World Cup.

Rather than bury that, Stepfun explained the whole misadventure, worked out that the made-up stats had flattened the real gulf between good teams and great ones, then scrapped it and rebuilt on raw Elo alone. The new version was simpler, blunter, and far more sensible.

The trade-off is that pure Elo is blind to anything human. Stepfun's Spain doesn't know Lamine Yamal has a hamstring injury, doesn't assess heat or travel, and treats a penalty shootout as a coin weighted by rating. It's a beautifully honest machine that has never once watched a game of football.
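The blindness is easy to see in the math. Below is a minimal Python sketch of the standard Elo expected-score formula, plus a shootout treated as a rating-weighted coin. The function names and the example ratings are hypothetical, and this is not Stepfun's actual code.

```python
import random


def elo_expected_score(rating_a, rating_b):
    """Standard Elo expected score for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rating_weighted_shootout(team_a, team_b, ratings, rng):
    """Settle a shootout with a coin biased by the Elo expected score.

    Nothing but the two ratings enters the calculation: no injuries,
    no heat, no travel.
    """
    chance_a = elo_expected_score(ratings[team_a], ratings[team_b])
    return team_a if rng.random() < chance_a else team_b


# Made-up ratings for two placeholder teams.
example_ratings = {"Team A": 2000, "Team B": 1600}
winner = rating_weighted_shootout("Team A", "Team B", example_ratings, random.Random(0))
```

A 400-point gap works out to an expected score of 10/11, or roughly 91%, for the stronger side, whoever happens to be missing from the lineup.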

Its bracket marched to the obvious places—Spain past Argentina in one semi, the hosts and Brazil gone earlier—and planted its flag: Spain, comfortably, a third of the time. The most confident pick on the board, and the most upfront about why you shouldn't completely trust it.

One quirk: this agent was a polyglot, switching between English, Spanish and Portuguese throughout the session, sometimes within a single reply, and the habit proved hard to steer away from with this model. That is what happens when an agent learns that you speak whichever language is easiest at any given moment.

Nemotron 3 Ultra — The Double-Checker

Pick Spain 18–22% / Bivariate Poisson + a subjective twin · final: Spain vs Argentina

Nvidia's Nemotron 3 Ultra didn't trust itself, so it ran the tournament twice. The first pass was a cold simulation, a bivariate-Poisson model grinding through 5,000 brackets. The second threw the math out and scored teams by hand—squad, tactics, form, the manager, even "mystique"—to see whether a human-style read would disagree.

It didn't. Both versions crowned Spain, at 18% and 22% odds, about as close to a second opinion as one model can give you.

Nemotron also did the most homework on the actual football. It arrived with formations, pressing intensity and expected-goal rates for team after team, in two languages, reading less like a forecast than a coach's dossier.

That depth produced the spiciest take of the experiment. Nemotron had Türkiye—not the host United States—winning the wide-open Group D, with the Americans finishing dead last while everyone else waved them through; it also rated Ecuador's miserly defense a notch above Germany.

When the dust cleared it lined up the heavyweight semis half the planet expects, Spain–France and Argentina–Brazil, and sent Spain through to lift it. A model that argued with itself, did extra reading, and still landed on the favorite is trying to tell you something.

MiniMax 2.7 — The Self-Auditor

Pick Argentina 18% / Qualitative, self-audited · final: Argentina vs France, no scoreline

MiniMax 2.7 picked Argentina at 18% odds, a hair ahead of France—and then spent its closing pages grading its own work. Most models hide their uncertainty; MiniMax printed a running list of corrections, openly walking back things it had gotten wrong earlier in the very same report.

The receipts are a delight. It caught itself repeating a bogus stat about South American champions, fixed Uruguay's coaching situation, corrected Kai Havertz's position to match his actual club role, and slapped an "unconfirmed" on both Haaland's fitness and Ronaldo's selection rather than wave them through.

It policed its own hype, too. MiniMax deleted a tempting Messi-versus-Ronaldo semifinal once it realized the pairing was impossible—the two are in opposite halves and can only meet in the final—and stripped out the invented scorelines other models happily printed.

Then, at the decisive moment, it simply declined to guess. Argentina against France, MiniMax wrote, is "a genuine 50/50," and it would not manufacture a winner it didn't have.

In a field of supremely confident robots, the restraint landed. MiniMax was the one that kept saying, in writing, here is exactly what I don't know—which is somehow more trustworthy than a tidy prediction.

Qwen 3.5 — The Contrarian With Receipts

Pick Argentina 22% / Research-only, no sims · final: Argentina 2-1 Spain

Qwen 3.5—a 397-billion-parameter model—was the most evidence-obsessed of the lot and, somehow, the biggest rebel. It refused to run simulations at all, sorting every statement into "verified facts," "estimates" and "forecasts," and stamping its overall confidence, in its own capital letters, as LOW.

Then it went rogue. Qwen had Argentina beating Spain 2-1, with Spain stranded down in fifth at just 10%—the only model that didn't even put La Roja on the podium.

The reason was the ruler it grabbed. The Spain camp used the live football Elo that ranks Spain first in the world; Qwen reached for a club-based rating that slotted Argentina, Brazil, France, and England all ahead of it. Change the ruler, and a different favorite comes out.

Its case for Argentina was all texture—champions' muscle memory, Messi chasing a perfect ending, and one stat it leaned on hard: at the last World Cup, teams that saw less of the ball won 38% of knockout games. Organized and ruthless beats pretty and possession-heavy, it argued.

There was a price for all that diligence. The most fact-proud model also fumbled the basics, sliding Scotland into the wrong group and double-booking tiny Curaçao into two of them.

Where they actually agree

Step back and the seven AI models fight less about their predictions than it looks. Every single model put Spain, Argentina, and France in its top tier, named almost identical group winners—Brazil, England, Portugal, Germany, Belgium—and flagged the same wildcards: Haaland's fitness, Messi's age at 39, and a Group D nobody could call.

The fault line was the data, not the football. The four that trusted live football Elo, where Spain sits clearly first, picked Spain. The three that leaned on FIFA's ranking, a different Elo source, or raw 2022 pedigree, drifted to Argentina. Feed a model a different number one, and it hands you a different champion.

What the humans with money on the line think

The crowd sides with the plurality. On Myriad, the prediction market run by Decrypt's parent company Dastan, Spain is the outright favorite at 19%, with France right behind at 17%, as of Sunday.

After that, the humans get stingier with Argentina than the bots do. Bettors price the defending champions at just 10% odds of winning—level with Brazil, behind England and Portugal at 12%, and less than half the 22% Qwen handed them.

For what it's worth, predictors on Myriad are similarly undecided on the Group D winner, with the odds split evenly between Türkiye and the United States at 45% each.

You can view the live odds on Myriad for every single match of the World Cup here.

So who wins?

None of this is a crystal ball, and all seven AI models said so out loud. The best single-match football models are right barely more than half the time, which is why even Stepfun's bullish 33% still means Spain falls short two times out of three.
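The arithmetic behind that humility fits in a few lines of Python. This is a back-of-the-envelope sketch under loud assumptions: every knockout match is treated as independent, the group stage is ignored, and the 80% per-match figure is a hypothetical input, not a number from any of the models.

```python
KNOCKOUT_ROUNDS = 5  # round of 32, round of 16, quarterfinal, semifinal, final


def title_chance(per_match_win_probability, rounds=KNOCKOUT_ROUNDS):
    """Chance of winning every knockout round, assuming independent matches."""
    return per_match_win_probability ** rounds


def shortfall(title_probability):
    """Chance that the pick does not win the tournament."""
    return 1.0 - title_probability


dominant_side = title_chance(0.80)  # 0.8 ** 5 = 0.32768
stepfun_miss = shortfall(0.33)      # 1 - 0.33 = 0.67
```

In other words, a team would need to be roughly an 80% favorite in all five knockout rounds just to reach a one-in-three shot at the trophy.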

The format only widens the odds: 48 teams, 104 matches, three countries, real heat and real altitude. Italy, four-time champions, didn't even qualify.

Besides the usual hallucinations that creep in when models get creative with their analyses, there may also be some confirmation bias at work. Remember, a human set these agents up. The prompt, the interaction, the configuration, and the ideas for research and sources were all influenced by the agents' architect. If all of those elements point to Spain, the agents may well converge on a similar conclusion. That said, turning a model loose with nothing but the question “Who will win the World Cup?” is not going to do a better job.

So take the seven robots the way I take my own bracket—a great way to start a fight at the bar, not a reason to remortgage the house and bet it all.

Four machines say Spain. Three say Argentina. The beautiful game, which has never once relied on an AI-written report, will do exactly as it pleases.
