March Madness is upon us, with the western sports world obsessed with analyzing and picking how the NCAA men's basketball tournament will unfold. Picking every game correctly is a task that the NCAA says has odds of one in 9,223,372,036,854,775,808. Of course, the 1-in-9.2 quintillion odds assume a 50-50 probability for each matchup, no better than a coin flip.
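
That number comes from simple counting. A 64-team single-elimination bracket has 63 games, and a coin-flip picker has two options in each one. Here is a minimal Python sketch of the arithmetic. The 63-game count is an assumption (the article does not state it, and it leaves out play-in games), and `GAMES` and `perfect_bracket_odds` are illustrative names.

```python
from fractions import Fraction

# Assumption: a 64-team single-elimination bracket has 63 games
# (32 + 16 + 8 + 4 + 2 + 1); play-in games are not counted.
GAMES = 63


def perfect_bracket_odds(games: int = GAMES,
                         p_correct: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability of picking every game right when each pick is
    independent and correct with probability p_correct."""
    return p_correct ** games


odds = perfect_bracket_odds()
print(f"1 in {odds.denominator:,}")  # 1 in 9,223,372,036,854,775,808
```

Passing a higher `p_correct`, say `Fraction(2, 3)` for a picker who gets two of every three games right, shows how quickly informed picks shrink the odds. They still come out on the order of one in a hundred billion.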

Each team and game brings countless variables, and that's what makes March Madness predictions so engrossing for fans (and oddsmakers). It's a lot of information for humans to process, but how about artificial intelligence?

Beating the odds

After the first eight games, fewer than 2.5% of brackets remain perfect, the NCAA says. The biggest "bracket buster" on day one was third-seeded Kentucky losing to No. 14 seed Oakland, 80-76, on Thursday.

In fact, if every single person who has ever lived on Earth since the dawn of time filled out a different bracket, there would still be a 99.99999883% chance that nobody would pick every winner. If you had drafted a unique bracket every second that has elapsed since the Big Bang, roughly 13.8 billion years ago, there would still be a better than 95% chance that none of them was perfect.

By comparison, your chances of winning a lottery look great.
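
Comparisons like these rest on one piece of arithmetic. If N distinct brackets are filled out and each of the 2^63 possible outcomes is equally likely (the coin-flip assumption), the chance that one of them is perfect is N divided by 2^63. The Python sketch below applies that formula. The population and age-of-the-universe inputs are rough outside estimates supplied for illustration, not figures quoted by the NCAA, and `chance_nobody_perfect` is a hypothetical helper name.

```python
from fractions import Fraction

TOTAL_BRACKETS = 2 ** 63  # every win/loss combination across 63 games


def chance_nobody_perfect(unique_brackets: int) -> Fraction:
    """Chance that none of `unique_brackets` distinct entries is perfect,
    assuming every possible bracket outcome is equally likely."""
    covered = min(unique_brackets, TOTAL_BRACKETS)
    return 1 - Fraction(covered, TOTAL_BRACKETS)


# Assumptions (illustrative estimates, not taken from the article):
HUMANS_EVER_LIVED = 108_000_000_000                    # about 108 billion people
SECONDS_SINCE_BIG_BANG = 13_800_000_000 * 31_557_600   # 13.8 billion Julian years

scenarios = [
    ("everyone who ever lived", HUMANS_EVER_LIVED),
    ("one bracket per second since the Big Bang", SECONDS_SINCE_BIG_BANG),
]
for label, n in scenarios:
    miss = float(chance_nobody_perfect(n))
    print(f"{label}: {miss:.8%} chance of no perfect bracket")
```

Using `Fraction` keeps the subtraction exact. The result is converted to a float only for display, which matters when the answer sits this close to 100%.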

Enter AI

Nobody has picked a perfect bracket so far. No human, that is. Machine learning has long been enlisted to aid in the endeavor, and AI has recently advanced at an explosive rate.

Google, a company not typically keen on promoting betting or gambling, has been researching and sharing findings related to March Madness predictions for years. The tech giant has an entire section dedicated to teaching how to use Google Cloud to analyze NCAA basketball data. This allows folks to dig into a range of variables, compare game performance metrics, download comprehensive player and team stats, and more.

Google has even been an official partner of the NCAA men's basketball tournament, sharing data analysis tools for America's biggest annual sports spectacle as a use case for AI in big data analysis in 2020.

Picking an AI to pick winners

There are now manifold ways to harness artificial intelligence for bracketology: you could process reams of data according to your own methodology, trust another's work, or even train your own model from scratch.

We opted for the lazier path and consulted ChatGPT. Among the many basketball-savvy GPTs available, we picked "March Madness Bracket Buster Bot" because it was the only one to show a real statistical analysis behind its predictions.

Spoiler alert: the bracket was busted after day one.

East:

  • Florida Atlantic vs. Northwestern: This could be tight, but I'm leaning toward Northwestern with their balanced play. Correct!
  • Illinois vs. Morehead St.: Illinois' seed suggests they're the team to beat in this one. Correct!
  • Washington St. vs. Drake: Washington St. seems poised to pull through with their strong seeding. Correct!
  • Iowa St. vs. South Dakota St.: Iowa St.'s seeding and stat line say they'll dance on. Correct!
  • BYU vs. Duquesne: BYU might just have the edge with their dynamic offense. Wrong.
  • UConn vs. Stetson: The power's definitely surging through UConn here, and I'd expect them to advance.
  • San Diego St. vs. UAB: San Diego St. has the defensive chops to shut down UAB.
  • Auburn vs. Yale: Auburn should dominate this academic showdown on the court.

West:

  • North Carolina vs. Wagner: North Carolina's history and stats say they'll advance. Correct!
  • Dayton vs. Nevada: Dayton's consistent play all season makes them the pick here. Correct!
  • Arizona vs. Long Beach St.: Arizona's ranking and stats make them the clear favorite. Correct!
  • Mississippi St. vs. Michigan St.: Mississippi St. has a good chance, but it’s going to be a nail-biter. Wrong.

South:

  • Texas Tech vs. NC State: Texas Tech's higher seed points to victory. Wrong.
  • Houston vs. Montana: Houston is the team to beat with their impressive seed.
  • Wisconsin vs. James Madison: Wisconsin is the favorite here, thanks to their powerful lineup.
  • Duke vs. Vermont: Duke is likely to show why they're a perennial powerhouse.

Midwest:

  • Gonzaga vs. McNeese St.: Gonzaga's pedigree in March Madness should see them through. Correct!
  • Kansas vs. Samford: Kansas should have no trouble advancing past Samford. Correct!
  • Creighton vs. Akron: Creighton's higher seeding should be the difference-maker. Correct!
  • Texas vs. Colorado St.: Texas' stats suggest they'll round up a win here. Correct!
  • Tennessee vs. Saint Peter's: Tennessee's impressive seed and stats make them the pick. Correct!
  • South Carolina vs. Oregon: This could be a toss-up, but South Carolina might just sneak by. Wrong.

Second, and third, opinions

Other AI models provided different results, though many of them agreed on certain outcomes, like BYU beating Duquesne, which didn't pan out yesterday. The unpredictability of humans prevails again. Duquesne certainly beat the odds, scoring its first tournament win in half a century even though many of ESPN’s stats favored BYU by a large margin.


We also checked Kaggle.com, a platform for data science competitions and an online community of data scientists and machine learning practitioners, to see if there was something out there for "bracketologists," and indeed there was. A machine learning model published by Nishaan Amin analyzes 153 MB of data files and comes with dozens of charts and thousands of statistics. The repository has been viewed more than 4,000 times to date.

This lets you adapt the model and analyze what you consider important, fitting your style. The owner also shared their own predictions. The model says UConn will win the tournament after beating Purdue, but it hints that Connecticut has a chance to beat Purdue before the finals.

The model also provides an upset bracket simulation with different parameters and results for those who like alternatives.

Of course, predicting human behavior—whether for bracketology, sabermetrics, or technical analysis—is an inexact science. Models analyze past performance and make predictions in "ceteris paribus" conditions: the most likely outcome if everything happens under the exact same circumstances as before.

A machine learning model can analyze huge amounts of data and find patterns hidden in plain sight, capturing the relationships among hundreds or even thousands of variables. But no model or methodology can foresee extraordinary circumstances like an Elon Musk tweet tanking the whole crypto market, someone having an existential crisis and market-selling a huge crypto position, a team riding high off their fans' energy, a star player's untimely injury, or a hungover squad fresh off a massive rager.

So AI won’t solve the problems that only time machines can. In the end, that's what makes sports so compelling: the underdog stories, the stunning upsets, the unknown variables that defy the odds. We already failed on our first day after picking BYU. But sometimes David beats Goliath, slinging a stone to take down an armored giant. That probably happened in March.

Edited by Ryan Ozawa.
