Google Labs has announced a major upgrade to its Gemini Pro AI tool, the midsize AI model that powers the free version of its chatbot, introducing the ability to process up to 1 million tokens in a limited preview. That unprecedented context window leaves today's leading tools, which top out at around 128K tokens, in the dust.
The upgrade to Gemini Pro v1.5 gives it a context window roughly eight times larger than that of OpenAI's paid GPT-4 Turbo model, setting a new benchmark among large language models (LLMs).
The figure is “the longest context window of any large-scale foundation model,” according to Google.
“Before today, the largest context window in the world for a publicly available large language model was 200,000 tokens. We’ve been able to significantly increase this—running up to 1 million tokens consistently,” the Google Labs team shared.
With this feature, Gemini Pro would be more capable than the most powerful model in the current Gemini lineup, and any other LLM currently available. However, the 1 million-token window is live only in a limited preview for testing; Gemini Pro’s upcoming stable version will handle up to 128K tokens.
While that release will be a major upgrade over the 32,000 tokens that Gemini 1.0 can process, users will have to wait to see what 1 million tokens can do.
The move is Google’s latest offensive in the race to dominate the AI industry. Last week, Gemini Advanced became the first credible competitor to ChatGPT Plus. Unlike Anthropic's Claude, Google’s chatbot is multimodal, posts strong results across a range of benchmarks, and offers features that OpenAI doesn't.
Gemini Advanced will, however, still be catching up to OpenAI's GPT-4 Turbo, which already handles 128,000 tokens.
The versatility of Gemini 1.5 was vividly showcased through several demonstrations. Google said it “can process vast amounts of information in one go—including 1 hour of video, 11 hours of audio, codebases with over 30,000 lines of code or over 700,000 words.”
“In our research, we’ve also successfully tested up to 10 million tokens,” the team added.
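Those capacity figures line up with a common rule of thumb for English text. A back-of-envelope sketch, assuming roughly 0.75 words per token (a widely used heuristic, not an official Gemini figure):

```python
# Rough check of how ~700,000 words fits inside a 1 million-token window.
# The 0.75 words-per-token ratio is a heuristic for English text and
# varies by tokenizer and language.

WORDS_PER_TOKEN = 0.75  # assumed average; illustrative only

def estimated_tokens(word_count: int) -> int:
    """Rough token estimate for a given English word count."""
    return round(word_count / WORDS_PER_TOKEN)

print(estimated_tokens(700_000))  # ~933,000 tokens, within the 1M window
```

By this estimate, 700,000 words consumes most, but not all, of the 1 million-token budget, which matches the figures Google cited.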
One shortcoming: Gemini models cannot analyze PDF files, a flaw that Decrypt pointed out in its comparison between Gemini and ChatGPT.
‘Mixture of Experts’ is here to stay
Another difference between Gemini 1.5 and its previous versions is the use of Mixture of Experts, the same technology that Mistral AI used to build its more lightweight model. Mistral's entrant was powerful enough to beat GPT-3.5 and leapfrog into the upper echelons of the best open-source LLMs.
"(Mixture of Experts) routes your request to a group of smaller ‘expert’ neural networks so responses are faster and higher quality," Google shared in its announcement, saying it this ensures that responses are not just faster but also higher quality.
Just like Mistral, Google was able to make its model shine. Gemini 1.5 Pro showed superior performance in several benchmarks compared to Gemini Ultra 1.0, suggesting a promising future for Google's LLMs.
“It shows dramatic improvements across a number of dimensions and 1.5 Pro achieves comparable quality to 1.0 Ultra, while using less compute,” Google CEO Sundar Pichai said in a blog post today.
The announcement did not provide a timeline for the release of Gemini Advanced 1.5. Meanwhile, OpenAI is actively developing GPT-5. Gemini's enhanced token-handling capabilities will help fortify Google's position in the AI arms race.
Edited by Ryan Ozawa.