The Gemini family tree

Last month at Google I/O, we introduced Gemini 1.5 Flash, the latest model in the growing Gemini family. We asked Hamidou Dia, vice president for applied engineering at Google Cloud, to explain a bit about all the different models that now belong to the Gemini family tree, when and where to use them, and why Gemini stands out from other AIs. (This post originally appeared as part of Google Cloud’s monthly executive insights email newsletter — which you can sign up for here.)


The Gemini family is a big one, and it just keeps growing. And like any family, each member has its own strengths and personalities. Gemini 1.5 Flash is the newest of the bunch, and one of our most capable offerings yet. What’s so special about Flash and all its relatives? What makes each of them — Gemini 1.5 Pro, Gemini 1.0 Nano, Gemini 1.0 Pro, and Gemini 1.0 Ultra, as well as their cousin Gemma, the open model — different? 

Or, what you’re really wondering: Which of them is right for your business or specific applications?

Rarely are any two AI use cases the same, and those use cases keep growing in number and maturity each day. It takes a wide range of models to satisfy these different needs, and that might even include another family of models altogether, like Anthropic’s Claude or the open source Mistral models. This diversity of needs is why Google Cloud has taken a truly open approach since day one for our model offerings and capabilities, highlighted by Vertex AI’s Model Garden and its selection of more than 150 first-party, third-party, and open models.

One of the most important considerations across the latest Gemini models, and what sets them apart from the competition, is their long context window. When we announced Gemini 1.5 Pro in February, it was the first widely available model with not only a context window of 1 million tokens, but also near-perfect recall across large amounts of input data. At I/O, Sundar Pichai, Google’s CEO, revealed that the context window would expand to 2 million tokens. He even remarked that this was part of the pathway to “infinite context.”

Want to try Gemini 1.5 Pro for yourself? Check it out now in the Google Cloud console.

A token is the smallest unit a piece of data is broken into for use in a particular model. A token can be as small as a single character, but most modern tokenizers use subword units, so in practice a token is often a word or a part of a word. The larger the context window, the more a model can process and compare without “forgetting” what has already been processed or prompted.
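As a rough illustration, here is a sketch of how the same text maps to different token counts at different granularities. This is not Gemini’s actual tokenizer (real models use learned subword vocabularies); the functions and the 4-characters-per-token heuristic below are illustrative assumptions only.

```python
# Illustrative tokenization sketch -- NOT Gemini's actual tokenizer.
# Real models use learned subword vocabularies; this only shows that
# the token count for a text depends on the tokenization granularity.

def char_tokens(text: str) -> list[str]:
    # Finest granularity: every character is its own token.
    return list(text)

def word_tokens(text: str) -> list[str]:
    # Coarser granularity: every whitespace-separated word is a token.
    return text.split()

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    # A common rule of thumb for English with subword tokenizers.
    return round(len(text) / chars_per_token)

sample = "The larger the context window, the more a model can process."
print(len(char_tokens(sample)))  # one token per character
print(len(word_tokens(sample)))  # one token per word
print(estimate_tokens(sample))   # subword-style estimate, in between
```

The subword estimate lands between the character and word counts, which is roughly where real tokenizers sit.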

If your context window only covers a few thousand tokens, maybe the model could understand a single whitepaper or a few emails. When it gets into the millions, that’s enough capacity to understand and analyze entire books or movies or, more practically for the enterprise, entire codebases, large financial datasets and research reports, or hours of footage from a manufacturing floor and a shelf’s worth of production manuals.
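To put those numbers in perspective, here is a back-of-the-envelope check of what fits in a given window, assuming a rough 4-characters-per-token heuristic (actual token counts vary by tokenizer and content, and the page and line sizes below are illustrative guesses):

```python
# Back-of-the-envelope check of what fits in a context window.
# Assumes ~4 characters per token, a common rule of thumb for
# English text; real token counts vary by tokenizer and content.

CHARS_PER_TOKEN = 4

def fits(num_chars: int, window_tokens: int) -> bool:
    # True if the text's estimated token count fits in the window.
    return num_chars / CHARS_PER_TOKEN <= window_tokens

# A 300-page book at roughly 2,000 characters per page:
book_chars = 300 * 2_000             # ~600k chars, ~150k tokens
print(fits(book_chars, 1_000_000))   # fits a 1M-token window

# A 50,000-line codebase at roughly 60 characters per line:
code_chars = 50_000 * 60             # ~3M chars, ~750k tokens
print(fits(code_chars, 1_000_000))   # still fits 1M tokens
print(fits(code_chars, 8_000))       # far too big for a small window
```

The same codebase that overwhelms a few-thousand-token window fits comfortably in a million-token one, which is the practical difference the long context makes.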

That’s where things really get interesting, when you start to combine some of these materials. The other important aspect of Gemini is that all the models are natively multimodal. Previous generations of models could maybe identify an image or video while also deciphering text or code, but that was basically shuttling the information between a set of sub-models. Gemini was developed from the start to handle a range of information types, just as a person normally would.

This means less latency and energy usage and better results for queries involving multiple sources and types of information. A manufacturing company, for example, could upload those manuals and potentially use them to spot dangers or inefficiencies in the factory footage by seamlessly cross-referencing the two. Or an investment firm could upload an investor call, regulatory filings, and references to social media and combine them for investment insights.

This is where the family of models becomes so important. For the most lightweight application on a mobile phone or edge device, there’s Gemini 1.0 Nano. Gemini 1.0 Pro is the mid-weight model with a context window and features optimized for common tasks and scale, while Gemini 1.0 Ultra tackles more complex and demanding tasks. Our Gemini 1.5 models step up with context windows of 1-million+ tokens and native multimodal reasoning. Gemini 1.5 Flash — which offers our best combination of long context capabilities, advanced analysis, and low latency — will now serve most enterprise applications, though the most advanced needs will still require the full power of Gemini 1.5 Pro. And for those who need an open model for greater flexibility or access, Gemma, our family of open models, is at the ready.

It’s a big family, ready to get to work.


Speaking of the capabilities of our models, underlying infrastructure, and enterprise tooling in Vertex AI Platform, we’re excited to share that Google was named a Leader in The Forrester Wave™: AI Foundation Models for Language, Q2 2024. Google received the highest scores of all vendors evaluated in the Current Offering and Strategy categories, with Forrester noting:

“Gemini is uniquely differentiated in the market especially in multimodality and context length while also ensuring interconnectivity with the broader ecosystem of complementary cloud services.” 

You can read more in our blog or download a complimentary copy of the full report.
