What Does Google’s Gemma Really Mean to the AI Market?
Hot on the heels of its recent Gemini AI updates, Google has announced the release of a new family of open language models called Gemma.
Gemma is a text-to-text model built with the same research used to create the tech giant's flagship Gemini models, and it is available in 2B and 7B parameter versions.
At its core, Google’s decision to release an open model is an attempt to capitalize on the reach of the open-source community, just as Meta did in 2023 when it released the LLaMA large language model (LLM), which has since been downloaded 30 million times and featured in over 3,500 enterprise projects.
Key Takeaways
Google’s announcement post said: “Gemma models share technical and infrastructure components with Gemini, our largest and most capable AI model widely available today.
“This enables Gemma 2B and 7B to achieve best-in-class performance for their sizes compared to other open models. And Gemma models are capable of running directly on a developer laptop or desktop computer.”
The models were trained on a dataset composed of 6 trillion tokens of text, including web documents, code, and mathematics, and, according to Google, surpassed the performance of larger models like Llama 2 on text generation tasks, including question answering, summarization, and reasoning.
This release came less than a week after the release of Google's Gemini 1.5 – and less than three months since the launch of the Gemini family of LLMs.
How Does Gemma Fit into the World of LLMs?
Demis Hassabis, co-founder and CEO of Google DeepMind, said: “We have a long history of supporting responsible open source and science, which can drive rapid research progress, so we’re proud to release Gemma: a set of lightweight open models, best-in-class for their size, inspired by the same tech used for Gemini.”
That being said, it’s important to note that Google hasn’t made the model fully open source: it has released only the model weights, not the source code or training data.
At a glance, Gemma is differentiated from Gemini because it is a text-to-text model, rather than a multimodal model that can handle inputs in text, voice, and images.
It’s also more computationally lightweight, meaning it can run on a laptop, a workstation, or a cloud environment like Google Cloud via Vertex AI and Google Kubernetes Engine. This makes it better suited to on-device applications than Gemini.
Gemma vs Llama 2 and Mistral 7B
When it comes to Gemma’s place in the open-source community, there are two main competitors: Llama 2 and Mistral 7B. Each of these models has developed a reputation as one of the highest-performing open-source LLMs.
However, research released by Google shows that Gemma outperforms each model in critical areas like question answering, reasoning, math, and coding tasks.
We have included some of the test results below:
| Benchmark | Gemma 7B | Mistral 7B | Llama 2 7B | Llama 2 13B |
|---|---|---|---|---|
| MMLU (general knowledge) | 64.3 | 62.5 | 45.3 | 54.8 |
| BBH (multi-step reasoning tasks) | 55.1 | 56.1 | 32.6 | 39.4 |
| HellaSwag (commonsense reasoning) | 81.2 | 81.0 | 77.2 | 80.7 |
| GSM8K (basic arithmetic and grade school math problems) | 46.4 | 35.4 | 14.6 | 28.7 |
| MATH (challenging math problems: algebra, geometry, pre-calculus) | 24.3 | 12.7 | 2.5 | 3.9 |
| HumanEval (Python code generation) | 32.3 | 26.2 | 12.8 | 18.3 |
Gemma’s results were solid across the board and particularly impressive on coding and mathematics tasks, where the model scored significantly above both Mistral 7B and Llama 2.
While it’s not as powerful as LLMs like GPT-4 or Gemini, it doesn’t need to be – it succeeds in providing a lightweight, computationally efficient, and high-performance model that researchers can experiment with on their laptops, without needing to maintain an entire data center filled with costly servers.
Responsible AI and Potential Challenges
The decision to release an open model isn’t without its challenges. After all, not only will researchers be free to experiment with Gemma for legitimate use cases, but they’ll also have an opportunity to misuse it too.
As a result, it’s possible that the model could be used to generate misinformation and hateful or harmful content (although this isn’t a risk that’s limited to open LLMs alone).
One notable study conducted by MIT researchers outlined how a version of the Llama 2 70B model nicknamed “Spicy” could be used to gather information on how to obtain and release the 1918 influenza virus.
The study argued that “once the model code and weights are made public, it becomes near impossible to prevent actors from fine-tuning, either to remove safeguards or to enhance specific technical knowledge in a way which renders that knowledge more easily utilized by laypersons.”
Other commentators are also warning about the risks of open-source AI. Melissa Ruzzi, AppOmni’s Director of Artificial Intelligence, told Techopedia:
“Open source AI models sound like a great idea, especially as powerful as Gemma can be, as it is developed based on Gemini. But they can also empower bad actors, as evidenced by research showing nation-state cybercriminals are using AI in attacks and threat actors exploring how AI can help them improve productivity.
“It’s nearly impossible to implement enough controls to keep this from happening and still have a functional model. This is the biggest problem to solve with open-source AI models.”
In any case, to help prevent misuse, Google has used a mix of CSAM filtering, sensitive data filtering, and content quality filtering to remove harmful/illegal content, personal information, or any text that could violate the organization’s content moderation policies.
Whether these safeguards are sufficient to prevent misuse remains to be seen.
The Bottom Line
The release of Gemma is deepening Google’s AI product ecosystem, but the true winner here is the open-source community. Researchers now have their pick of Gemma, Llama 2, and Mistral 7B to experiment with to develop new solutions.
As this open-source ecosystem matures, we’re likely to see increasingly powerful LLMs developed, closing the gap between open and closed-source models.
As ever, whether these are put to good or bad uses is a decision that, at least for now, is in the hands of the operator.