News from Google – Gemma 2 is now available!

Introducing Gemma 2: Advanced AI for Everyone

Expanding Access to AI

AI has the potential to solve some of humanity’s most pressing issues, but this can only happen if everyone has the tools to build with it. Earlier this year, Google introduced Gemma, a family of lightweight, state-of-the-art open models based on the same research and technology used to create the Gemini models. They’ve since expanded the Gemma family with CodeGemma, RecurrentGemma, and PaliGemma, each offering unique capabilities for various AI tasks. These models are easily accessible through integrations with partners like Hugging Face, NVIDIA, and Ollama.

Launching Gemma 2

Google is now officially releasing Gemma 2 to researchers and developers worldwide. Available in both 9 billion (9B) and 27 billion (27B) parameter sizes, Gemma 2 offers higher performance and greater efficiency than its predecessor, along with significant safety enhancements. The 27B model provides competitive alternatives to models more than twice its size, achieving performance levels that were only possible with proprietary models as recently as last December. This performance is now achievable on a single NVIDIA H100 Tensor Core GPU or TPU host, significantly reducing deployment costs.

Setting a New Standard for Efficiency and Performance

Gemma 2 is built on a redesigned architecture, engineered for exceptional performance and inference efficiency. Here’s what sets it apart:

  • Outstanding Performance: The 27B Gemma 2 delivers the best performance in its size class, even rivaling models more than twice its size. The 9B model also leads its category, outperforming models like Llama 3 8B.
  • Efficiency and Cost Savings: The 27B model runs efficiently at full precision on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU, reducing costs while maintaining high performance. This makes AI deployments more accessible and budget-friendly.
  • Fast Inference Across Hardware: Gemma 2 is optimized for speed across various hardware setups, from powerful gaming laptops and high-end desktops to cloud-based environments. You can try Gemma 2 at full precision in Google AI Studio, use the quantized version with Gemma.cpp on your CPU, or run it on your home computer with an NVIDIA RTX or GeForce RTX via Hugging Face Transformers.

Designed for Developers and Researchers

Gemma 2 is not only more powerful but also easier to integrate into your workflows:

  • Open and Accessible: Gemma 2 is available under a commercially-friendly Gemma license, allowing developers and researchers to share and commercialize their innovations.
  • Broad Framework Compatibility: Use Gemma 2 with major AI frameworks like Hugging Face Transformers, JAX, PyTorch, and TensorFlow via native Keras 3.0, vLLM, Gemma.cpp, Llama.cpp, and Ollama. Optimized with NVIDIA TensorRT-LLM, it can run on NVIDIA-accelerated infrastructure or as an NVIDIA NIM inference microservice. Fine-tuning is possible today with Keras and Hugging Face, with additional parameter-efficient fine-tuning options in development.
  • Effortless Deployment: Starting next month, Google Cloud customers can easily deploy and manage Gemma 2 on Vertex AI.

Supporting Responsible AI Development

Google is committed to providing resources for responsible AI development, including their Responsible Generative AI Toolkit. The recently open-sourced LLM Comparator helps with in-depth evaluation of language models. You can now use its companion Python library to run comparative evaluations and visualize the results. Additionally, we are working on open-sourcing our text watermarking technology, SynthID, for Gemma models.

When training Gemma 2, Google followed rigorous safety processes, filtering pre-training data and performing extensive testing and evaluation to identify and mitigate potential biases and risks. They publish their results on public benchmarks related to safety and representational harms.

Projects Built with Gemma

The first Gemma launch led to over 10 million downloads and numerous inspiring projects. For instance, Navarasa used Gemma to create a model rooted in India’s linguistic diversity.

Looking Ahead

Gemma 2 will enable even more ambitious projects, unlocking new levels of performance and potential in AI creations. We will continue to explore new architectures and develop specialized Gemma variants for a broader range of AI tasks and challenges, including an upcoming 2.6B parameter model designed to bridge the gap between lightweight accessibility and powerful performance.

Getting Started

Gemma 2 is now available in Google AI Studio, allowing you to test its full performance capabilities at 27B without hardware requirements. You can also download Gemma 2’s model weights from Kaggle and Hugging Face Models, with Vertex AI Model Garden coming soon.

To support research and development, Gemma 2 is acessable free of charge through Kaggle or a free tier for Colab notebooks. First-time Google Cloud customers may be eligible for $300 in credits. Academic researchers can apply for the Gemma 2 Academic Research Program to receive Google Cloud credits to accelerate their research with Gemma 2. Applications are open now through August 9.

Related Posts
Salesforce Jigsaw
Salesforce Jigsaw

Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more

Health Cloud Brings Healthcare Transformation
Health Cloud Brings Healthcare Transformation

Following swiftly after last week's successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more

Salesforce’s Quest for AI for the Masses
Roles in AI

The software engine, Optimus Prime (not to be confused with the Autobot leader), originated in a basement beneath a West Read more

How To Build a Connected Culture
Connected Culture

Building a connected culture has become increasingly important because mobile devices and applications have changed the landscape of the IT Read more