
Gemma 2 is now available

Today, let's take a look at what improvements Gemma 2 brings.

Overview

Gemma 2 is available in 9B and 27B parameter sizes, achieving performance that, as recently as last December, could only be reached by proprietary models (feels like they're talking about GPT-4). It can run on a single NVIDIA H100 Tensor Core GPU or TPU host, significantly reducing deployment costs.

Features

  • Outsized performance: The 27B parameter version of Gemma 2 delivers top-tier performance, even surpassing models more than twice its size, and the 9B parameter version outperforms Llama 3 8B and other open models in its size class.

  • Efficiency and cost savings: The 27B parameter version of Gemma 2 runs efficiently on a single Google Cloud TPU host, NVIDIA A100 80GB GPU, or H100, maintaining high performance at a much lower deployment cost and making AI deployment more economical.

  • Fast inference across hardware: After optimization, Gemma 2 runs at high speed on a wide range of hardware, from gaming laptops to cloud deployments. Try it at full precision in Google AI Studio, run it locally on CPU with Gemma.cpp, or run it on a home computer with an NVIDIA RTX or GeForce RTX GPU.


Evaluation

Officially provided benchmark results: [benchmark comparison table]

LMSYS Chatbot Arena Leaderboard: [leaderboard standings]

Other products in the Gemma family

  • PaliGemma: a versatile, lightweight vision-language model (VLM) inspired by PaLI-3, built from open components such as the SigLIP vision model and the Gemma language model.
  • RecurrentGemma: an open model with a fixed-size state, well suited to fast inference on long sequences.
  • CodeGemma: an open code model based on Gemma.

Trying it out

In addition to Google AI Studio mentioned above, Gemma 2 is easily accessible through integrations with platforms such as Hugging Face, NVIDIA, and Ollama.
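Hugging Face

As a sketch of the Hugging Face route, the snippet below loads the instruction-tuned 9B checkpoint with the Transformers pipeline API. It assumes the google/gemma-2-9b-it model on the Hub (access must be granted on the model page) and a GPU with enough memory to hold it:

# A minimal sketch using Hugging Face Transformers.
# Assumes the google/gemma-2-9b-it checkpoint and sufficient GPU memory.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-2-9b-it",
    device_map="auto",  # spread layers across available devices
)
print(generator("Why is the sky blue?", max_new_tokens=128)[0]["generated_text"])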

Ollama

  • 9B parameters:
ollama run gemma2
  • 27B parameters:
ollama run gemma2:27b
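
Besides the interactive CLI, Ollama also exposes a local REST API (http://localhost:11434 by default), so Gemma 2 can be called from any program. A minimal Python sketch, assuming the default endpoint and that the gemma2 model has already been pulled:

import requests

# Ask the local Ollama server for a completion (non-streaming).
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "gemma2", "prompt": "Why is the sky blue?", "stream": False},
)
print(resp.json()["response"])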

Using Gemma 2 in orchestration tools

LangChain

# LangChain's community integration wraps the local Ollama server.
from langchain_community.llms import Ollama

llm = Ollama(model="gemma2")
print(llm.invoke("Why is the sky blue?"))
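
Since the LangChain wrapper implements the standard Runnable interface, you can also stream the answer token by token instead of waiting for the full response; a minimal sketch reusing the llm object above:

# Stream chunks as they arrive instead of blocking on the full answer.
for chunk in llm.stream("Why is the sky blue?"):
    print(chunk, end="", flush=True)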

LlamaIndex

# LlamaIndex's Ollama integration (the llama-index-llms-ollama package).
from llama_index.llms.ollama import Ollama

llm = Ollama(model="gemma2")
print(llm.complete("Why is the sky blue?"))
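
LlamaIndex's Ollama integration likewise supports streaming completions; a minimal sketch reusing the llm object above:

# Stream the completion incrementally; each chunk carries a text delta.
for chunk in llm.stream_complete("Why is the sky blue?"):
    print(chunk.delta, end="", flush=True)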