ZDNET’s key takeaways
- Gemma 4 is now fully open-source under Apache 2.0.
- Local AI enables privacy, offline use, and lower costs.
- From servers to smartphones, deployment just got much easier.
Google announced today that its DeepMind AI research division is releasing Gemma 4, its latest generation of open large language models. The models are being released under the Apache 2.0 license, making them truly open source, unlike the permissive but still restrictive custom license that governed earlier Gemma generations.
What is Gemma?
Gemma is an LLM like Gemini. But here, we’re talking about the AI processing engine, not the chatbot interface. Both Gemma and Gemini were developed using the same research and technology. The difference is that Gemini is a subscription-based closed product, whereas Gemma is an open model that can be downloaded and run locally for free.
The ability to run an AI model locally without a fee benefits a variety of applications. There are plenty of folks who want to run AI at home, without relying on the cloud, and for free.
Also: How AI has suddenly become much more useful to open-source developers
The ability to keep everything local is particularly important to enterprises that have data sovereignty or confidentiality requirements. For example, healthcare providers might have regulatory restrictions that prevent them from sharing patient data with a public cloud provider, yet they would still like to benefit from AI. By running the entire system locally, no data is sent to the cloud, but the AI capability is still available.
There are many devices, ranging from smartphones to a whole bunch of IoT and edge devices, that may have only intermittent network connectivity (or none at all). Being able to run AI operations without additional costs and without the need to phone home provides considerable benefits in terms of flexibility, security, and cost control.
Also: I used Gmail’s AI tool to do hours of work for me in 10 minutes – with 3 prompts
So, while you might run Gemini in your chat interface, you might install Gemma on a Raspberry Pi to monitor a process in a factory and make decisions in real-time without the latency of a round trip to the cloud and back.
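The factory-floor scenario above can be sketched in a few lines. This is an illustrative sketch only: `run_local_model()` is a placeholder stub standing in for a call to a locally hosted model (the trivial threshold rule inside it exists just to keep the example self-contained and runnable), and the sensor prompt format and threshold are assumptions, not anything from Google's announcement.

```python
# Illustrative on-device decision loop: every step runs locally,
# so there is no cloud round trip and no network dependency.

def run_local_model(prompt: str) -> str:
    """Stand-in for a local LLM call (placeholder, not a real model).

    A real deployment would hand the prompt to a locally hosted model;
    this stub just flags readings above a hypothetical safety threshold.
    """
    reading = float(prompt.split(":")[-1])
    return "ALERT" if reading > 90.0 else "OK"

def monitor(readings):
    """Classify each sensor reading locally, with no cloud latency."""
    decisions = []
    for value in readings:
        decision = run_local_model(f"temperature_c: {value}")
        decisions.append((value, decision))
    return decisions

if __name__ == "__main__":
    print(monitor([72.5, 95.1, 88.0]))
```

The point of the structure, not the stub, is what matters: because the model call is local, the loop's latency is bounded by on-device inference time rather than by a network round trip.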
The big licensing news
Earlier versions of Gemma were licensed under a Gemma Terms of Use statement rather than a formal open-source license. Google permitted users to download Gemma, use it locally, and make modifications, but it restricted use to approved categories and limited redistribution.
This approach allowed the model family to be called “open” but not “open source.” There were many freedoms associated with using Gemma, but Google still held the leash.
By contrast, the Apache 2.0 license grants nearly total freedom. Users and developers can use the software for any purpose, whether personal, commercial, or enterprise, without any royalty requirements. If you distribute the software, you're obligated to include a copy of the Apache 2.0 license and provide the required attribution.
Users and developers are free to modify and redistribute the code, with the right to create derivative works and distribute both the original and modified versions.
Also: Why AI is both a curse and a blessing to open-source software
There are also some interesting patent-related protections and penalties. In terms of protections, Apache 2.0-licensed users are granted a license to any patents covering contributions, so that patent lawsuits can’t target users merely for using the software. On the other hand, if you sue someone claiming the software infringes your patent, you automatically lose your license to use the software.
Google is no longer using its own terms of use for Gemma 4. Instead, it's licensing Gemma 4 under Apache 2.0, which means users and developers can use and distribute the model however they want, subject only to the license's attribution and notice requirements.
The Gemmaverse
Since the release of Gemma two years ago, in February 2024, the open model has experienced considerable adoption.
According to Clement Farabet, VP of research, and Olivier Lacombe, group product manager at Google DeepMind, “Since the launch of our first generation, developers have downloaded Gemma over 400 million times, building a vibrant Gemmaverse of more than 100,000 variants.”
Also: 7 AI coding techniques I use to ship real, reliable products – fast
But as ZDNET reported back then, “Google’s latest AI offering is an ‘open model’ but not ‘open-sourced.’ That difference matters.” That was then, and this is now.
Now, Gemma 4 is being released as pure open-source software, which means we can expect adoption rates to accelerate beyond what we've seen in the past 26 months. Not only can we expect to see Gemma 4 adopted in more projects, but it's also now legitimately possible to bundle the AI with products, services, and devices that can benefit from a powerful on-board model.
Model capabilities
Gemma 4 is actually a four-model set. Two of the models are designed for higher-end servers with powerful GPUs, such as Nvidia's H100. These models, known as 26B and 31B, have large parameter footprints. The 26B version focuses on reducing latency, activating only a subset of its total parameters for inference. The 31B model is designed to maximize raw power and quality, bringing all of its capabilities to bear on any problem it's asked to work on.
The other two models are designed for the low end. Called E2B and E4B, these models are intended for mobile and IoT devices, although they'll also work well running on your home PC. These models have two- and four-billion-parameter footprints, respectively, limiting device impact so that they can run efficiently on mobile and edge devices.
Also: I built two apps with just my voice and a mouse – are IDEs already obsolete?
According to Google’s Farabet and Lacombe, “In close collaboration with our Google Pixel team and mobile hardware leaders like Qualcomm Technologies and MediaTek, these multimodal models run completely offline with near-zero latency across edge devices like phones, Raspberry Pi, and Jetson Nano.”
The company says all models support the following capabilities:
- Advanced reasoning: Gemma 4 is capable of multi-step planning and deep logic.
- Agentic workflows: Gemma 4 can deploy autonomous agents that interact with different tools and APIs, and execute workflows reliably.
- Security: Gemma models "undergo the same rigorous infrastructure security protocols as our proprietary models," according to the announcement blog post.
- Code generation: Gemma 4 supports offline code generation. This capability could prove to be a huge boon to those stuck on very long plane flights without a network connection.
- Vision and audio: According to Google, “All models natively process video and images, support variable resolutions, and excel at visual tasks like OCR and chart understanding. Additionally, the E2B and E4B models feature native audio input for speech recognition and understanding.”
- Longer context: The E2B and E4B models support a 128K context window, allowing a surprisingly large working memory for such a small, portable model. The larger models support up to a 256K context window, allowing users to "pass repositories or long documents in a single prompt."
- Multi-language support: Google said Gemma 4 has been natively trained on over 140 languages.
There is no indication that Conversational Klingon is among the languages. However, given that Gemma 4 has been trained on a massive scrape of the public web, and that there is a dedicated community, a dictionary, and plenty of fan-generated content online, Klingon almost certainly appeared in the training data, which means the model should be able to perform some rudimentary translation at the least.
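Those context figures translate into a simple back-of-envelope check you can run before prompting. The sketch below uses the common rough heuristic of about four characters per token for English text; the heuristic, the headroom reserve, and the capacity check itself are illustrative assumptions, not Gemma-specific measurements, and only the 128K/256K window sizes come from the announcement.

```python
# Rough check of whether a document fits a model's context window,
# using the common ~4-characters-per-token heuristic for English text.

CHARS_PER_TOKEN = 4  # back-of-envelope estimate, not a real tokenizer

CONTEXT_WINDOWS = {
    "E2B/E4B": 128_000,   # edge models: 128K-token window
    "26B/31B": 256_000,   # server models: 256K-token window
}

def estimated_tokens(text: str) -> int:
    """Crude token estimate; a real tokenizer would be more accurate."""
    return len(text) // CHARS_PER_TOKEN

def fits(text: str, window: int, reserve: int = 4_096) -> bool:
    """Leave headroom ('reserve') for the model's own output tokens."""
    return estimated_tokens(text) + reserve <= window

doc = "x" * 500_000  # ~125K estimated tokens
print(fits(doc, CONTEXT_WINDOWS["E2B/E4B"]))  # False: over 128K with headroom
print(fits(doc, CONTEXT_WINDOWS["26B/31B"]))  # True: fits in 256K
```

The takeaway is the shape of the calculation, not the exact numbers: a "repository in a single prompt" only works when the estimated token count, plus room for the reply, stays under the window.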
In their blog post, Farabet and Lacombe said, “Gemma 4 outcompetes models 20x its size. For developers, this new level of intelligence-per-parameter means achieving frontier-level capabilities with significantly less hardware overhead.”
If you could deploy Gemma 4 on a local device today, what would be the first real task you would trust it to handle? Let us know in the comments below.
You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.