smart_toyAI/PROMPT ENGINEERING

DiffusionGemma

sourceSimon Willison

calendar_todayJune 10, 2026

schedule1 min read

lightbulb

EXECUTIVE SUMMARY

Google's Gemma Model Revolutionizes AI with New Open Weights

Summary

Google has reintroduced its experimental Gemini Diffusion model as the open-weight Gemma model, now hosted by NVIDIA. This model showcases impressive performance metrics, generating tokens at a rapid pace.

Key Points

Google released an experimental Gemini Diffusion model in May 2023.
The new open-weight model is named `google/diffusiongemma-26B-A4B-it` and is licensed under Apache 2.
NVIDIA is hosting the Gemma model for free on their NIM cloud API.
The model achieved a performance of 857 tokens/second during initial testing.
A recent test using the API generated 2,409 tokens in 4.4 seconds, equating to over 500 tokens/second.
The model is part of the generative AI and large language model (LLM) landscape.

Analysis

The re-release of the Gemma model as an open-weight version signifies a step forward in the accessibility of advanced AI tools. By hosting it on NVIDIA's NIM cloud API, Google and NVIDIA are enabling developers and researchers to leverage cutting-edge generative AI capabilities without the need for extensive infrastructure.

Conclusion

IT professionals should explore the capabilities of the Gemma model for applications in generative AI. Utilizing the NIM cloud API can enhance their projects by integrating advanced language model functionalities.