Google DeepMind releases DiffusionGemma, a model that runs local AI 4x faster

Google has released a new AI model called DiffusionGemma that generates text four times faster than comparable models, making it practical to run on regular gaming computers instead of expensive server hardware. Most AI assistants write one token (roughly a word) at a time. This model instead generates entire blocks of text at once, similar to how image generators work, which is why it is so much quicker. The catch is that it's still quite large, though Google designed it to fit within the memory limits of high-end consumer GPUs.
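To get a feel for why producing a block at once can be quicker, here is a rough back-of-the-envelope sketch in Python. It only counts how many times a model would have to run, and every number in it (the block size, the number of refinement steps, the output length) is a made-up assumption for illustration, not a published DiffusionGemma setting. The function names are hypothetical too.

```python
import math


def autoregressive_passes(num_tokens: int) -> int:
    """A one-token-at-a-time model runs once per generated token."""
    return num_tokens


def block_parallel_passes(num_tokens: int, block_size: int, refine_steps: int) -> int:
    """A block-at-a-time model runs `refine_steps` times per block.

    Both `block_size` and `refine_steps` are illustrative assumptions,
    not values taken from DiffusionGemma.
    """
    num_blocks = math.ceil(num_tokens / block_size)
    return num_blocks * refine_steps


if __name__ == "__main__":
    tokens = 256  # hypothetical length of a reply
    one_at_a_time = autoregressive_passes(tokens)
    block_at_a_time = block_parallel_passes(tokens, block_size=32, refine_steps=10)
    print(f"token-by-token model runs: {one_at_a_time}")
    print(f"block-by-block model runs: {block_at_a_time}")
    print(f"ratio: {one_at_a_time / block_at_a_time:.1f}x fewer runs")
```

Fewer model runs does not translate directly into wall-clock speed, since each block-level run does more work than a single-token run, so treat the ratio as intuition rather than a benchmark.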
Another day, another AI model from Google. This time, Google DeepMind has released a new member of the Gemma 4 open model family, but it's fundamentally different from the rest of the lineup. DiffusionGemma doesn't generate outputs linearly like most AI models. Instead, it can produce an entire bloc



