Google Found a Way to Make Local AI Up to 3x Faster—No New Hardware Required

7 May, 14:13

Source: Decrypt

AI Summary

Google has announced Multi-Token Prediction drafters that let Gemma 4 run up to 3x faster on local hardware, with no loss in output quality and no new hardware required.

Google's new Multi-Token Prediction drafters can make Gemma 4 run up to 3x faster on your own hardware—no cloud required, and no quality lost.
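
The article does not spell out the mechanism, so the following is only a toy sketch of how a drafter can speed up generation without changing the output. It assumes the drafters are used in the style of speculative decoding: a cheap model proposes a few tokens ahead, and the main model checks them and keeps only what it would have produced itself. The functions `target_next` and `draft_next` are hypothetical stand-ins made up for illustration; they are not Gemma 4 and not Google's API.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]


def target_next(context: List[Token]) -> Token:
    """Hypothetical stand-in for the large model's greedy next-token choice."""
    return (context[-1] * 3 + len(context)) % 7


def draft_next(context: List[Token]) -> Token:
    """Hypothetical cheap drafter: matches the target except at every 4th position."""
    guess = (context[-1] * 3 + len(context)) % 7
    return guess if len(context) % 4 else (guess + 1) % 7


def greedy_decode(next_token: NextToken, prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(next_token(out))
    return out


def speculative_decode(
    target: NextToken, drafter: NextToken, prompt: List[Token], n_new: int, k: int = 3
) -> Tuple[List[Token], int]:
    """Draft up to k tokens, then verify them against the target model.

    Returns the generated sequence and the number of verification rounds.
    In a real system each round would be a single batched forward pass of
    the large model; this toy calls `target` once per token for clarity.
    """
    out = list(prompt)
    goal = len(prompt) + n_new
    rounds = 0
    while len(out) < goal:
        # 1. The drafter proposes a short run of tokens.
        draft: List[Token] = []
        ctx = list(out)
        for _ in range(min(k, goal - len(out))):
            tok = drafter(ctx)
            draft.append(tok)
            ctx.append(tok)
        # 2. The target verifies the run and always keeps its own token.
        rounds += 1
        for tok in draft:
            expected = target(out)
            out.append(expected)
            if expected != tok:
                break  # first mismatch: discard the rest of the draft
    return out, rounds


if __name__ == "__main__":
    prompt = [1, 2]
    baseline = greedy_decode(target_next, prompt, 12)
    fast, rounds = speculative_decode(target_next, draft_next, prompt, 12)
    print("identical output:", fast == baseline)
    print("verification rounds:", rounds, "vs baseline steps:", 12)
```

Because every kept token is the one the target model would have chosen anyway, the output matches plain decoding; any speedup comes from needing fewer rounds of the expensive model when the drafter guesses well. The actual gain on real hardware depends on how often drafts are accepted, which this toy does not model.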