
Tech
Google Implements Multi-Token Prediction to Triple Gemma 4 Inference Speed
The company introduced 'drafter' models that use speculative decoding to accelerate the Gemma 4 family of open AI models by up to 3x.
Topic
1 story on machine-learning, across Tech.

The company introduced 'drafter' models that use speculative decoding to accelerate the Gemma 4 family of open AI models by up to 3x.