
    Griffin (Google)

    Updated: 2/11/2026

    Google's hybrid architecture combining gated linear recurrence (a gated RNN) with local attention, productionized in RecurrentGemma.

    Quick Summary

    Griffin combines gated linear recurrence with local attention – Google's hybrid architecture, productionized as RecurrentGemma.

    Explanation

    Griffin interleaves Real-Gated Linear Recurrent Units (RG-LRU), an efficient gated linear recurrence layer, with local sliding-window attention. RecurrentGemma (2B/9B) shows that this hybrid architecture can match Transformer quality with significantly less inference memory: the recurrent state has a fixed size, and the attention KV cache grows only up to the window size rather than the full sequence length.
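    The core of the architecture is the RG-LRU update. Below is a minimal NumPy sketch of its recurrence as described by De et al. (2024): two sigmoid gates are computed from the input, and the hidden state decays at an input-dependent rate. Weight names and shapes are illustrative, and the surrounding block (projections, temporal convolution) is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rg_lru(x, W_a, b_a, W_x, b_x, lam, c=8.0):
    """Sketch of the RG-LRU recurrence over a sequence x of shape (T, d).

    W_a, W_x: (d, d) gate weights; b_a, b_x: (d,) biases;
    lam: (d,) learnable parameter controlling the base decay.
    """
    T, d = x.shape
    a = sigmoid(lam)                    # base decay per channel, in (0, 1)
    h = np.zeros(d)                     # fixed-size recurrent state
    out = np.empty_like(x)
    for t in range(T):
        r = sigmoid(x[t] @ W_a + b_a)   # recurrence gate
        i = sigmoid(x[t] @ W_x + b_x)   # input gate
        a_t = a ** (c * r)              # input-dependent decay rate
        h = a_t * h + np.sqrt(1.0 - a_t**2) * (i * x[t])
        out[t] = h
    return out
```

    Because the state h never grows with sequence length, the per-token cost of generation stays constant, which is where the inference-memory advantage over global attention comes from.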

    Marketing Relevance

    Griffin/RecurrentGemma is the first Transformer alternative Google has shipped in production form, and a signal for the future of hybrid architectures.

    Common Pitfalls

    Validated only at small scale (2B/9B) so far. Community adoption is limited, and Google does not use the architecture for Gemini internally.

    Origin & History

    De et al. (Google DeepMind, 2024) introduced Griffin together with Hawk, a pure-recurrence baseline. RecurrentGemma (2024) made Griffin available as an open-weights model and showed competitive results against Gemma at significantly lower inference cost.

    Comparisons & Differences

    Griffin (Google) vs. Jamba

    Jamba uses Mamba SSM + Attention; Griffin uses gated linear recurrence + local attention – different recurrence mechanisms.

    Griffin (Google) vs. Gemma

    Gemma is a pure Transformer; Griffin/RecurrentGemma largely replaces global attention with recurrence plus local attention for better inference efficiency.
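    To make the efficiency difference concrete, here is a hypothetical NumPy sketch of a sliding-window (local) attention mask. A global causal mask lets token t attend to all earlier tokens, so the KV cache grows with sequence length; the window caps it at a constant size (RecurrentGemma uses a 2048-token window). The function name and window value below are illustrative.

```python
import numpy as np

def local_attention_mask(seq_len, window):
    """Boolean (seq_len, seq_len) mask: token t attends to t-window+1 .. t."""
    q = np.arange(seq_len)[:, None]     # query positions
    k = np.arange(seq_len)[None, :]     # key positions
    return (k <= q) & (k > q - window)  # causal AND within the window

# With window=3, token 5 attends only to tokens 3, 4, 5 -- the KV cache
# never needs to hold more than 3 entries, regardless of sequence length.
mask = local_attention_mask(seq_len=8, window=3)
```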
