GRU (Gated Recurrent Unit)
GRU is a simplified RNN architecture with update and reset gates; it uses fewer parameters than an LSTM while achieving comparable performance.
GRU is the leaner alternative to LSTM: two gates instead of three, faster training, and similar performance on sequence processing tasks.
Explanation
GRU merges LSTM's forget and input gates into a single update gate, while the reset gate controls how much past context flows into the new state. It trains faster than LSTM and often delivers similarly good results.
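A minimal sketch of a single GRU step in NumPy to make the two gates concrete. The variable names and the exact convention for the update gate are illustrative assumptions (papers and libraries differ on whether z weights the candidate or the previous state; here it weights the candidate):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, W, U, b):
    """One GRU step. W, U, b are dicts keyed 'z' (update), 'r' (reset), 'h' (candidate)."""
    z = sigmoid(W['z'] @ x + U['z'] @ h_prev + b['z'])             # update gate: blend of old vs. new
    r = sigmoid(W['r'] @ x + U['r'] @ h_prev + b['r'])             # reset gate: how much past context to use
    h_cand = np.tanh(W['h'] @ x + U['h'] @ (r * h_prev) + b['h'])  # candidate hidden state
    return (1.0 - z) * h_prev + z * h_cand                         # interpolate previous state and candidate

# Tiny usage example with random weights (input dim 4, hidden dim 3)
rng = np.random.default_rng(0)
W = {k: rng.standard_normal((3, 4)) for k in 'zrh'}
U = {k: rng.standard_normal((3, 3)) for k in 'zrh'}
b = {k: np.zeros(3) for k in 'zrh'}
h = np.zeros(3)
for x_t in rng.standard_normal((5, 4)):  # step through a 5-element sequence
    h = gru_step(x_t, h, W, U, b)
print(h.shape)  # (3,)
```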
Marketing Relevance
Historically important for sequence modeling, now largely replaced by transformers. Still relevant for edge deployment and small models.
Origin & History
Cho et al. (2014) introduced the GRU as a more efficient alternative to the LSTM (1997). GRUs became particularly popular in machine translation and speech processing. From 2017 onward, transformers have replaced both architectures for most NLP tasks.
Comparisons & Differences
GRU (Gated Recurrent Unit) vs. LSTM
LSTM has 3 gates (forget, input, output) + cell state; GRU has 2 gates (update, reset) without separate cell state – simpler but slightly less expressive.
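The parameter saving follows directly from the gate count: an LSTM layer keeps four weight blocks, a GRU layer three. A quick sanity check with PyTorch's stock modules (layer sizes here are arbitrary illustration values):

```python
import torch.nn as nn

inp, hidden = 128, 256
lstm = nn.LSTM(input_size=inp, hidden_size=hidden)
gru = nn.GRU(input_size=inp, hidden_size=hidden)

count = lambda m: sum(p.numel() for p in m.parameters())
print(f"LSTM parameters: {count(lstm):,}")  # 4 weight blocks (input, forget, cell, output)
print(f"GRU parameters:  {count(gru):,}")   # 3 weight blocks (reset, update, candidate) -> roughly 25% fewer
```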
GRU (Gated Recurrent Unit) vs. Transformer
GRU processes sequentially (slow, short context); Transformer parallelizes with attention (fast, long context).
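A small PyTorch sketch of this difference (dimensions chosen arbitrarily): the GRU cell has to be stepped through time because each hidden state depends on the previous one, whereas self-attention relates all positions to each other in a single parallel pass.

```python
import torch
import torch.nn as nn

seq_len, batch, dim = 100, 1, 64
x = torch.randn(seq_len, batch, dim)

# GRU: each step depends on the previous hidden state, so the time loop cannot be parallelized
cell = nn.GRUCell(dim, dim)
h = torch.zeros(batch, dim)
for t in range(seq_len):
    h = cell(x[t], h)

# Self-attention: every position attends to every other position in one pass
attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4)
out, _ = attn(x, x, x)  # all timesteps processed in parallel
```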