xLSTM (Extended LSTM)
A modernized LSTM variant by Sepp Hochreiter using exponential gating and matrix memory to compete with Transformers.
Explanation
xLSTM extends the classical LSTM in two ways: (1) exponential gating in place of sigmoid activations for the input and forget gates, which lets the network revise earlier storage decisions more decisively, and (2) two new cell variants, sLSTM (scalar memory with new memory mixing) and mLSTM (matrix memory). Because the mLSTM drops the hidden-to-hidden recurrence, it can be trained in parallel and scales to billions of parameters. A code sketch of the mLSTM step follows below.
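The mLSTM recurrence fits in a few lines of code. The NumPy sketch below follows the update rules described in the xLSTM paper: stabilized exponential input and forget gates, a rank-1 outer-product update to the matrix memory, and a normalized query read-out. The weight names and the single-head scalar gates are simplifying assumptions for illustration, not the NXAI reference implementation.

```python
# Minimal single-step mLSTM sketch (illustrative, not the reference code).
import numpy as np

def mlstm_step(x, C, n, m, params):
    """One recurrent step.
    C: (d, d) matrix memory, n: (d,) normalizer,
    m: scalar running max used to stabilize the exponential gates."""
    Wq, Wk, Wv, wi, wf, Wo = params   # hypothetical parameter names
    d = x.shape[0]

    q = Wq @ x                        # query
    k = (Wk @ x) / np.sqrt(d)         # key, scaled as in attention
    v = Wv @ x                        # value

    i_pre = wi @ x                    # input-gate pre-activation (scalar)
    f_pre = wf @ x                    # forget-gate pre-activation (scalar)
    o = 1 / (1 + np.exp(-(Wo @ x)))   # output gate stays sigmoid (vector)

    # Exponential gating, stabilized in log space via the running max m
    # so that exp() never overflows.
    m_new = max(f_pre + m, i_pre)
    i_gate = np.exp(i_pre - m_new)
    f_gate = np.exp(f_pre + m - m_new)

    # Matrix memory: rank-1 covariance update with an outer product.
    C_new = f_gate * C + i_gate * np.outer(v, k)
    n_new = f_gate * n + i_gate * k

    # Read-out: retrieve with the query, normalize, apply the output gate.
    h = o * (C_new @ q) / max(abs(n_new @ q), 1.0)
    return h, C_new, n_new, m_new

# Tiny usage example with random weights and an empty memory.
d = 4
rng = np.random.default_rng(0)
params = (rng.normal(size=(d, d)), rng.normal(size=(d, d)),
          rng.normal(size=(d, d)), rng.normal(size=d),
          rng.normal(size=d), rng.normal(size=(d, d)))
h, C, n, m = mlstm_step(rng.normal(size=d), np.zeros((d, d)),
                        np.zeros(d), 0.0, params)
```

Because each step depends on the previous state only through the linear memory update, the whole sequence can also be computed in a parallel (attention-like) form during training, which is the key to mLSTM's scalability.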
Marketing Relevance
xLSTM marks a renaissance in RNN research: LSTMs could return as a Transformer alternative with cheaper, constant-memory inference.
Common Pitfalls
xLSTM is still in an early research phase: there are no large production models yet, and scaling behavior beyond roughly 10B parameters is untested.
Origin & History
Beck et al. (NXAI/JKU Linz, 2024), with Sepp Hochreiter as senior author, published xLSTM as the "LSTM comeback" and showed results competitive with Transformers and state-space models at up to 1.3B parameters. The spin-off NXAI drives commercialization.
Comparisons & Differences
xLSTM (Extended LSTM) vs. LSTM
Classical LSTMs use sigmoid gates and a scalar memory cell per unit; xLSTM adds exponential input and forget gates and, in the mLSTM variant, a matrix memory that stores key-value associations for more capacity, as the sketch below illustrates.
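To make the gating difference concrete, here is a hypothetical side-by-side of the two activations. A sigmoid gate saturates in (0, 1) and can only attenuate stored content, while an exponential gate is unbounded above and therefore must be stabilized with a running maximum in log space.

```python
# Illustrative contrast between the two gate activations (not library code).
import numpy as np

def sigmoid_gate(pre):
    """Classical LSTM gate: squashed into (0, 1), so it can only
    attenuate, never amplify, what is already stored."""
    return 1 / (1 + np.exp(-pre))

def exp_gate(pre, m_prev):
    """xLSTM-style exponential gate: unbounded above, so a strongly
    activated input gate can override old memory content. A running
    maximum m keeps exp() numerically stable."""
    m = np.maximum(m_prev, pre)
    return np.exp(pre - m), m
```

The unbounded input gate is what allows xLSTM to decisively overwrite outdated memory, the "better selection" noted in the Explanation.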
xLSTM (Extended LSTM) vs. Mamba
Mamba uses a selective state-space (SSM) recurrence; xLSTM keeps LSTM-style gated recurrence with modern extensions. They are different routes to the same goal of linear-time, constant-memory inference, sketched below.
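Both designs share the same constant-memory decoding pattern, sketched here with schematic update rules; the real Mamba and mLSTM parameterizations are considerably more involved, so treat these step functions as stand-ins only.

```python
# Schematic per-token inference loop shared by both families (illustrative).
import numpy as np

def decode(tokens, step_fn, state):
    """Generic recurrent decoding: O(1) state per token, O(T) total,
    versus a Transformer KV cache that grows with sequence length."""
    ys = []
    for x in tokens:
        y, state = step_fn(x, state)   # fixed-size state carries all history
        ys.append(y)
    return ys

def ssm_step(x, s, A=0.9, B=0.1):
    """Mamba-like stand-in: gated linear update of a vector state."""
    s = A * s + B * x
    return s, s

def mlstm_like_step(x, C, f=0.9, i=0.1):
    """xLSTM-like stand-in: matrix state updated by an outer product."""
    C = f * C + i * np.outer(x, x)
    return C @ x, C
```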