S4 (Structured State Spaces)
The state space architecture that combines HiPPO initialization with efficient convolution-based training – the breakthrough that enabled Mamba and the broader SSM revolution.
Explanation
S4 solves the SSM training problem with three ingredients: a HiPPO-initialized state matrix for long-range dependencies, a diagonal-plus-low-rank (DPLR) parameterization for stability and efficient computation, and evaluation of the entire sequence as a single convolution for GPU parallelization. It was the first SSM approach to dominate the Long-Range Arena (LRA) benchmark.
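The convolutional view can be illustrated with a minimal NumPy sketch: a linear SSM's output is a causal convolution of the input with a kernel K[l] = C·A^l·B, which can be applied in parallel via FFT. Note this naive version materializes the kernel with an explicit loop; the actual S4 algorithm computes it far more cheaply through its DPLR structure and a Cauchy-kernel evaluation, which this sketch does not reproduce.

```python
import numpy as np

def ssm_conv_kernel(A, B, C, L):
    """Naive SSM kernel K[l] = C @ A^l @ B (real S4 avoids this loop
    via its DPLR / Cauchy-kernel trick)."""
    N = A.shape[0]
    K = np.zeros(L)
    Al = np.eye(N)
    for l in range(L):
        K[l] = (C @ Al @ B).item()
        Al = A @ Al
    return K

def ssm_apply(K, u):
    """Causal convolution y = K * u via FFT, parallel over the sequence."""
    L = len(u)
    n = 2 * L  # zero-pad to avoid circular wrap-around
    y = np.fft.irfft(np.fft.rfft(K, n) * np.fft.rfft(u, n), n)
    return y[:L]

# Toy example with a small, stable random SSM
rng = np.random.default_rng(0)
N, L = 4, 16
A = 0.9 * np.eye(N) + 0.05 * rng.standard_normal((N, N))
B = rng.standard_normal((N, 1))
C = rng.standard_normal((1, N))
u = rng.standard_normal(L)

K = ssm_conv_kernel(A, B, C, L)
y_conv = ssm_apply(K, u)

# Sanity check: the sequential recurrence x_t = A x_{t-1} + B u_t,
# y_t = C x_t produces the same outputs as the convolution.
x = np.zeros((N, 1))
y_rec = np.zeros(L)
for t in range(L):
    x = A @ x + B * u[t]
    y_rec[t] = (C @ x).item()

assert np.allclose(y_conv, y_rec)
```

The equivalence checked at the end is the core trick: train in the parallel convolution mode, then run inference in the cheap recurrent mode.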
Marketing Relevance
S4 is the foundation for Mamba, Hyena, and all modern SSM architectures.
Common Pitfalls
On its own, S4 is weaker than Transformers at language modeling. Its mathematics is complex (DPLR diagonalization, Cauchy-kernel evaluation), which makes implementations error-prone. For language tasks it has since been surpassed by Mamba.
Origin & History
Gu et al. (Stanford, 2021) published S4, which dominated the Long-Range Arena benchmark. S4D (2022) simplified the parameterization to a diagonal state matrix. S5, H3, and Hyena followed as variants. Mamba (2023) introduced selective SSMs and surpassed S4 on language.
Comparisons & Differences
S4 (Structured State Spaces) vs. Mamba
S4 uses fixed (time-invariant) SSM parameters, which is what allows the whole sequence to be computed as one convolution; Mamba makes the parameters input-dependent (selective) – the key innovation for language, at the cost of giving up the convolutional form in favor of a parallel scan.
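The contrast can be sketched schematically. In the S4-style loop, one fixed B and C apply at every timestep; in the selective loop, they are recomputed from the current input. The projections `W_B` and `W_C` below are hypothetical and exist only for illustration; real Mamba also makes the discretization step input-dependent and replaces the Python loop with a hardware-aware parallel scan.

```python
import numpy as np

rng = np.random.default_rng(1)
N, D, L = 4, 3, 8           # state size, input dim, sequence length
u = rng.standard_normal((L, D))
A = 0.9 * np.eye(N)         # state transition, kept fixed in both variants

# S4-style: one time-invariant B and C for the whole sequence.
# This is exactly what permits the convolutional training mode.
B = rng.standard_normal((N, D))
C = rng.standard_normal((1, N))
x = np.zeros(N)
ys_s4 = []
for t in range(L):
    x = A @ x + B @ u[t]            # same B at every step
    ys_s4.append((C @ x).item())

# Mamba-style (selective), schematically: B and C depend on the input.
# W_B and W_C are hypothetical projections for illustration only.
W_B = 0.1 * rng.standard_normal((N * D, D))
W_C = 0.1 * rng.standard_normal((N, D))
x = np.zeros(N)
ys_sel = []
for t in range(L):
    B_t = (W_B @ u[t]).reshape(N, D)  # parameters recomputed per token
    C_t = W_C @ u[t]
    x = A @ x + B_t @ u[t]
    ys_sel.append((C_t @ x).item())
```

Because B_t and C_t vary with the input, the selective model can decide per token what to write into and read out of its state, but its output is no longer a fixed convolution of the input.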