Architecture
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Carnegie Mellon · December 1, 2023
Albert Gu, Tri Dao
TL;DR
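A selective state-space model, in which the state-space parameters are functions of the input, that matches Transformers of similar size on language modeling while its compute scales linearly with sequence length.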
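To make the "selective" and "linear" parts concrete, here is a minimal NumPy sketch of a selective state-space recurrence. It is an illustration under stated assumptions, not the paper's implementation: the function name `selective_scan`, the projection matrices `W_delta`, `W_B`, `W_C`, and the simplified discretization of B are choices made for this example, and the sequential Python loop stands in for the hardware-aware parallel scan that the paper describes.

```python
import numpy as np

def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the step size positive.
    return np.logaddexp(0.0, z)

def selective_scan(x, A, W_delta, W_B, W_C):
    """Sequential selective SSM over x of shape (L, D).

    A:       (D, N) state matrix (diagonal per channel), negative for stability
    W_delta: (D, D) hypothetical projection producing the per-channel step size
    W_B:     (D, N) hypothetical projection producing the input matrix B_t
    W_C:     (D, N) hypothetical projection producing the output matrix C_t
    """
    L, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))        # fixed-size hidden state, independent of L
    y = np.empty((L, D))
    for t in range(L):          # one constant-cost step per token: O(L) total
        # Selectivity: delta, B and C are computed from the current input.
        delta = softplus(x[t] @ W_delta)        # (D,)
        B = x[t] @ W_B                          # (N,)
        C = x[t] @ W_C                          # (N,)
        # Discretize: exact exponential for A, simple Euler step for B
        # (an assumption made here to keep the sketch short).
        A_bar = np.exp(delta[:, None] * A)      # (D, N)
        B_bar = delta[:, None] * B[None, :]     # (D, N)
        h = A_bar * h + B_bar * x[t][:, None]   # state update
        y[t] = h @ C                            # readout, shape (D,)
    return y
```

Each step touches only the fixed-size state `h`, so the cost grows linearly with the sequence length `L`, in contrast to the quadratic cost of full self-attention. Because `delta`, `B`, and `C` depend on `x[t]`, the model can decide per token how much of the past state to keep or overwrite.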
Why it matters
Reignited interest in attention-free architectures for long-context, efficient modeling.