Architecture
Long Short-Term Memory
TU Munich · November 15, 1997
Sepp Hochreiter, Jürgen Schmidhuber
TL;DR
Introduces the LSTM, a gated recurrent network that can learn long-range dependencies in sequences.
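To make "gated" concrete, here is a minimal sketch of one step of a scalar LSTM cell in plain Python. It assumes the now-common formulation with a forget gate, which was added in later work and is not part of the original 1997 design. The names lstm_step and params and all the weight values are hypothetical, chosen by hand for illustration rather than learned.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (variant with a forget gate)."""
    # Each gate has an input weight w, a recurrent weight u, and a bias b.
    def pre(name):
        w, u, b = params[name]
        return w * x + u * h_prev + b

    i = sigmoid(pre("input"))        # how much new information to write
    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate value to write

    c = f * c_prev + i * g           # additive cell-state update
    h = o * math.tanh(c)             # hidden state passed to the next step
    return h, c

# Hypothetical weights: forget and output gates held open, input gate held shut.
params = {
    "input": (0.0, 0.0, -20.0),
    "forget": (0.0, 0.0, 20.0),
    "output": (0.0, 0.0, 20.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 1.0  # start with a value of 1.0 stored in the cell
for x in [0.5, -0.3, 0.8] * 100:  # 300 steps of unrelated input
    h, c = lstm_step(x, h, c, params)
```

With the gates set this way, the stored cell value stays close to 1.0 across all 300 steps. This is the behaviour that lets a trained LSTM carry information over long spans: the cell state is updated additively, and the gates decide when it is written, kept, or read.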
Why it matters
The dominant sequence model for nearly two decades, powering machine translation and speech recognition before the Transformer.