Adam: A Method for Stochastic Optimization

University of Amsterdam, University of Toronto · December 22, 2014

Diederik P. Kingma, Jimmy Ba

TL;DR

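Introduces Adam, a first-order optimizer that adapts the step size of each parameter using running estimates of the first and second moments of the gradient. It became a default choice for training deep networks.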
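
To make "adaptive" concrete, here is a minimal sketch of one Adam update in plain Python. It assumes parameters stored as flat lists of floats and uses the default hyperparameters suggested in the paper (step size 0.001, beta1 0.9, beta2 0.999, epsilon 1e-8). The names `adam_step` and `minimize_quadratic` are illustrative and do not come from the paper or any library.

```python
import math


def adam_step(params, grads, m, v, t,
              lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to flat lists of floats.

    t is the 1-based step count; m and v are the running first- and
    second-moment estimates carried over from the previous step.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and its square.
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias correction: both averages start at zero, so early
        # values are scaled up to compensate.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Per-parameter step: large recent gradients shrink the step.
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


def minimize_quadratic(steps=2000, lr=0.01):
    """Toy usage: minimize f(x) = (x - 3)^2 starting from x = 0."""
    x, m, v = [0.0], [0.0], [0.0]
    for t in range(1, steps + 1):
        grad = [2.0 * (x[0] - 3.0)]  # derivative of (x - 3)^2
        x, m, v = adam_step(x, grad, m, v, t, lr=lr)
    return x[0]
```

Because of the bias correction, the very first update has magnitude close to `lr` whatever the scale of the gradient, which is one reason the method needs little tuning across problems with differently scaled gradients.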

Why it matters

One of the most-cited papers in machine learning. Adam and its variants remain the standard optimizers for training most modern deep networks.
