Training
Batch Normalization: Accelerating Deep Network Training
Google · February 11, 2015
Sergey Ioffe, Christian Szegedy
TL;DR
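Normalizes each layer's activations to zero mean and unit variance over the current mini-batch, then applies a learned scale and shift. This stabilizes training and permits much higher learning rates, so networks converge in far fewer steps.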
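A minimal sketch of the training-time forward pass, in plain Python. The function name `batch_norm`, the list-of-lists layout, and the `eps` value are assumptions made for illustration. It omits the backward pass and the running statistics that are used at inference time.

```python
import math

def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.

    batch: list of examples, each a list of feature values.
    gamma, beta: per-feature scale and shift (learned in a real network).
    eps: small constant for numerical stability (illustrative value).
    """
    n = len(batch)
    num_features = len(batch[0])
    out = [[0.0] * num_features for _ in range(n)]
    for j in range(num_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        # Biased (divide-by-n) variance over the mini-batch.
        var = sum((x - mean) ** 2 for x in column) / n
        inv_std = 1.0 / math.sqrt(var + eps)
        for i in range(n):
            out[i][j] = gamma[j] * (column[i] - mean) * inv_std + beta[j]
    return out

# Two features on very different scales end up on the same scale.
batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

With `gamma` set to 1 and `beta` to 0, each output feature has approximately zero mean and unit variance across the batch. In a real network both parameters are trained, which lets a layer undo the normalization if that helps.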
Why it matters
A key technique that made very deep networks practical to train.