Training · Evaluation
Language Models are Few-Shot Learners
OpenAI · May 28, 2020
Tom B. Brown, Benjamin Mann, Nick Ryder
TL;DR
GPT-3 (175B parameters) shows that sufficiently large language models can perform new tasks from a few examples placed in the prompt, with no gradient updates or fine-tuning. This behavior is called in-context learning.
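To make the idea concrete, here is a minimal sketch of how a few-shot prompt might be assembled. The function name `build_few_shot_prompt`, the `=>` separator, and the translation pairs are illustrative assumptions, not the paper's code. The point is that the "training data" is just text in the prompt, and the model is asked to continue it.

```python
def build_few_shot_prompt(instruction, demonstrations, query):
    """Assemble a few-shot prompt: instruction, K solved examples, one open query.

    Hypothetical helper for illustration; the layout is an assumption.
    """
    lines = [instruction]
    # Each demonstration is shown to the model as "input => output".
    for source, target in demonstrations:
        lines.append(f"{source} => {target}")
    # The final line is left incomplete so the model fills in the answer.
    lines.append(f"{query} =>")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate English to French:",
    [("sea otter", "loutre de mer"), ("cheese", "fromage")],
    "peppermint",
)
print(prompt)
```

The resulting string would be sent to the model as-is. No weights change between tasks; switching tasks only means swapping the instruction and the demonstrations.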
Why it matters
The paper demonstrated that scale can unlock qualitatively new capabilities, and that prompting can stand in for task-specific training on many tasks. It helped set off the race to build ever-larger foundation models.