Training · Evaluation

Language Models are Few-Shot Learners

OpenAI · May 28, 2020

Tom B. Brown, Benjamin Mann, Nick Ryder, et al.

TL;DR

GPT-3 (175B parameters) shows that a sufficiently large language model can perform new tasks from a few examples placed in the prompt, with no gradient updates. This ability is called in-context learning.
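
To make "a few examples in the prompt" concrete, here is a minimal sketch of how a few-shot prompt might be assembled. The function name `build_few_shot_prompt`, the `=>` separator, and the translation pairs are illustrative assumptions, not an API or format defined by this page. No model is called; the code only builds the text a model would be asked to continue.

```python
def build_few_shot_prompt(task_description, demonstrations, query):
    """Assemble a few-shot prompt: a task description, K solved
    demonstrations, and one unsolved query for the model to complete.

    Hypothetical helper for illustration only.
    """
    lines = [task_description]
    for source, target in demonstrations:
        # Each demonstration shows the model an input paired with its answer.
        lines.append(f"{source} => {target}")
    # The final line leaves the answer blank; the model's continuation is its prediction.
    lines.append(f"{query} =>")
    return "\n".join(lines)


# Two demonstrations (K = 2); the model's weights are never updated.
demonstrations = [
    ("sea otter", "loutre de mer"),
    ("peppermint", "menthe poivrée"),
]
prompt = build_few_shot_prompt(
    "Translate English to French:", demonstrations, "cheese"
)
print(prompt)
```

The task is specified entirely by the text of the prompt: changing the demonstrations changes the task, with no retraining involved.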

Why it matters

The paper demonstrated that scaling up a language model can unlock qualitatively new capabilities, and that prompting can stand in for task-specific training on many tasks. It helped set off the race to build ever-larger foundation models.

Related terms

Scaling Laws