Reinforcement Learning · Training

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

DeepSeek · January 22, 2025

DeepSeek-AI

TL;DR

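Shows that large-scale reinforcement learning can elicit strong reasoning, even without supervised fine-tuning as a preliminary step (DeepSeek-R1-Zero). The resulting open model, DeepSeek-R1, is competitive with OpenAI's o1, and its reasoning is distilled into smaller models.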

Why it matters

A landmark open-reasoning result: it demonstrated RL-driven reasoning at frontier quality, released the weights openly, and reset expectations about what frontier-level reasoning costs.

Related models

DeepSeek-R1 · DeepSeek

Related terms

Reasoning Models