Reinforcement Learning · Training
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek · January 22, 2025
DeepSeek-AI
TL;DR
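Shows that large-scale reinforcement learning, without supervised fine-tuning as a preliminary step, can elicit strong reasoning (DeepSeek-R1-Zero). Adding cold-start data and multi-stage training yields DeepSeek-R1, an open model competitive with OpenAI's o1, which is then distilled into smaller dense models.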
Why it matters
A landmark open-reasoning result: it demonstrated RL-driven reasoning at frontier quality, released the weights openly, and reset expectations about what it costs to train a model of this caliber.