Architecture · Training

DeepSeek-V3 Technical Report

DeepSeek · December 27, 2024

DeepSeek-AI

View on arXiv

TL;DR

Describes DeepSeek-V3, a 671B-parameter Mixture-of-Experts (MoE) model trained at a fraction of the typical cost of a frontier model, and details the efficiency and training innovations behind it.

Why it matters

Its training-efficiency claims reframed the economics of frontier models and intensified debate over how much compute frontier-level capability really requires.
