Training
Textbooks Are All You Need
Microsoft·June 20, 2023
Suriya Gunasekar, Yi Zhang, Jyoti Aneja
View on arXivTL;DR
Shows that small models trained on curated, textbook-quality data can rival much larger ones (the Phi line).
Why it matters
Made the case that data quality can substitute for raw scale, shaping the small-model wave.