Pretraining
The large-scale self-supervised stage that builds a model’s base knowledge.
Pretraining trains a model on vast amounts of unlabeled data with a self-supervised objective — typically next-token prediction. This is where a model acquires its broad knowledge and language ability, before any task-specific fine-tuning or alignment.
It is by far the most compute-intensive phase of building a foundation model.