Retrieval-Augmented Generation

Grounding model answers in documents fetched at query time.

Retrieval-Augmented Generation (RAG) retrieves relevant documents from an external store at query time and places them in the model’s context before it answers. This grounds responses in current, source-specific information and can reduce hallucination without retraining the model.

It is the dominant pattern for building question-answering systems over private or frequently changing data.
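
The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`retrieve`, `build_prompt`), the sample documents, and the prompt wording are all hypothetical, and the bag-of-words cosine similarity stands in for whatever retriever a real system would use (typically dense embeddings with a vector index). The call to the language model itself is left out; the sketch stops at the prompt that would be sent to it.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents most similar to the query.

    Toy retriever (assumption): bag-of-words cosine similarity.
    Documents with zero overlap are dropped.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Place the retrieved passages in the context ahead of the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical external store: a list of short documents.
DOCS = [
    "The refund window for annual plans is 30 days from purchase.",
    "Support is available by email on weekdays.",
    "Monthly plans can be cancelled at any time.",
]

query = "What is the refund window for annual plans?"
passages = retrieve(query, DOCS, k=1)
prompt = build_prompt(query, passages)
# `prompt` would now be passed to a language model of your choice.
```

Because the documents are fetched at query time, updating the store (here, the `DOCS` list) changes what the model sees on the next question without any retraining.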
