Flamingo: a Visual Language Model for Few-Shot Learning

DeepMind · April 29, 2022

Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc

TL;DR

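Bridges a frozen pretrained vision encoder and a frozen pretrained language model with newly trained cross-attention layers, so the combined model can handle image-and-text tasks from only a few in-context examples.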
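
As a rough illustration of the bridging idea: the Flamingo paper describes tanh-gated cross-attention layers inserted between the frozen language model blocks, with the gate initialized at zero so that training starts from the unmodified language model. The Python sketch below shows only that gating step. The function names, the toy vectors, and the omission of learned projections, multiple heads, and the gated feed-forward sublayer are simplifying assumptions, not the authors' implementation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(text_tokens, visual_tokens):
    """Single-head dot-product attention: each text token (query) attends
    over the visual tokens (keys and values). Learned projections are
    omitted here (assumption made to keep the sketch short)."""
    dim = len(text_tokens[0])
    attended = []
    for q in text_tokens:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in visual_tokens
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, visual_tokens)) for j in range(dim)]
        )
    return attended


def gated_cross_attention(text_tokens, visual_tokens, alpha):
    """Residual update y = x + tanh(alpha) * attn(x, visual).
    alpha stands in for the learnable gate scalar (hypothetical name)."""
    gate = math.tanh(alpha)
    attended = cross_attention(text_tokens, visual_tokens)
    return [
        [x + gate * a for x, a in zip(x_row, a_row)]
        for x_row, a_row in zip(text_tokens, attended)
    ]


# Toy inputs: two text tokens and three visual tokens, each of dimension 2.
text = [[1.0, 0.0], [0.0, 1.0]]
visual = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]

# With alpha = 0 the gate is closed and the text features pass through unchanged.
unchanged = gated_cross_attention(text, visual, alpha=0.0)

# With a nonzero alpha, visual information is mixed into the text features.
mixed = gated_cross_attention(text, visual, alpha=1.0)
```

With the gate closed, the layer is an identity on the text features, which is why adding such layers does not disturb the frozen language model at the start of training. Visual information enters only as the gate value moves away from zero.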

Why it matters

An influential recipe for building multimodal models on top of pretrained components.

Related terms