Architecture
Mistral 7B
Mistral AI · October 10, 2023
Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch
TL;DR
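Introduces Mistral 7B, a 7-billion-parameter model that outperforms the larger Llama 2 13B across the paper's benchmarks, using grouped-query attention for faster inference and sliding-window attention to handle long sequences at lower cost.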
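As a rough illustration of the two attention ideas named above, here is a minimal Python sketch. It is not the paper's implementation: the function names, the toy sizes, and the convention that the window counts the current token are all assumptions made for the example.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Causal sliding-window mask (illustrative helper, not from the paper).

    mask[i][j] is True when query position i may attend to key position j.
    Assumption: position i sees itself and the window - 1 positions before it.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Grouped-query attention: map a query head to the KV head it shares.

    Assumption: query heads are split into equal, contiguous groups.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


# Toy sizes chosen for readability, not the model's real configuration.
mask = sliding_window_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

print([kv_head_for_query_head(q, n_q_heads=8, n_kv_heads=2) for q in range(8)])
```

Each row of the printed mask has at most `window` allowed positions, so per-token attention cost stays bounded as the sequence grows. The head mapping shows several query heads reusing one key/value head, which shrinks the key/value cache that must be kept during inference.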
Why it matters
Showed that a well-designed 7B model could beat much larger ones, which launched Mistral AI's model line and reinvigorated interest in small open models.