Architecture · Reinforcement Learning · Evaluation
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
MiniMax AI·June 16, 2025
TL;DR
The first open-weight, large-scale hybrid-attention reasoning model: a 456B-parameter mixture-of-experts (MoE) model with 45.9B parameters active per token. It interleaves linear “lightning” attention with softmax attention to make long-context test-time compute far cheaper, and it introduces the CISPO RL algorithm.
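To make the linear-attention idea concrete, here is a minimal NumPy sketch of causal linear attention in its recurrent form, checked against the equivalent quadratic form. It is a simplification under stated assumptions: no normalization, no decay, no feature map, a single head, and no tiling. It is not the lightning attention kernel itself, and the function names are hypothetical.

```python
import numpy as np


def causal_linear_attention(q, k, v):
    """Recurrent form: one fixed-size state update per token.

    q, k: arrays of shape (n, d); v: array of shape (n, d_v).
    Time is O(n * d * d_v); the running state is O(d * d_v),
    independent of the sequence length n.
    """
    n, d = q.shape
    d_v = v.shape[1]
    state = np.zeros((d, d_v))  # running sum of outer(k_s, v_s) for s <= t
    out = np.empty((n, d_v))
    for t in range(n):
        state += np.outer(k[t], v[t])
        out[t] = q[t] @ state
    return out


def causal_linear_attention_quadratic(q, k, v):
    """Same result computed through the full (n, n) score matrix."""
    scores = np.tril(q @ k.T)  # causal mask: keep only s <= t
    return scores @ v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.standard_normal((6, 4))
    k = rng.standard_normal((6, 4))
    v = rng.standard_normal((6, 3))
    same = np.allclose(
        causal_linear_attention(q, k, v),
        causal_linear_attention_quadratic(q, k, v),
    )
    print("recurrent and quadratic forms agree:", same)
```

Because there is no softmax over the scores, the sum over past tokens can be folded into a fixed-size state. That is why the per-token cost of generation does not grow with context length in this formulation.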
Why it matters
MiniMax-M1 showed that linear (lightning) attention is viable at frontier scale for reasoning, and that hybrid attention sharply cuts the cost of long reasoning generations.
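The cost argument can be illustrated with a rough back-of-the-envelope model of attention work during generation. This is a sketch under stated assumptions, not a reproduction of any measured result: it counts only attention multiply-adds for one head, ignores MoE and feed-forward layers, and the `softmax_every` layer ratio is a hypothetical parameter chosen for illustration.

```python
def softmax_attention_cost(n, d):
    """Approximate multiply-adds to generate n tokens with softmax attention.

    Token t reads t cached keys and t cached values of width d,
    so the total is 2 * d * (1 + 2 + ... + n) = d * n * (n + 1).
    """
    return d * n * (n + 1)


def linear_attention_cost(n, d):
    """Approximate multiply-adds to generate n tokens with linear attention.

    Each token updates a (d, d) state and reads it once: about 2 * d * d.
    """
    return 2 * n * d * d


def hybrid_attention_cost(n, d, softmax_every=8):
    """Average per-layer cost when 1 in `softmax_every` layers uses softmax.

    `softmax_every` is an illustrative assumption, not a confirmed setting.
    """
    softmax_share = 1.0 / softmax_every
    return (softmax_share * softmax_attention_cost(n, d)
            + (1.0 - softmax_share) * linear_attention_cost(n, d))


if __name__ == "__main__":
    d = 128
    for n in (1_000, 10_000, 100_000):
        ratio = hybrid_attention_cost(n, d) / softmax_attention_cost(n, d)
        print(f"n={n}: hybrid / softmax attention cost = {ratio:.3f}")
```

In this toy model the softmax term grows quadratically with generation length while the linear term grows linearly, so the hybrid's relative saving widens as reasoning traces get longer.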