Mixture-of-Recursions (MoR)

An architecture that recursively reuses shared layers and routes each token to its own recursion depth for adaptive computation.

MoR ties weights across a recursive layer stack (cutting parameters) and uses lightweight routers to give each token a custom number of recursive passes, concentrating compute on harder tokens.

It unifies parameter sharing with adaptive depth — a successor idea to mixture-of-depths and early-exit — improving the size/throughput/accuracy trade-off.

Related papers