Mike's Daily Paper: 02.08.25 - Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
Do All Tokens Need the Same Amount of…