Mike’s Substack

Fine-tuning LLM with RL from the angle of the memory

Mike Erlihson, Mathy AI
Feb 4

How many models do you need to store in memory for PPO and GRPO?

© 2025 Mike E.