SimPPL

Jiayi Pan on X: "We reproduced DeepSeek R1-Zero in the CountDown game, and it just works. Through RL, the 3B base LM d...

Source
https://twitter.com/jiayi_pirate/status/1882839370505621655?s=12&t=II5sEKhd9mEFeaunS9qgEA
Tags
twitter, github

Permalink: simppl.org/library/item/jiayi-pan-on-x-we-reproduced-deepseek-r1-zero-in-the-countdown-game-an-121d5dbf

This is a SimPPL canonical link to a reading shared in our newsletter. Browse the rest at simppl.org/library.