SimPPL

Junxian He on X: "We replicated the DeepSeek-R1-Zero and DeepSeek-R1 training on 7B model with only 8K examples, the...

Source
https://twitter.com/junxian_he/status/1883183099787571519
Tags
twitter, github

Permalink: simppl.org/library/item/junxian-he-on-x-we-replicated-the-deepseek-r1-zero-and-deepseek-r1-tra-d906542c

This is a SimPPL canonical link to a reading shared in our newsletter. Browse the rest at simppl.org/library.