Seunghyun Seo on X: "Inspired by this thread, I'd like to share my slides on training horizon scaling. Lately, lots o...
Seunghyun Seo's slides on training-horizon scaling, focusing on the role of weight decay (not just the learning rate) as the training horizon is scaled.
This is a SimPPL canonical link to a reading shared in our newsletter. Browse the rest at simppl.org/library.
