leoheuler/flashtensors · GitHub
flashtensors: run 100 large models on a single GPU with minimal time-to-first-token impact via tensor swap.
Permalink: simppl.org/library/item/github-leoheuler-flashtensors-github-ccf2d599
This is a SimPPL canonical link to a reading shared in our newsletter. Browse the rest at simppl.org/library.
