Paper page — DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference
Posted by Cecile G. Tamura in futurism, Jan 17, 2024
Join the discussion on this paper page.