Jan 17, 2024 Paper page — DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference