Jan 27, 2024

Paper page — FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

Posted in category: information science

Microsoft presents FP6-LLM

Efficiently serving large language models through FP6-centric algorithm-system co-design.


