
Rho-1: Not All Tokens Are What You Need


Microsoft presents Rho-1

Not All Tokens Are What You Need https://huggingface.co/papers/2404.

Previous language model pre-training methods have uniformly applied a next-token prediction loss to all training tokens.
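
To make "uniformly applied" concrete, here is a minimal sketch in plain Python of a next-token cross-entropy loss where every position counts equally. The function name `uniform_next_token_loss` and the toy probabilities are illustrative assumptions, not taken from the paper or from any library.

```python
import math

def uniform_next_token_loss(probs, targets):
    """Mean next-token cross-entropy with every token weighted equally.

    probs[i] is the predicted distribution over the vocabulary at position i;
    targets[i] is the id of the token that actually comes next.
    (Hypothetical helper for illustration only.)
    """
    if len(probs) != len(targets) or not targets:
        raise ValueError("probs and targets must be non-empty and the same length")
    # Per-token loss: negative log-probability assigned to the true next token.
    per_token = [-math.log(p[t]) for p, t in zip(probs, targets)]
    # Uniform treatment: a plain average, no token is up- or down-weighted.
    return sum(per_token) / len(per_token), per_token

# Toy example: vocabulary of 3 tokens, 3 predictions (made-up numbers).
toy_probs = [
    [0.5, 0.25, 0.25],
    [0.25, 0.5, 0.25],
    [0.25, 0.25, 0.5],
]
toy_targets = [0, 1, 0]
mean_loss, per_token_losses = uniform_next_token_loss(toy_probs, toy_targets)
print(mean_loss, per_token_losses)
```

The plain average is the point of the sketch: each token's loss enters the total with the same weight, which is the behaviour the title calls into question.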

