Researchers introduce a new method called “Selective Language Modeling” that trains language models more efficiently by focusing on the most relevant tokens.
The method leads to significant performance improvements on mathematical tasks, according to a new paper from researchers at Microsoft, Xiamen University, and Tsinghua University. Instead of weighting all tokens in a text corpus equally during training, as standard language modeling does, Selective Language Modeling (SLM) concentrates the training signal on the most relevant tokens.
The researchers first analyzed training dynamics at the token level. They found that the loss for different token types evolves very differently over the course of training: some tokens are learned quickly, while others are hardly learned at all.
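To make the core idea concrete, here is a minimal PyTorch sketch of a selective loss: cross-entropy is computed per token without reduction, and only a fraction of tokens contributes to the gradient. The function name `selective_lm_loss`, the `keep_ratio` value, and the selection rule shown (keeping the highest-loss tokens) are illustrative assumptions, not details taken from the paper:

```python
import torch
import torch.nn.functional as F

def selective_lm_loss(
    logits: torch.Tensor,      # (batch, seq_len, vocab_size) model outputs
    labels: torch.Tensor,      # (batch, seq_len) next-token targets
    keep_ratio: float = 0.6,   # fraction of tokens that contribute (illustrative)
) -> torch.Tensor:
    """Cross-entropy averaged over a selected subset of tokens only."""
    vocab_size = logits.size(-1)
    # Per-token loss with no reduction, so each token can be scored individually
    per_token = F.cross_entropy(
        logits.reshape(-1, vocab_size), labels.reshape(-1), reduction="none"
    )
    with torch.no_grad():
        # Illustrative selection rule: keep the highest-loss tokens.
        # The paper's actual criterion for "relevant" tokens may differ.
        k = max(1, int(keep_ratio * per_token.numel()))
        threshold = per_token.topk(k).values.min()
        mask = (per_token >= threshold).float()
    # Gradient flows only through the selected tokens
    return (per_token * mask).sum() / mask.sum()
```

The key difference from ordinary language-model training is the masked average in the last line: unselected tokens are simply dropped from the objective rather than contributing to the gradient.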