Dec 16, 2024

LLMs don’t need all the attention layers

Study shows LLMs can shed a substantial portion of their attention layers without hurting their performance.
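For illustration only (this is not the study's actual method): a minimal PyTorch-style sketch of a transformer block whose attention sublayer can be bypassed, so that some blocks keep only the residual stream and the MLP. The class names, dimensions, and the choice of which blocks to skip are all assumptions made for the example.

```python
# Illustrative sketch: a transformer block that can "shed" its attention
# sublayer. Names and the 50/50 skip split are assumptions, not study results.
import torch
import torch.nn as nn


class Block(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 4, skip_attention: bool = False):
        super().__init__()
        self.skip_attention = skip_attention  # if True, drop the attention sublayer entirely
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.skip_attention:  # when skipped, only the residual path remains
            h = self.norm1(x)
            attn_out, _ = self.attn(h, h, h, need_weights=False)
            x = x + attn_out
        return x + self.mlp(self.norm2(x))


if __name__ == "__main__":
    # Keep attention in the first half of the blocks, drop it from the second
    # half (an arbitrary split chosen for the example).
    blocks = nn.ModuleList([Block(skip_attention=(i >= 4)) for i in range(8)])
    x = torch.randn(2, 16, 256)  # (batch, sequence length, d_model)
    for block in blocks:
        x = block(x)
    print(x.shape)  # torch.Size([2, 16, 256])
```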