Sakana AI's LLM Optimization Technique Slashes Memory Costs by Up to 75%
Posted in robotics/AI

The technique, called "universal transformer memory," uses special neural networks to optimize LLMs, keeping the pieces of information that matter and discarding redundant details from their context.
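The core idea of keeping important tokens and evicting redundant ones can be sketched as below. This is a toy illustration under assumed inputs (a per-token importance score and a hypothetical `prune_context` helper), not Sakana AI's actual method, which trains neural networks to make the keep/discard decision:

```python
# Toy sketch of context pruning: score each token in the context by an
# importance signal (e.g. averaged attention weight), then evict the
# lowest-scoring fraction. Names and thresholds are illustrative only.

def prune_context(tokens, importance_scores, keep_fraction=0.25):
    """Keep only the most important tokens, preserving their order."""
    assert len(tokens) == len(importance_scores)
    n_keep = max(1, int(len(tokens) * keep_fraction))
    # Rank token positions by score, highest first.
    ranked = sorted(range(len(tokens)),
                    key=lambda i: importance_scores[i], reverse=True)
    kept_positions = sorted(ranked[:n_keep])  # restore original order
    return [tokens[i] for i in kept_positions]

context = ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
scores  = [0.9, 0.1, 0.2, 0.8, 0.7, 0.1, 0.3, 0.1, 0.6]

# Keeping 25% of tokens corresponds to roughly the 75% memory saving
# cited in the headline.
print(prune_context(context, scores, keep_fraction=0.25))
```

In this sketch the scores are given by hand; in the reported technique, small trained networks decide which tokens in the transformer's context are worth retaining.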