Mar 9, 2024

Unveiling Infinite Context Windows: Leveraging LLMs in Streaming Apps with Attention Sinks

A 2023 result shows that LLMs trained with a finite attention window can be extended to infinite sequence lengths without any fine-tuning.
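The core idea behind attention sinks is a KV-cache eviction policy: always keep the first few tokens (the "sinks") plus a sliding window of the most recent tokens, discarding everything in between. The sketch below illustrates that policy in isolation; the function name and parameters (`n_sink`, `window`) are illustrative choices, not an API from any particular library.

```python
def streaming_kv_positions(total_tokens: int, n_sink: int = 4, window: int = 1020) -> list[int]:
    """Token positions retained in the KV cache under an attention-sink policy:
    the first `n_sink` positions plus the `window` most recent positions.
    Everything between the sinks and the recent window is evicted."""
    if total_tokens <= n_sink + window:
        # Cache is not full yet: keep every position.
        return list(range(total_tokens))
    sinks = list(range(n_sink))
    recent = list(range(total_tokens - window, total_tokens))
    return sinks + recent


# Example: after 10 generated tokens with 2 sinks and a window of 4,
# the cache holds the two initial tokens plus the four most recent ones.
print(streaming_kv_positions(10, n_sink=2, window=4))
```

Because the cache size is capped at `n_sink + window` regardless of how many tokens have been generated, memory stays constant while the model can keep streaming indefinitely.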