Dec 10, 2023

StripedHyena: A new architecture for next-generation generative AI?

Posted in category: robotics/AI

GPT-4 and other models rely on transformers. With StripedHyena, researchers present an alternative to the widely used architecture.

The Together AI team has released StripedHyena, a family of language models with 7 billion parameters. What makes it special: StripedHyena is built on a new set of architectures that aim to improve training and inference performance over the widely used transformer architecture, which underlies models such as GPT-4.

The release includes StripedHyena-Hessian-7B (SH 7B), a base model, and StripedHyena-Nous-7B (SH-N 7B), a chat model. These models are designed to be faster, more memory-efficient, and capable of processing very long contexts of up to 128,000 tokens. Researchers from HazyResearch, hessian.AI, Nous Research, MILA, HuggingFace, and the German Research Centre for Artificial Intelligence (DFKI) were involved.
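To give a rough sense of why memory matters at a 128,000-token context, here is a back-of-the-envelope sketch of how much memory a standard transformer needs to cache attention keys and values during inference. The layer count, head count, head size, and 16-bit precision are illustrative assumptions for a generic 7-billion-parameter transformer, not the configuration of StripedHyena or of any specific model, and the function name is made up for this example.

```python
def kv_cache_bytes(tokens, layers=32, kv_heads=32, head_dim=128, bytes_per_value=2):
    """Bytes needed to cache attention keys and values for `tokens` positions.

    All default sizes are assumed values for a hypothetical 7B transformer.
    """
    # Factor 2: one key vector and one value vector per head, per layer, per token.
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return tokens * per_token


# The cache grows linearly with the number of tokens in the context.
for tokens in (4_000, 32_000, 128_000):
    gib = kv_cache_bytes(tokens) / 2**30
    print(f"{tokens:>7} tokens: {gib:6.2f} GiB")
```

Under these assumptions the cache alone comes to 62.5 GiB at 128,000 tokens, on top of the model weights. The sketch only describes the transformer side of the comparison; it says nothing about how much memory StripedHyena itself uses.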
