Paper page - Lizard: An Efficient Linearization Framework for Large Language Models

Lizard is a linearization framework that transforms Transformer-based LLMs into subquadratic architectures for efficient infinite-context generation, using a hybrid attention mechanism and hardware-aware training.

We propose Lizard, a linearization framework that transforms pretrained
Transformer-based Large Language Models (LLMs) into flexible, subquadratic
architectures for infinite-context generation. Transformer-based LLMs face
significant memory and computational bottlenecks as context lengths increase,
due to the quadratic complexity of softmax attention and the growing key-value
(KV) cache. Lizard addresses these limitations by introducing a subquadratic
attention mechanism that closely approximates softmax attention while
preserving the output quality. Unlike previous linearization methods, which are
often limited by fixed model structures and therefore exclude gating
mechanisms, Lizard incorporates a gating module inspired by recent
state-of-the-art linear models. This enables adaptive memory control, supports
constant-memory inference, offers strong length generalization, and allows more
flexible model design. Lizard combines gated linear attention for global
context compression with sliding window attention enhanced by meta memory,
forming a hybrid mechanism that captures both long-range dependencies and
fine-grained local interactions. Moreover, we introduce a hardware-aware
algorithm that speeds up the training of our models. Extensive
experiments show that Lizard achieves near-lossless recovery of the teacher
model’s performance across standard language modeling tasks, while
significantly outperforming previous linearization methods. On the 5-shot MMLU
benchmark, Lizard improves over prior models by 18 points and shows significant
improvements on associative recall tasks.
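
The hybrid mechanism described above can be illustrated with a small NumPy sketch. It is not the authors' implementation. The abstract does not specify the gating parameterization, how the two branches are combined, or how meta memory works, so the per-channel gate `g`, the fixed mixing weight `alpha`, and all function names here are assumptions made for illustration, and meta memory is left out entirely.

```python
import numpy as np


def gated_linear_attention(q, k, v, g):
    """Recurrent gated linear attention (illustrative, not the paper's kernel).

    q, k: (T, d_k); v: (T, d_v); g: (T, d_k) with gate values in [0, 1].
    The state S has shape (d_k, d_v) no matter how long the sequence is.
    """
    T, d_k = q.shape
    d_v = v.shape[1]
    S = np.zeros((d_k, d_v))
    out = np.zeros((T, d_v))
    for t in range(T):
        # Decay the old state per key channel, then write the new key/value pair.
        S = g[t][:, None] * S + np.outer(k[t], v[t])
        out[t] = q[t] @ S
    return out


def sliding_window_attention(q, k, v, window):
    """Causal softmax attention restricted to the last `window` positions."""
    T, d_k = q.shape
    out = np.zeros((T, v.shape[1]))
    for t in range(T):
        lo = max(0, t - window + 1)
        scores = q[t] @ k[lo:t + 1].T / np.sqrt(d_k)
        weights = np.exp(scores - scores.max())  # numerically stable softmax
        weights /= weights.sum()
        out[t] = weights @ v[lo:t + 1]
    return out


def hybrid_attention(q, k, v, g, window, alpha=0.5):
    """Assumed combination: a fixed convex mix of the global and local branches."""
    global_out = gated_linear_attention(q, k, v, g)
    local_out = sliding_window_attention(q, k, v, window)
    return alpha * global_out + (1.0 - alpha) * local_out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d_k, d_v = 8, 4, 3
    q = rng.normal(size=(T, d_k))
    k = rng.normal(size=(T, d_k))
    v = rng.normal(size=(T, d_v))
    g = rng.uniform(0.8, 1.0, size=(T, d_k))  # hypothetical gate values
    print(hybrid_attention(q, k, v, g, window=3).shape)
```

In this sketch the global branch carries only the `(d_k, d_v)` state matrix from step to step, and the local branch reads at most `window` keys and values, so per-token memory does not grow with context length. That is the property the abstract refers to as constant-memory inference.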
