Deconstructing Lottery Tickets: Zeros, Signs, And The Supermask (Paper Explained)

This paper dives into the inner workings of the Lottery Ticket Hypothesis and attempts to shed some light on what’s important and what isn’t.

Abstract:
The recent “Lottery Ticket Hypothesis” paper by Frankle & Carbin showed that a simple approach to creating sparse networks (keeping the large weights) results in models that are trainable from scratch, but only when starting from the same initial weights. The performance of these networks often exceeds the performance of the non-sparse base model, but for reasons that were not well understood. In this paper we study the three critical components of the Lottery Ticket (LT) algorithm, showing that each may be varied significantly without impacting the overall results. Ablating these factors leads to new insights for why LT networks perform as well as they do. We show why setting weights to zero is important, how signs are all you need to make the reinitialized network train, and why masking behaves like training. Finally, we discover the existence of Supermasks, masks that can be applied to an untrained, randomly initialized network to produce a model with performance far better than chance (86% on MNIST, 41% on CIFAR-10).
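
To make the three components named in the abstract concrete, here is a minimal NumPy sketch for a single weight matrix. It is an illustration under stated assumptions, not the authors' code: the function names (`magnitude_mask`, `rewind_to_init`, `sign_constant_init`), the `scale` constant, and the random stand-in for "trained" weights are all hypothetical.

```python
import numpy as np

def magnitude_mask(w_final, keep_fraction):
    """Binary mask that keeps the largest-magnitude trained weights."""
    k = int(round(keep_fraction * w_final.size))
    mask = np.zeros(w_final.size, dtype=w_final.dtype)
    if k > 0:
        # Indices of the k entries with the largest absolute value.
        keep_idx = np.argsort(np.abs(w_final).ravel())[-k:]
        mask[keep_idx] = 1.0
    return mask.reshape(w_final.shape)

def rewind_to_init(w_init, mask):
    """Kept weights go back to their initial values; pruned weights are set to zero."""
    return w_init * mask

def sign_constant_init(w_init, mask, scale):
    """Kept weights keep only the sign of their initial value, with a
    constant magnitude `scale` (an assumed hyperparameter)."""
    return np.sign(w_init) * scale * mask

# Stand-in data: random noise plays the role of a training run (assumption).
rng = np.random.default_rng(0)
w_init = rng.normal(scale=0.1, size=(4, 5))
w_final = w_init + rng.normal(scale=0.1, size=(4, 5))

mask = magnitude_mask(w_final, keep_fraction=0.2)
ticket = rewind_to_init(w_init, mask)
signed = sign_constant_init(w_init, mask, scale=0.1)

print("kept weights:", int(mask.sum()), "of", mask.size)
```

In a real experiment the mask would be computed per layer from an actual training run, and `ticket` or `signed` would then be retrained with the mask held fixed. A Supermask, in the abstract's sense, is a mask that already gives better-than-chance accuracy when applied to the untrained weights, with no retraining at all.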

Authors: Hattie Zhou, Janice Lan, Rosanne Liu, Jason Yosinski

