Privacy Backdoors: Stealing Data With Corrupted Pretrained Models (Paper Explained)

#llm #privacy #finetuning

Can you tamper with a base model in such a way that it will exactly remember its fine-tuning data? This paper presents a method of doing exactly that, and implements it in modern transformers.

OUTLINE:
0:00 – Intro & Overview
10:50 – Core idea: single-use data traps
44:30 – Backdoors in transformer models
58:00 – Additional numerical tricks
1:00:35 – Experimental results & conclusion

Paper:
Code:

Abstract:
Practitioners commonly download pretrained machine learning models from open repositories and finetune them to fit specific applications. We show that this practice introduces a new risk of privacy backdoors. By tampering with a pretrained model’s weights, an attacker can fully compromise the privacy of the finetuning data. We show how to build privacy backdoors for a variety of models, including transformers, which enable an attacker to reconstruct individual finetuning samples, with a guaranteed success! We further show that backdoored models allow for tight privacy attacks on models trained with differential privacy (DP). The common optimistic practice of training DP models with loose privacy guarantees is thus insecure if the model is not trusted. Overall, our work highlights a crucial and overlooked supply chain attack on machine learning privacy.
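
To make the "single-use data trap" idea from the outline a bit more concrete, here is a minimal sketch of one way such a trap could behave. It assumes a toy setting that is not taken from the paper: a single ReLU unit, plain SGD, non-negative inputs, and a loss equal to the unit's output. The names `relu_unit_sgd_step` and `recover_input` are hypothetical, and the construction the paper uses for transformers is considerably more involved. The point is only that the gradient of a linear unit's weights is a scaled copy of its input, so one update can write a training sample into the weights and then shut the unit off.

```python
def relu_unit_sgd_step(w, b, x, lr):
    """One SGD step on the toy loss relu(w.x + b) for a single unit.

    If the unit is active, d(loss)/dw = x and d(loss)/db = 1, so the
    update subtracts lr * x from w and lr from b. If it is inactive,
    the gradient is zero and nothing changes.
    """
    pre_activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    if pre_activation <= 0:
        return list(w), b  # inactive unit: zero gradient
    new_w = [wi - lr * xi for wi, xi in zip(w, x)]
    new_b = b - lr
    return new_w, new_b


def recover_input(w_before, b_before, w_after, b_after):
    """Read the trapped input back out of the weight difference.

    (w_before - w_after) = lr * x and (b_before - b_after) = lr,
    so their ratio is x, without needing to know lr.
    """
    delta_b = b_before - b_after
    if delta_b == 0:
        return None  # the trap never fired
    return [(wb - wa) / delta_b for wb, wa in zip(w_before, w_after)]


# "Backdoored" starting weights chosen by the attacker (toy values).
w0, b0 = [0.1, 0.1, 0.1], 0.1
lr = 10.0  # assumed large enough that one step closes the trap

first_input = [0.2, 0.5, 0.9]   # stands in for a private fine-tuning sample
second_input = [0.7, 0.1, 0.3]  # a later sample

w1, b1 = relu_unit_sgd_step(w0, b0, first_input, lr)
w2, b2 = relu_unit_sgd_step(w1, b1, second_input, lr)

trap_fired = b1 != b0
recovered = recover_input(w0, b0, w2, b2)
print("recovered:", recovered)
print("unchanged by second sample:", (w2, b2) == (w1, b1))
```

In this sketch the first sample pushes the bias far into the negative range, so the unit stays inactive for the second sample and the stored sample is not overwritten. That is the "single-use" property in its simplest form.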

Authors: Shanglun Feng, Florian Tramèr

Links:
Homepage:
Merch:
YouTube:
Twitter:
Discord:
LinkedIn:

If you want to support me, the best thing to do is to share out the content 🙂

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar:
Patreon:
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
