Gradient Surgery for Multi-Task Learning

Multi-task learning can be very challenging when the gradients of different tasks differ severely in magnitude or point in conflicting directions. PCGrad mitigates this problem by projecting each conflicting gradient onto the normal plane of the gradient it conflicts with, while still retaining optimality guarantees.

Abstract:
While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge. Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks to enable more efficient learning. However, the multi-task setting presents a number of optimization challenges, making it difficult to realize large efficiency gains compared to learning tasks independently. The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood. In this work, we identify a set of three conditions of the multi-task optimization landscape that cause detrimental gradient interference, and develop a simple yet general approach for avoiding such interference between task gradients. We propose a form of gradient surgery that projects a task’s gradient onto the normal plane of the gradient of any other task that has a conflicting gradient. On a series of challenging multi-task supervised and multi-task RL problems, this approach leads to substantial gains in efficiency and performance. Further, it is model-agnostic and can be combined with previously-proposed multi-task architectures for enhanced performance.

Authors: Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
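
The projection step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' reference implementation: it assumes each task gradient has already been flattened into a plain list of floats, that two gradients "conflict" when their dot product is negative, and that the final update is the sum of the projected gradients. The names `dot`, `pcgrad`, and `rng` are hypothetical helpers introduced here for illustration only.

```python
import random


def dot(u, v):
    """Dot product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def pcgrad(grads, rng=None):
    """Combine per-task gradients with a PCGrad-style projection.

    grads: list of task gradients, each a flat list of floats.
    rng:   optional random.Random used to shuffle the order in which
           the other tasks are visited (seeded by default so the
           sketch is reproducible; this default is an assumption).
    """
    rng = rng or random.Random(0)
    # Work on copies so the original task gradients stay untouched.
    projected = [list(g) for g in grads]

    for i, g_i in enumerate(projected):
        others = [j for j in range(len(grads)) if j != i]
        rng.shuffle(others)
        for j in others:
            g_j = grads[j]  # always project against the ORIGINAL g_j
            inner = dot(g_i, g_j)
            norm_sq = dot(g_j, g_j)
            # Negative dot product means the gradients conflict.
            if inner < 0 and norm_sq > 0:
                scale = inner / norm_sq
                # Remove the component of g_i that points against g_j,
                # i.e. project g_i onto the normal plane of g_j.
                for k in range(len(g_i)):
                    g_i[k] -= scale * g_j[k]

    # Sum the (possibly projected) task gradients into one update.
    return [sum(col) for col in zip(*projected)]


# Two conflicting task gradients: their dot product is -1.
print(pcgrad([[1.0, 0.0], [-1.0, 1.0]]))  # [0.5, 1.5]
```

In the example, the first gradient loses its component along the second and becomes [0.5, 0.5], the second becomes [0.0, 1.0], and neither projected gradient has a negative dot product with the other task's original gradient. Gradients that do not conflict pass through unchanged, so in that case the result is simply the ordinary sum of task gradients.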

