
[2504.10561] Self-Controlled Dynamic Expansion Model for Continual Learning

By Advanced AI Editor, April 17, 2025

[Submitted on 14 Apr 2025 (v1), last revised 16 Apr 2025 (this version, v2)]

Authors: Runqing Wu and 3 other authors

Abstract: Continual Learning (CL) epitomizes an advanced training paradigm wherein prior data samples remain inaccessible during the acquisition of new tasks. Numerous investigations have delved into leveraging a pre-trained Vision Transformer (ViT) to enhance model efficacy in continual learning. Nonetheless, these approaches typically utilize a singular, static backbone, which inadequately adapts to novel tasks, particularly when engaging with diverse data domains, due to a substantial number of inactive parameters. This paper addresses this limitation by introducing an innovative Self-Controlled Dynamic Expansion Model (SCDEM), which orchestrates multiple distinct trainable pre-trained ViT backbones to furnish diverse and semantically enriched representations. Specifically, by employing the multi-backbone architecture as a shared module, the proposed SCDEM dynamically generates a new expert with minimal parameters to accommodate a new task. A novel Collaborative Optimization Mechanism (COM) is introduced to synergistically optimize multiple backbones by harnessing prediction signals from historical experts, thereby facilitating new task learning without erasing previously acquired knowledge. Additionally, a novel Feature Distribution Consistency (FDC) approach is proposed to align semantic similarity between previously and currently learned representations through an optimal transport distance-based mechanism, effectively mitigating negative knowledge transfer effects. Furthermore, to alleviate over-regularization challenges, this paper presents a novel Dynamic Layer-Wise Feature Attention Mechanism (DLWFAM) to autonomously determine the penalization intensity on each trainable representation layer. An extensive series of experiments has been conducted to evaluate the proposed methodology’s efficacy, with empirical results corroborating that the approach attains state-of-the-art performance.
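To make the "optimal transport distance" idea behind FDC more concrete, here is a minimal sketch of an entropy-regularized (Sinkhorn) transport cost between a set of previously learned feature vectors and a set of currently learned ones. This is an illustration only: the abstract does not say which optimal transport formulation the paper uses, so the choice of Sinkhorn, the squared Euclidean cost, the uniform sample weights, and the names `sinkhorn_distance`, `eps`, and `n_iters` are all assumptions, not the authors' method.

```python
import math


def squared_euclidean(x, y):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def sinkhorn_distance(old_feats, new_feats, eps=0.1, n_iters=200):
    """Entropy-regularized transport cost between two feature sets.

    Hypothetical helper, not taken from the paper. Each sample gets
    uniform mass; eps is the entropic regularization strength.
    """
    n, m = len(old_feats), len(new_feats)
    # Pairwise cost matrix between old and new representations.
    cost = [[squared_euclidean(x, y) for y in new_feats] for x in old_feats]
    # Gibbs kernel. Note: this plain-domain form can underflow when
    # cost / eps is large; a log-domain variant is more robust.
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    a = [1.0 / n] * n  # uniform mass on old features
    b = [1.0 / m] * m  # uniform mass on new features
    u = [1.0] * n
    v = [1.0] * m
    # Sinkhorn iterations: alternately rescale rows and columns so the
    # transport plan matches both marginals.
    for _ in range(n_iters):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P[i][j] = u[i] * kernel[i][j] * v[j]; return <P, cost>.
    return sum(
        u[i] * kernel[i][j] * v[j] * cost[i][j]
        for i in range(n)
        for j in range(m)
    )


if __name__ == "__main__":
    old = [[0.0, 0.0], [1.0, 0.0]]
    drifted = [[0.0, 1.0], [1.0, 1.0]]
    print("no drift:", sinkhorn_distance(old, old))
    print("drifted: ", sinkhorn_distance(old, drifted))
```

In a continual learning setting, a quantity like this could serve as a regularization term that grows as the current representations drift away from the earlier ones. How SCDEM actually weights or applies such a term per layer is not specified in the abstract.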

Submission history

From: Runqing Wu
[v1]
Mon, 14 Apr 2025 15:22:51 UTC (2,730 KB)
[v2]
Wed, 16 Apr 2025 01:13:45 UTC (2,730 KB)
