Large audio-language models (LALMs) show strong zero-shot ability on speech
tasks, suggesting promise for speech emotion recognition (SER). However, SER in
real-world deployments often fails under domain mismatch, where source data are
unavailable and powerful LALMs are accessible only through an API. We ask:
given only unlabeled target-domain audio and an API-only LALM, can a student
model be adapted to outperform the LALM in the target domain? To this end, we
propose MI-Fuse, a denoised label fusion framework that supplements the LALM
with a source-domain-trained SER classifier as an auxiliary teacher. The
framework draws multiple stochastic predictions from both teachers, weights
their mean distributions by mutual-information-based uncertainty, and
stabilizes training with an exponential moving average teacher. Experiments
across three public emotion datasets and six cross-domain transfers show
consistent gains, with the student surpassing the LALM and outperforming the
strongest baseline by 3.9%. This approach strengthens emotion-aware speech
systems without sharing source data, enabling adaptation under realistic
deployment constraints.
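To make the fusion step concrete, the following is a minimal sketch of how the two teachers' predictions might be combined. The abstract only states that multiple stochastic predictions are drawn per teacher and that the mean distributions are weighted by mutual-information-based uncertainty; the function names, the inverse-MI weighting form, and the use of NumPy here are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def mi_uncertainty(probs):
    """probs: (K, C) array of K stochastic softmax predictions over C
    emotion classes from one teacher. Returns (mean distribution, MI),
    where MI = entropy of the mean minus the mean of the entropies."""
    eps = 1e-12
    mean = probs.mean(axis=0)                                    # predictive mean over K samples
    total_entropy = -np.sum(mean * np.log(mean + eps))           # H[mean prediction]
    expected_entropy = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    return mean, total_entropy - expected_entropy

def fuse_labels(lalm_probs, src_probs):
    """Fuse the LALM teacher and the source-domain classifier teacher,
    down-weighting the more uncertain (higher-MI) teacher.
    The inverse-MI weights are an assumed instantiation of
    'weighting by mutual-information-based uncertainty'."""
    eps = 1e-12
    m_lalm, mi_lalm = mi_uncertainty(lalm_probs)
    m_src, mi_src = mi_uncertainty(src_probs)
    w_lalm, w_src = 1.0 / (mi_lalm + eps), 1.0 / (mi_src + eps)
    fused = (w_lalm * m_lalm + w_src * m_src) / (w_lalm + w_src)
    return fused / fused.sum()                                   # soft pseudo-label for the student
```

In this reading, the fused distribution would serve as a denoised soft target for training the student, with an exponential-moving-average copy of the student acting as the stabilizing teacher described in the abstract.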