Paper page - Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages

Automatic speech recognition systems have undoubtedly advanced with the
integration of multilingual and multitask models such as Whisper, which have
shown a promising ability to understand and process speech across a wide range
of languages. Despite their robustness, these models often fall short in
handling the linguistic distinctions of minority languages. This study
addresses this gap by integrating traditional and novel language models with
fine-tuned Whisper models to raise their performance in less commonly studied
languages. Through rigorous fine-tuning and evaluation across multiple
datasets, we demonstrate substantial improvements in word error rate,
particularly in low-resource scenarios. Our approach not only takes
advantage of the extensive data Whisper was pre-trained on, but also
complements its linguistic adaptability by incorporating language models. We
obtained improvements of up to 51% for in-distribution datasets and up to 34%
for out-of-distribution sentences using statistical language models, while
large language models provided moderate but consistently robust improvement
across diverse linguistic contexts. The findings reveal that, while the
integration reliably benefits all model sizes, the extent of improvement
varies, highlighting the importance of optimized language model parameters.
Finally, we emphasize the importance of selecting appropriate evaluation
parameters when reporting results for transformer-based ASR models. In
summary, this research clears the way for more inclusive ASR technologies that
perform better across languages by enriching their linguistic knowledge. For
further implementation details of this study, the technical documentation and
source code are available at http://www.github.com/hitz-zentroa/whisper-lm.
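
The abstract does not spell out how the language model scores are combined with Whisper's output. As a rough illustration only, the sketch below assumes one common scheme: each candidate transcription gets a fused score equal to the ASR log-probability plus a weighted language model log-probability plus a per-word bonus. The names `Hypothesis`, `toy_lm_logprob`, `rescore`, and the weights `alpha` and `beta` are hypothetical and are not taken from the paper or its repository; the unigram scorer is a toy stand-in for a real statistical or large language model.

```python
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    asr_logprob: float  # log-probability from the ASR model (assumed to be given)


def toy_lm_logprob(text, unigram_probs, unk_prob=1e-4):
    """Hypothetical stand-in for a language model: sum of unigram log-probabilities."""
    return sum(math.log(unigram_probs.get(word, unk_prob)) for word in text.split())


def rescore(hypotheses, lm_logprob, alpha=0.5, beta=0.0):
    """Return the hypothesis with the highest fused score.

    Assumed scoring rule (an illustration, not the paper's confirmed method):
        fused = asr_logprob + alpha * lm_logprob(text) + beta * word_count
    alpha weights the language model; beta rewards or penalizes length.
    """
    def fused(hyp):
        word_count = len(hyp.text.split())
        return hyp.asr_logprob + alpha * lm_logprob(hyp.text) + beta * word_count

    return max(hypotheses, key=fused)


if __name__ == "__main__":
    # Two candidates: the ASR model slightly prefers the misspelled one.
    candidates = [
        Hypothesis("the cat sat", asr_logprob=-4.0),
        Hypothesis("the cat sad", asr_logprob=-3.8),
    ]
    probs = {"the": 0.1, "cat": 0.05, "sat": 0.05}
    best = rescore(candidates, lambda t: toy_lm_logprob(t, probs), alpha=0.5)
    print(best.text)
```

With `alpha=0.0` the language model is ignored and the ASR model's own preference wins; raising `alpha` lets the language model overturn it. That sensitivity is one way to read the abstract's remark that the extent of improvement depends on optimized language model parameters.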
