Employing Self-supervised Learning Models For Cross-linguistic Child Speech Maturity Classification

arXiv:2506.08999v1 Announce Type: cross
Abstract: Speech technology systems struggle with many downstream tasks for child speech because training corpora are small and child speech poses particular difficulties. We apply a novel dataset, SpeechMaturity, to state-of-the-art transformer models to address a fundamental classification task: identifying child vocalizations. Unlike previous corpora, our dataset captures maximally ecologically valid child vocalizations across an unprecedented sample, comprising children acquiring 25+ languages in the U.S., Bolivia, Vanuatu, Papua New Guinea, Solomon Islands, and France. The dataset contains 242,004 labeled vocalizations, orders of magnitude larger than the datasets used in previous work. Models were trained to distinguish between cry, laughter, mature speech (consonant+vowel), and immature speech (just a consonant or a vowel). Models trained on the dataset outperformed state-of-the-art models trained on previous datasets, achieved classification accuracy comparable to humans, and were robust across rural and urban settings.
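
The four-way label scheme named in the abstract can be illustrated with a minimal sketch. This is not the authors' annotation procedure or model code: the function name, the boolean inputs, and the rule that cry and laughter take precedence over the speech categories are all assumptions made for illustration. Only the four class names and the consonant/vowel definitions come from the abstract.

```python
from enum import Enum


class VocalizationClass(Enum):
    """The four classes named in the abstract."""
    CRY = "cry"
    LAUGHTER = "laughter"
    MATURE = "mature"        # consonant + vowel
    IMMATURE = "immature"    # just a consonant or just a vowel


def label_vocalization(is_cry: bool, is_laughter: bool,
                       has_consonant: bool, has_vowel: bool) -> VocalizationClass:
    """Hypothetical helper mapping annotator judgments to a class.

    Assumption: cry and laughter are checked first; the abstract does not
    say how overlapping judgments are resolved.
    """
    if is_cry:
        return VocalizationClass.CRY
    if is_laughter:
        return VocalizationClass.LAUGHTER
    if has_consonant and has_vowel:
        return VocalizationClass.MATURE
    if has_consonant or has_vowel:
        return VocalizationClass.IMMATURE
    # Neither a non-speech class nor any speech-like content was marked.
    raise ValueError("vocalization matches none of the four classes")


if __name__ == "__main__":
    # A consonant+vowel vocalization counts as mature speech.
    print(label_vocalization(False, False, True, True).value)
    # A vowel-only vocalization counts as immature speech.
    print(label_vocalization(False, False, False, True).value)
```

A classifier trained on such labels would predict one of these four values per audio clip; how clips are segmented and how annotator disagreements are handled is not stated in the abstract.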
