We present Llama-GENBA-10B, a trilingual foundation model addressing
English-centric bias in large language models. Built on Llama 3.1-8B and scaled
to 10B parameters, Llama-GENBA-10B is continually pretrained on 164B tokens
(82B English, 82B German, and 80M Bavarian), a mix that balances resources
across the three languages while preventing English dominance. Targeted at the
German NLP community, the model
also promotes Bavarian as a low-resource language. Development tackled four
challenges: (1) curating a multilingual corpus despite the scarcity of Bavarian data, (2)
creating a unified tokenizer for English, German, and Bavarian, (3) optimizing
architecture and language-ratio hyperparameters for cross-lingual transfer, and
(4) establishing the first standardized trilingual evaluation suite by
translating German benchmarks into Bavarian. Evaluations show that
Llama-GENBA-10B achieves strong cross-lingual performance: the fine-tuned
variant surpasses Apertus-8B-2509 and gemma-2-9b in Bavarian, making it the
best model in its class for this language, while also outperforming EuroLLM in
English and matching its results in German. Training
on the Cerebras CS-2 demonstrated efficient large-scale multilingual
pretraining with documented energy use, offering a blueprint for inclusive
foundation models that integrate low-resource languages.