Paper Page - Investigating Hallucination In Conversations For Low Resource Languages

The paper provides the first systematic hallucination evaluation of multilingual conversational LLM outputs (GPT-3.5, GPT-4o, Llama-3.1, Gemma-2.0, DeepSeek-R1, Qwen-3) across Hindi, Farsi, and Mandarin, revealing high hallucination in Hindi/Farsi versus minimal hallucination in Mandarin, and proposes benchmark-style evaluations using translated dialogue corpora.

➡️ 𝐊𝐞𝐲 𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬 𝐨𝐟 𝐨𝐮𝐫 𝐋𝐨𝐰-𝐑𝐞𝐬𝐨𝐮𝐫𝐜𝐞 𝐇𝐚𝐥𝐥𝐮𝐜𝐢𝐧𝐚𝐭𝐢𝐨𝐧 𝐁𝐞𝐧𝐜𝐡𝐦𝐚𝐫𝐤:

🧪 𝑴𝒖𝒍𝒕𝒊𝒍𝒊𝒏𝒈𝒖𝒂𝒍 𝑪𝒐𝒏𝒗𝒆𝒓𝒔𝒂𝒕𝒊𝒐𝒏𝒂𝒍 𝑯𝒂𝒍𝒍𝒖𝒄𝒊𝒏𝒂𝒕𝒊𝒐𝒏 𝑬𝒗𝒂𝒍𝒖𝒂𝒕𝒊𝒐𝒏:
Introduces a hallucination benchmark for three languages (Hindi, Farsi, and Mandarin) built from LLM-translated versions of the BlendedSkillTalk and DailyDialog datasets. Model responses are scored with ROUGE-1 and ROUGE-L, and the results are checked by human verification.
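
To make the scoring step concrete, here is a minimal sketch of ROUGE-1 and ROUGE-L computed from scratch in Python. It is not the paper's evaluation code: the F1 variant, the whitespace tokenizer, the character-level option for Mandarin, and all function names are assumptions made for illustration. A low score only flags a response that diverges from the reference, which is presumably why the benchmark pairs these metrics with human verification.

```python
from collections import Counter


def tokenize(text, by_char=False):
    """Assumed tokenizer: whitespace split, or characters for unsegmented text."""
    if by_char:
        return [ch for ch in text if not ch.isspace()]
    return text.lower().split()


def f1(matches, cand_len, ref_len):
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if matches == 0 or cand_len == 0 or ref_len == 0:
        return 0.0
    precision = matches / cand_len
    recall = matches / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_1(candidate, reference):
    """ROUGE-1 F1: clipped unigram overlap between the two token lists."""
    overlap = Counter(candidate) & Counter(reference)
    return f1(sum(overlap.values()), len(candidate), len(reference))


def lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1: based on the longest common subsequence of tokens."""
    return f1(lcs_length(candidate, reference), len(candidate), len(reference))


if __name__ == "__main__":
    # Hypothetical English strings, used only to show the calls.
    ref = tokenize("the cat sat on the mat")
    cand = tokenize("the cat lay on the mat")
    print(rouge_1(cand, ref), rouge_l(cand, ref))
```

For Mandarin, which is written without spaces between words, `tokenize(text, by_char=True)` is one simple stand-in for a proper word segmenter. Hindi and Farsi can use the whitespace default in this sketch.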

🧩 𝑪𝒐𝒎𝒑𝒂𝒓𝒂𝒕𝒊𝒗𝒆 𝑨𝒏𝒂𝒍𝒚𝒔𝒊𝒔 𝒂𝒄𝒓𝒐𝒔𝒔 𝑳𝑳𝑴 𝑭𝒂𝒎𝒊𝒍𝒊𝒆𝒔 𝒂𝒏𝒅 𝑳𝒂𝒏𝒈𝒖𝒂𝒈𝒆𝒔:
Finds that GPT-4o and GPT-3.5 hallucinate less than the open-source models (Llama, Gemma, DeepSeek, Qwen), especially in Mandarin. All models, however, hallucinate more in Hindi and Farsi, which points to the limits of current LLMs in low-resource settings.

🧠 𝑹𝒆𝒔𝒐𝒖𝒓𝒄𝒆-𝑨𝒘𝒂𝒓𝒆 𝑯𝒂𝒍𝒍𝒖𝒄𝒊𝒏𝒂𝒕𝒊𝒐𝒏 𝑷𝒂𝒕𝒕𝒆𝒓𝒏𝒔 𝒂𝒏𝒅 𝑭𝒊𝒙𝒆𝒔:
Attributes the differences in hallucination to training data availability. Proposes retrieval-augmented generation (RAG), grounded decoding, and language-specific fine-tuning to improve factuality in low-resource conversational agents. Native-speaker evaluation confirms the hallucination types (partial vs. complete).
