NativQA: Multilingual Culturally-Aligned Natural Query for LLMs, by Md. Arid Hasan and 8 other authors
Abstract: Natural Question Answering (QA) datasets play a crucial role in evaluating the capabilities of large language models (LLMs) and ensuring their effectiveness in real-world applications. Although numerous QA datasets have been developed, along with some parallel efforts, there remains a notable lack of both a construction framework and large-scale, region-specific datasets built from queries posed by native users in their own languages. This gap hinders effective benchmarking and the development of models fine-tuned for regional and cultural specificities. In this study, we propose NativQA, a scalable, language-independent framework for seamlessly constructing culturally and regionally aligned QA datasets in native languages, for LLM evaluation and tuning. We demonstrate the efficacy of the proposed framework by building MultiNativQA, a multilingual natural QA dataset consisting of ~64k manually annotated QA pairs in seven languages, ranging from high- to extremely low-resource, based on queries from native speakers in 9 regions covering 18 topics. We benchmark open- and closed-source LLMs on the MultiNativQA dataset. We make the MultiNativQA dataset (this https URL) and the experimental scripts (this https URL) publicly available for the community.
Submission history
From: Firoj Alam
[v1] Sat, 13 Jul 2024 09:34:00 UTC (4,332 KB)
[v2] Sun, 6 Oct 2024 10:46:41 UTC (6,266 KB)
[v3] Fri, 30 May 2025 14:06:34 UTC (2,741 KB)