DRISHTIKON: A Multimodal Multilingual Benchmark For Testing Language Models' Understanding On Indian Culture - Takara TLDR

We introduce DRISHTIKON, a first-of-its-kind multimodal and multilingual
benchmark centered exclusively on Indian culture, designed to evaluate the
cultural understanding of generative AI systems. Unlike existing benchmarks
with a generic or global scope, DRISHTIKON offers deep, fine-grained coverage
across India’s diverse regions, spanning 15 languages, covering all states and
union territories, and incorporating over 64,000 aligned text-image pairs. The
dataset captures rich cultural themes including festivals, attire, cuisines,
art forms, and historical heritage, among many others. We evaluate a wide range
of vision-language models (VLMs), including open-source small and large models,
proprietary systems, reasoning-specialized VLMs, and Indic-focused models,
across zero-shot and chain-of-thought settings. Our results expose key
limitations in current models’ ability to reason over culturally grounded,
multimodal inputs, particularly for low-resource languages and less-documented
traditions. DRISHTIKON fills a vital gap in inclusive AI research, offering a
robust testbed to advance culturally aware, multimodally competent language
technologies.
