Code large language models have demonstrated remarkable capabilities in
programming tasks, yet current benchmarks focus primarily on single-modality code
generation rather than on visual game development. Most existing code-related benchmarks
evaluate syntactic correctness and execution accuracy, overlooking critical
game-specific metrics such as playability, visual aesthetics, and user
engagement, all of which are essential for real-world deployment. Current LLMs excel at
algorithmic problem-solving and competitive programming, but practical game development
imposes far broader requirements. To address this gap, we present V-GameGym, a
comprehensive benchmark comprising 2,219 high-quality samples across 100 thematic
clusters derived from real-world repositories, curated with a novel clustering-based
methodology that ensures both diversity and structural completeness. Furthermore, we introduce a multimodal
evaluation framework with an automated LLM-driven pipeline for visual code
synthesis using complete UI sandbox environments. Extensive analysis shows that
V-GameGym effectively bridges the gap between code generation accuracy and practical
game development workflows, providing quantifiable quality metrics for visual
programming and interactive element generation.