We-Math 2.0: A Versatile MathBook System For Incentivizing Visual Mathematical Reasoning - Takara TLDR

Multimodal Large Language Models (MLLMs) have demonstrated impressive
capabilities across various tasks, but still struggle with complex mathematical
reasoning. Existing research primarily focuses on dataset construction and
method optimization, often overlooking two critical aspects: comprehensive
knowledge-driven design and model-centric data space modeling. In this paper,
we introduce We-Math 2.0, a unified system that integrates a structured
mathematical knowledge system, model-centric data space modeling, and a
reinforcement learning (RL)-based training paradigm to comprehensively enhance
the mathematical reasoning abilities of MLLMs. The key contributions of We-Math
2.0 are fourfold: (1) MathBook Knowledge System: We construct a five-level
hierarchical system encompassing 491 knowledge points and 1,819 fundamental
principles. (2) MathBook-Standard & Pro: We develop MathBook-Standard, a
dataset that ensures broad conceptual coverage and flexibility through dual
expansion. Additionally, we define a three-dimensional difficulty space and
generate 7 progressive variants per problem to build MathBook-Pro, a
challenging dataset for robust training. (3) MathBook-RL: We propose a
two-stage RL framework comprising: (i) Cold-Start Fine-tuning, which aligns the
model with knowledge-oriented chain-of-thought reasoning; and (ii) Progressive
Alignment RL, leveraging average-reward learning and dynamic data scheduling to
achieve progressive alignment across difficulty levels. (4) MathBookEval: We
introduce a comprehensive benchmark covering all 491 knowledge points with
diverse reasoning step distributions. Experimental results show that
MathBook-RL performs competitively with existing baselines on four widely used
benchmarks and achieves strong results on MathBookEval, suggesting promising
generalization in mathematical reasoning.
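
The abstract does not say how the 7 progressive variants per problem are derived from the three-dimensional difficulty space. One reading that is consistent with the numbers is that each variant raises the difficulty along a non-empty subset of the three dimensions, which gives 2^3 - 1 = 7 combinations. The sketch below only illustrates that counting argument; the dimension names and the function `progressive_variants` are hypothetical placeholders, not names taken from the paper.

```python
from itertools import combinations

# Hypothetical placeholder names: the abstract does not name the three dimensions.
DIMENSIONS = ("dim_a", "dim_b", "dim_c")


def progressive_variants(dimensions=DIMENSIONS):
    """Return every non-empty subset of `dimensions`, smallest subsets first.

    Assumption (not stated in the abstract): a variant is defined by the set
    of dimensions along which the seed problem is made harder.
    """
    variants = []
    for size in range(1, len(dimensions) + 1):
        for subset in combinations(dimensions, size):
            variants.append(subset)
    return variants


variants = progressive_variants()
for index, raised in enumerate(variants, start=1):
    print(f"variant {index}: raise {', '.join(raised)}")
```

Under this assumption the ordering runs from single-dimension variants to the one that raises all three, which would fit the abstract's description of the variants as progressive.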
