Paper Page - Seed-Prover: Deep And Broad Reasoning For Automated Theorem Proving

Seed-Prover, a lemma-style reasoning model using Lean, achieves high performance in formal theorem proving and automated mathematical reasoning through iterative refinement and specialized geometry support.

LLMs have demonstrated strong mathematical reasoning abilities by leveraging
reinforcement learning with long chain-of-thought, yet they continue to
struggle with theorem proving due to the lack of clear supervision signals when
solely using natural language. Dedicated domain-specific languages like Lean
provide clear supervision via formal verification of proofs, enabling effective
training through reinforcement learning. In this work, we propose
Seed-Prover, a lemma-style whole-proof reasoning model. Seed-Prover
can iteratively refine its proof based on Lean feedback, proved lemmas, and
self-summarization. To solve IMO-level contest problems, we design three
test-time inference strategies that enable both deep and broad reasoning.
Seed-Prover proves 78.1% of formalized past IMO problems, saturates MiniF2F,
and achieves over 50% on PutnamBench, outperforming the previous
state-of-the-art by a large margin. To address the lack of geometry support in
Lean, we introduce a geometry reasoning engine, Seed-Geometry, which
outperforms previous formal geometry engines. We use these two systems to
participate in IMO 2025 and fully prove 5 out of 6 problems. This work
represents a significant advancement in automated mathematical reasoning,
demonstrating the effectiveness of formal verification with long
chain-of-thought reasoning.
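
To make the idea of lemma-style proving with formal verification concrete, here is a minimal Lean 4 sketch. It is a toy illustration and not taken from the paper: the theorem names `add_self_eq_two_mul` and `double_is_even` are hypothetical, and the example assumes a recent Lean 4 toolchain in which the `omega` tactic ships with core. An intermediate lemma is stated and proved first, and the main theorem is then proved by reusing it. Lean accepts the file only if every step type-checks, which is the kind of clear pass/fail signal the abstract describes.

```lean
-- Toy illustration only; names are hypothetical and not from the paper.

-- Intermediate lemma: a natural number added to itself equals twice that number.
-- `omega` is a decision procedure for linear arithmetic over Nat and Int.
theorem add_self_eq_two_mul (n : Nat) : n + n = 2 * n := by
  omega

-- Main theorem: n + n is even. The witness is n itself,
-- and the required equation is exactly the lemma proved above.
theorem double_is_even (n : Nat) : ∃ k, n + n = 2 * k :=
  ⟨n, add_self_eq_two_mul n⟩
```

If the lemma's proof were wrong, Lean would report an error at that lemma rather than somewhere inside a long monolithic proof, which suggests why proved lemmas and compiler feedback are useful inputs for iterative refinement.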
