[2411.00863] Next-Token Prediction Task Assumes Optimal Data Ordering for LLM Training in Proof Generation

[Submitted on 30 Oct 2024 (v1), last revised 3 Jul 2025 (this version, v2)]

Authors: Chenyang An, Shima Imani, Feng Yao, Chengyu Dong, Ali Abbasi, Harsh Shrivastava, Samuel Buss, Jingbo Shang, Gayathri Mahalingam, Pramod Sharma, Maurice Diesendruck

Abstract: In large language model (LLM)-based proof generation, LLMs still exhibit only modest performance on proving tasks of moderate difficulty, despite extensive training on large datasets such as ArXiv. We believe that this is partly due to the widespread presence of suboptimal ordering within the data for each proof used in training. For example, published proofs often follow a purely logical order, where each step follows from the previous steps by the deductive rules. This order is designed to facilitate verification of the proof’s soundness, rather than to help people and models learn how the proof was discovered. In proof generation, we argue that the optimal order for a training sample is one in which the relevant intermediate supervision for each proof step is always positioned to the left of that step. We call such an order the intuitively sequential order. We validate our claims using two tasks: intuitionistic propositional logic theorem-proving and digit multiplication. Our experiments verify the order effect and provide support for our explanations. We demonstrate that training is most effective when the proof is in the intuitively sequential order. Moreover, the order effect can be substantial: in the propositional logic theorem-proving task, models trained on the optimal order achieve an 11 percent higher proof success rate than models trained on the worst order. Lastly, we define a common type of order issue in advanced math proofs and find that 17.3 percent of theorems with nontrivial proofs in the first two chapters of a widely used graduate-level mathematics textbook suffer from this issue. A detailed list of those proofs is provided in the appendix.
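To make the ordering idea concrete, here is a minimal Python sketch for the digit multiplication task. The sample format, the function names `partial_products` and `make_sample`, and the order labels `sequential` and `answer_first` are all assumptions made for illustration; the abstract does not specify the dataset format the authors actually used. The sketch only shows the difference between placing the intermediate steps to the left of the answer and placing them after it.

```python
def partial_products(a: int, b: int) -> list[int]:
    """Return a times each digit of b, scaled by place value, lowest digit first.

    Assumes non-negative integers (hypothetical helper, not from the paper).
    """
    parts = []
    for place, digit_char in enumerate(reversed(str(b))):
        parts.append(a * int(digit_char) * 10 ** place)
    return parts


def make_sample(a: int, b: int, order: str) -> str:
    """Format one training sample; `order` is 'sequential' or 'answer_first'.

    The string layout is an illustrative assumption, not the paper's format.
    """
    steps = " + ".join(str(p) for p in partial_products(a, b))
    answer = str(a * b)
    prompt = f"{a} * {b} ="
    if order == "sequential":
        # Intermediate supervision appears to the left of the answer it supports.
        return f"{prompt} {steps} = {answer}"
    if order == "answer_first":
        # The answer must be predicted before its supporting steps appear.
        return f"{prompt} {answer} = {steps}"
    raise ValueError(f"unknown order: {order}")


print(make_sample(12, 34, "sequential"))    # 12 * 34 = 48 + 360 = 408
print(make_sample(12, 34, "answer_first"))  # 12 * 34 = 408 = 48 + 360
```

Under next-token prediction, a model trained on the `answer_first` string has to emit `408` with no partial products in its context, whereas the `sequential` string lets it condition on `48 + 360` first. This is one plausible reading of the order effect described in the abstract, offered as a sketch rather than a reproduction of the experiments.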

Submission history

From: Chenyang An
[v1] Wed, 30 Oct 2024 18:00:04 UTC (308 KB)
[v2] Thu, 3 Jul 2025 15:14:51 UTC (271 KB)

