Browsing: arXiv AI

arXiv AI

R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning

Advanced AI Editor, May 23, 2025

arXiv:2505.17005v1 Announce Type: cross Abstract: Large Language Models (LLMs) are powerful but prone to hallucinations due to static knowledge. Retrieval-Augmented…

Latent Principle Discovery for Language Model Self-Improvement

Advanced AI Editor, May 23, 2025

arXiv:2505.16927v1 Announce Type: cross Abstract: When language model (LM) users aim to improve the quality of its generations, it is…

Self-Evolving Curriculum for LLM Reasoning

Advanced AI Editor, May 23, 2025

arXiv:2505.14970v1 Announce Type: new Abstract: Reinforcement learning (RL) has proven effective for fine-tuning large language models (LLMs), significantly enhancing their…

ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges

Advanced AI Editor, May 23, 2025

arXiv:2505.15068v1 Announce Type: new Abstract: Recent progress in large language models (LLMs) has enabled substantial advances in solving mathematical problems.…

HAVA: Hybrid Approach to Value-Alignment through Reward Weighing for Reinforcement Learning

Advanced AI Editor, May 22, 2025

arXiv:2505.15011v1 Announce Type: new Abstract: Our society is governed by a set of norms which together bring about the values…

[2505.01731] Efficient Shapley Value-based Non-Uniform Pruning of Large Language Models

Advanced AI Editor, May 22, 2025

[Submitted on 3 May 2025 (v1), last revised 21 May 2025 (this version, v3)]

Cost-Efficient Continual Learning via Weight Space Consolidation

Advanced AI Editor, May 22, 2025

[Submitted on 11 Feb 2025 (v1), last revised 20 May 2025 (this version, v3)]

[2505.15754] Improving planning and MBRL with temporally-extended actions

Advanced AI Editor, May 22, 2025

[Submitted on 21 May 2025]

Second-Order Convergence in Private Stochastic Non-Convex Optimization

Advanced AI Editor, May 22, 2025

arXiv:2505.15647v1 Announce Type: cross Abstract: We investigate the problem of finding second-order stationary points (SOSP) in differentially private (DP) stochastic…

Advancing Radiology Foundation Models with Reasoning through Step-by-Step Verification

Advanced AI Editor, May 22, 2025

[Submitted on 29 Apr 2025 (v1), last revised 21 May 2025 (this version, v2)]
