The reliability of large language models (LLMs) during test-time scaling is often assessed with \emph{external verifiers} or \emph{reward models} that…
Supervised fine-tuning (SFT) is the standard approach for post-training large language models (LLMs), yet it often shows limited generalization. We…
Graphical user interface (GUI) agents built on vision-language models have emerged as a promising approach to automate human-computer workflows. However,…
Recent advances in reasoning capabilities of large language models (LLMs) are largely driven by reinforcement learning (RL), yet the underlying…
Large language models (LLMs) are increasingly studied in the context of multi-turn reasoning, where models iteratively refine their outputs based…
Obtaining high-quality generations in modern LLMs has largely been framed as a selection problem: identifying a single winning generation from…
The training paradigm for large language models (LLMs) is moving from static datasets to experience-based learning, where agents acquire skills…
While recent generative models advance pixel-space video synthesis, they remain limited in producing professional educational videos, which demand disciplinary knowledge,…
Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a key ingredient for unlocking complex reasoning capabilities in large language…
Large Language Model (LLM) safety is one of the most pressing challenges for enabling wide-scale deployment. While most studies and…