Browsing: Hugging Face
Motivated by scaling laws in language modeling that demonstrate how test loss scales as a power law with model and…
This work presents Prior Depth Anything, a framework that combines incomplete but precise metric information in depth measurement with relative…
Universal visual anomaly detection aims to identify anomalies from novel or unseen vision domains without additional fine-tuning, which is critical…
Unifying image understanding and generation has gained growing attention in recent research on multimodal models. Although design choices for image…
We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning. Seed1.5-VL is composed of a…
While generative artificial intelligence has advanced significantly across text, image, audio, and video domains, 3D generation remains comparatively underdeveloped due…
PAPER – REFINE-AF: A Task-Agnostic Framework to Align Language Models via Self-Generated Instructions using Reinforcement Learning from Automated Feedback AUTHORS…
LLM‑based agents have demonstrated great potential in generating and managing code within complex codebases. In this paper, we introduce WebGen-Bench,…
Recently, there has been growing interest in collecting reasoning-intensive pretraining data to improve LLMs’ complex reasoning ability. Prior approaches typically…
Retrieval-augmented generation (RAG) is a common strategy to reduce hallucinations in Large Language Models (LLMs). While reinforcement learning (RL) can…