Browsing: Hugging Face
Rex-Thinker is a CoT-based model that enhances object referring by performing step-by-step reasoning over candidate objects, leading to improved interpretability…
Diffusion models have recently achieved impressive performance in various generative tasks, including object removal. However, existing image decomposition methods still…
Agentic AI systems, built on large language models (LLMs) and deployed in multi-agent configurations, are redefining intelligent autonomy, collaboration and…
A framework uses quantitative LLM judges to align existing LLM evaluation scores with human scores, improving predictive power and efficiency…
DINGO, a dynamic programming-based decoding strategy, enhances diffusion language models by enforcing structured output constraints, significantly improving performance on symbolic…
The Long CoT Collection dataset, generated by short CoT LLMs, enhances general reasoning skills and provides a strong foundation for…
LumosFlow uses LMTV-DM for key frame generation and LOF-DM followed by MotionControlNet for smooth intermediate frame interpolation, ensuring temporally coherent…
High-quality datasets are fundamental to training and evaluating machine learning models, yet their creation, especially with accurate human annotations, remains a significant…
Inspired by the in-context learning mechanism of large language models (LLMs), a new paradigm of generalizable visual prompt-based image editing…
ActiveKD integrates active learning with knowledge distillation using large vision-language models to efficiently select diverse, unlabeled samples for annotation. Knowledge…