Browsing: Hugging Face
CLIPGaussians is a style transfer framework that supports text- and image-guided stylization of 2D images, videos, 3D objects, and 4D…
An evaluation of Large Reasoning Models on multilingual reasoning shows limited capability, with interventions improving readability but reducing accuracy. Recent Large…
ToMAP enhances LLM persuaders with Theory of Mind modules, improving opponent awareness and argument quality. Large language models (LLMs) have…
PatientSim generates diverse and realistic patient personas using clinical data to evaluate LLMs in medical dialogue settings. Doctor-patient consultations require…
Large Language Models (LLMs) generate functionally correct solutions but often fall short in code efficiency, a critical bottleneck for real-world…
MAGREF is a unified framework for video generation that uses masked guidance and dynamic masking for coherent multi-subject synthesis from…
Chain-of-thought (CoT) reasoning enables large language models (LLMs) to move beyond fast System-1 responses and engage in deliberative System-2 reasoning.…
Researchers propose a novel differentiable solver search algorithm that optimizes the computational efficiency and quality of diffusion models for image…
Re-ttention uses temporal redundancy in diffusion models to enable highly sparse attention in visual generation, maintaining quality with minimal computational…
ViGoRL, a vision-language model enhanced with visually grounded reinforcement learning, achieves superior performance across various visual reasoning tasks by dynamically…