Browsing: Hugging Face
A new framework using draft models enhances approximate inference for long-context LLMs by better predicting token and key-value pair importance,…
Magistral, a scalable reinforcement learning pipeline, demonstrates that RL can enhance multimodal understanding and instruction following in large language models…
MoveGCL is a privacy-preserving framework using generative continual learning and a Mixture-of-Experts Transformer for training mobility foundation models without sharing…
Ming-Omni is a unified multimodal model with dedicated encoders and modality-specific routers that can process images, text, audio, and video,…
VerIF, a hybrid verification method combining rule-based and LLM-based approaches, enhances instruction-following RL with significant performance improvements and generalization. Reinforcement…
A causal representation learning framework identifies a concise causal structure to explain performance variations in language models across benchmarks by…
PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework
Generating aesthetic posters is more challenging than simple design images: it requires not only precise text rendering but also…
A framework for evaluating and optimizing natural language prompts in large language models is proposed, revealing correlations between prompt properties…
As automated attack techniques rapidly advance, CAPTCHAs remain a critical defense mechanism against malicious bots. However, existing CAPTCHA schemes encompass…
LaMP-Cap introduces a dataset for personalized figure caption generation using multimodal profiles to improve the quality of AI-generated captions. AI-generated…