Dark humor in online memes poses unique challenges due to its reliance on
implicit, sensitive, and culturally contextual cues. To address the lack of
resources and methods for detecting dark humor in multimodal content, we
introduce a novel dataset of 4,379 Reddit memes annotated for dark humor,
target category (gender, mental health, violence, race, disability, and other),
and a three-level intensity rating (mild, moderate, severe). Building on this
resource, we propose a reasoning-augmented framework that first generates
structured explanations for each meme using a Large Vision-Language Model
(VLM). Through a Role-Reversal Self-Loop, the VLM adopts the author’s perspective and iteratively refines its explanations until they are complete and aligned with the author’s intent.
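As a concrete illustration, the following is a minimal Python sketch of one way such a refinement loop could be implemented; the prompts, the query_vlm helper, and the fixed iteration budget are hypothetical assumptions, not the paper’s exact procedure.

```python
# Illustrative Role-Reversal Self-Loop. `query_vlm(image, prompt)` is a
# hypothetical wrapper around any instruction-tuned VLM; the prompts and
# the refinement budget are assumptions, not the paper's exact setup.

def role_reversal_self_loop(image, ocr_text, query_vlm, max_rounds=3):
    # Initial structured explanation of the meme.
    explanation = query_vlm(
        image,
        f"Meme text: {ocr_text}\n"
        "Explain the humor, the target, and any dark or sensitive cues.",
    )
    for _ in range(max_rounds):
        # Role reversal: critique the explanation from the author's side.
        critique = query_vlm(
            image,
            f"You are this meme's author. Candidate explanation: {explanation}\n"
            "If it fully captures your intended meaning, reply COMPLETE; "
            "otherwise list what is missing or misread.",
        )
        if "COMPLETE" in critique:
            break  # judged complete and aligned with the author's intent
        # Self-refinement: revise the explanation using the critique.
        explanation = query_vlm(
            image,
            f"Revise the explanation below to address the author's feedback.\n"
            f"Explanation: {explanation}\nFeedback: {critique}",
        )
    return explanation
```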
We then extract textual features from both the OCR transcript and the self-refined reasoning via a text encoder, while visual features are obtained using a vision transformer.
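The three streams could be encoded, for instance, with off-the-shelf Hugging Face models; the checkpoint choices below (RoBERTa as the text encoder, ViT-Base as the vision transformer) are assumptions, since the exact encoders are not named here.

```python
# Minimal sketch of the three feature streams; the checkpoint names and
# the use of token-level features are assumptions, not the paper's setup.
import torch
from transformers import AutoTokenizer, AutoModel, ViTImageProcessor, ViTModel

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
text_encoder = AutoModel.from_pretrained("roberta-base")
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
vision_encoder = ViTModel.from_pretrained("google/vit-base-patch16-224")

@torch.no_grad()
def encode_streams(image, ocr_text, reasoning):
    def text_feats(s):
        batch = tokenizer(s, return_tensors="pt", truncation=True)
        return text_encoder(**batch).last_hidden_state  # token-level features

    t = text_feats(ocr_text)    # text stream (OCR transcript)
    r = text_feats(reasoning)   # reasoning stream (self-refined explanation)
    pixels = processor(images=image, return_tensors="pt").pixel_values
    v = vision_encoder(pixel_values=pixels).last_hidden_state  # image stream
    return t, v, r              # each: (1, seq_len, 768)
```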
A Tri-stream Cross-Reasoning Network (TCRNet) fuses these three streams (text, image, and reasoning) via pairwise attention mechanisms, producing a unified representation for classification.
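A sketch of this fusion is given below: each stream attends to the other two via cross-attention, and the pooled results are concatenated into the unified representation. The use of nn.MultiheadAttention, the hidden size of 768, and the mean pooling are illustrative assumptions about TCRNet rather than its published architecture.

```python
# Hedged sketch of tri-stream fusion with pairwise cross-attention.
# Layer choices, pooling, and head count are assumptions, not TCRNet's
# published design.
import torch
import torch.nn as nn

class TriStreamFusion(nn.Module):
    def __init__(self, dim=768, heads=8, num_classes=2):
        super().__init__()
        # One cross-attention block per ordered stream pair, e.g. "t_v"
        # lets the text stream attend to the image stream.
        self.cross = nn.ModuleDict({
            f"{q}_{k}": nn.MultiheadAttention(dim, heads, batch_first=True)
            for q in ("t", "v", "r") for k in ("t", "v", "r") if q != k
        })
        self.classifier = nn.Linear(dim * 3, num_classes)

    def forward(self, t, v, r):
        # Each input: (batch, seq_len, dim) token sequence for one stream.
        feats = {"t": t, "v": v, "r": r}
        fused = []
        for q in ("t", "v", "r"):
            # Let stream q attend to the other two streams, then average.
            att = [self.cross[f"{q}_{k}"](feats[q], feats[k], feats[k])[0]
                   for k in ("t", "v", "r") if k != q]
            fused.append(torch.stack(att).mean(dim=0).mean(dim=1))  # pool tokens
        # Concatenate the three pooled streams into a unified representation.
        return self.classifier(torch.cat(fused, dim=-1))
```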
Experimental results demonstrate that our approach outperforms strong baselines across three tasks: dark humor detection, target identification, and intensity prediction.
The dataset, annotations, and code are publicly released to facilitate further research in multimodal humor understanding and content moderation, and are available at: https://github.com/Sai-Kartheek-Reddy/D-Humor-Dark-Humor-Understanding-via-Multimodal-Open-ended-Reasoning