Probing and Inducing Combinational Creativity in Vision-Language Models
by Yongqian Peng and 7 other authors
Abstract: The ability to combine existing concepts into novel ideas is a fundamental hallmark of human intelligence. Recent advances in Vision-Language Models (VLMs) such as GPT-4V and DALL-E 3 have sparked debate about whether their outputs reflect combinational creativity, defined by M. A. Boden (1998) as synthesizing novel ideas by combining existing concepts, or merely sophisticated pattern matching over training data. Drawing inspiration from cognitive science, we investigate the combinational creativity of VLMs through the lens of concept blending. We propose the Identification-Explanation-Implication (IEI) framework, which decomposes creative processes into three levels: identifying input spaces, extracting shared attributes, and deriving novel semantic implications. To validate this framework, we curate CreativeMashup, a high-quality dataset of 666 artist-generated visual mashups annotated according to the IEI framework. Through extensive experiments, we demonstrate that in comprehension tasks, the best VLMs surpass average human performance while falling short of expert-level understanding; in generation tasks, incorporating our IEI framework into the generation pipeline significantly enhances the creative quality of VLMs' outputs. Our findings establish both a theoretical foundation for evaluating artificial creativity and practical guidelines for improving creative generation in VLMs.
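To make the three-level decomposition concrete, here is a minimal sketch of how an IEI-style analysis could be wired into a VLM querying pipeline. The prompt wording and the `query_vlm` callable are hypothetical illustrations, not the authors' implementation; the only structure taken from the abstract is the three sequential levels (identification, explanation, implication) with each level building on the previous one.

```python
# Hypothetical sketch of an IEI-style pipeline: three sequential VLM queries,
# one per level of the framework, each conditioned on the earlier answers.
# The prompts and query_vlm interface are illustrative assumptions.

IEI_PROMPTS = {
    "identification": "List the distinct input concepts combined in this image.",
    "explanation": "What shared attributes connect these concepts, and how are they blended?",
    "implication": "What novel semantic meaning emerges from this combination?",
}

def iei_analysis(image, query_vlm):
    """Run the three IEI levels in order, feeding earlier answers forward as context."""
    context = ""
    results = {}
    for level, prompt in IEI_PROMPTS.items():
        answer = query_vlm(image=image, prompt=context + prompt)
        results[level] = answer
        # Accumulate prior answers so later levels can reason over them.
        context += f"{level.capitalize()}: {answer}\n"
    return results
```

In this reading, the abstract's generation result amounts to running such a decomposition before image synthesis, so the model commits to input concepts and shared attributes explicitly rather than blending implicitly in one step.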
Submission history
From: Yongqian Peng
[v1] Thu, 17 Apr 2025 17:38:18 UTC (38,950 KB)
[v2] Tue, 29 Apr 2025 14:51:47 UTC (38,950 KB)