
Alibaba, the company behind the Qwen AI platform, has recently lifted the lid on Qwen-Image.
The team at Qwen described Qwen-Image in a recent blog post by saying: “We are thrilled to release Qwen-Image, a 20B MMDiT image foundation model that achieves significant advances in complex text rendering and precise image editing.”
The model demonstrates strong performance in embedding multilingual text, including Chinese and English, while accurately generating visuals based on complex user instructions.
Demonstrating the platform’s capabilities
To showcase Qwen-Image’s range, the Qwen team asked the generative AI platform to create various images, each with granular instructions and complex requests.
First, the team asked Qwen-Image to create an image based on the anime art style of Hayao Miyazaki. The model successfully replicated Miyazaki’s distinct aesthetic while following the provided instructions.
After trying out a few different designs with both English and Chinese text prompts, the development team tested Qwen-Image’s ability to handle complicated, multi-step instructions. In one test, the model effectively produced bilingual outputs, embedding both English and Chinese text in a single image layout.
These early demonstrations highlighted how Qwen-Image can be used to create cartoonish art, realistic imagery, infographics, posters, and more.
Comparing Qwen-Image’s performance against other AI models
Qwen-Image was also compared directly against other companies’ models on a range of common text-rendering and image-generation benchmarks.
(Source: Qwen-Image official benchmark report)

| Benchmark | Qwen-Image | GPT Image 1 (High) | Seedream 3.0 |
| --- | --- | --- | --- |
| LongText-Bench (ZH) | 0.946 | 0.619 | 0.878 |
| LongText-Bench (EN) | 0.943 | 0.956 | 0.896 |
| Chinese Word (ZH) | 0.583 | 0.361 | 0.331 |
| TextCraft (EN) | 0.829 | 0.857 | 0.592 |
| One-IG-Bench-Test (ZH) | 0.963 | 0.650 | 0.928 |
| One-IG-Bench-Test (EN) | 0.891 | 0.857 | 0.865 |
Qwen-Image leads on four of the six benchmarks listed, including all three Chinese-language tests, but it trails GPT Image 1 on the two English text-rendering benchmarks, LongText-Bench (EN) and TextCraft (EN). Nonetheless, given Alibaba’s strong domestic focus, Qwen-Image’s top-tier Chinese text performance strengthens its appeal for users in multilingual environments.
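For readers who want to check the tally themselves, here is a minimal Python sketch that finds the top-scoring model on each benchmark. The scores are copied from the figures reported above. The names `SCORES`, `leaders`, and `win_counts` are illustrative only and do not come from the Qwen report, and the sketch assumes a higher score is better on every benchmark.

```python
from collections import Counter

# Scores copied from the benchmark figures reported above.
# Assumption: a higher score is better on every benchmark.
SCORES = {
    "LongText-Bench (ZH)": {"Qwen-Image": 0.946, "GPT Image 1 (High)": 0.619, "Seedream 3.0": 0.878},
    "LongText-Bench (EN)": {"Qwen-Image": 0.943, "GPT Image 1 (High)": 0.956, "Seedream 3.0": 0.896},
    "Chinese Word (ZH)": {"Qwen-Image": 0.583, "GPT Image 1 (High)": 0.361, "Seedream 3.0": 0.331},
    "TextCraft (EN)": {"Qwen-Image": 0.829, "GPT Image 1 (High)": 0.857, "Seedream 3.0": 0.592},
    "One-IG-Bench-Test (ZH)": {"Qwen-Image": 0.963, "GPT Image 1 (High)": 0.650, "Seedream 3.0": 0.928},
    "One-IG-Bench-Test (EN)": {"Qwen-Image": 0.891, "GPT Image 1 (High)": 0.857, "Seedream 3.0": 0.865},
}


def leaders(scores):
    """Map each benchmark to the model with the highest score."""
    return {bench: max(row, key=row.get) for bench, row in scores.items()}


def win_counts(scores):
    """Count how many benchmarks each model leads."""
    return dict(Counter(leaders(scores).values()))


if __name__ == "__main__":
    for bench, model in leaders(SCORES).items():
        print(f"{bench}: {model}")
    print(win_counts(SCORES))
```

Counting per-benchmark leaders is only one way to summarize the results. It ignores the size of each gap, so it should be read alongside the raw scores.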
Releasing Qwen-Image in a highly competitive landscape
Although the developers of Qwen-Image have demonstrated the model’s artistic capabilities, they are entering a highly competitive generative AI market. Established tools such as OpenAI’s DALL-E, Midjourney, Canva, Adobe Firefly, and Stable Diffusion currently dominate visual AI.
It remains to be seen how Qwen-Image will stack up against these established tools, particularly in regions and industries that benefit from bilingual support and open licensing.