The Hangzhou-based company introduced Qwen VLo, part of a series of AI services under its Qwen brand. The new model is an upgrade from Qwen2.5-VL and can now generate images from text prompts as well as from other images. It also features a technology called progressive generation, which lets users watch an image as it is being created.
“This newly upgraded model not only ‘understands’ the world but also generates high-quality recreations based on that understanding,” the company said in a blog post. “You can directly send a prompt like ‘Generate a picture of a cute cat’ to generate an image or upload an image of a cat and ask ‘Add a cap on the cat’s head’ to modify an image.”
Best known for its e-commerce operations in China, Alibaba has been charging into AI and building standalone offerings around Qwen. In February, Chief Executive Officer Eddie Wu went so far as to say the company’s “primary objective” is now artificial general intelligence, the industry’s goal of building AI systems with human-level intellectual capabilities.
With the new Qwen multimodal model, Alibaba is aiming to compete with a flurry of new visual interfaces on the market, including those from OpenAI. It also faces aggressive domestic competition from the likes of DeepSeek.
After DeepSeek stunned the industry with a powerful model it said took just a few million dollars to build, China’s technology leaders flooded the market with a rapid succession of low-cost AI services. Alibaba has updated its Qwen series at a fast clip, adding capabilities to process text, pictures, audio and video, with the efficiency to run directly on phones and laptops. It unveiled a new version of its Quark AI assistant app in March.