On September 6, Zhiyuan Technology reported that last night, Alibaba launched the Preview version of its strongest model in the Qwen3 series, Qwen3-Max, which is also Alibaba’s largest model to date with over 1 trillion parameters. This model is now available on the Alibaba Bailian platform and can be used for free in the Tongyi Qianwen application and Qwen Chat.
According to the Bailian platform, Qwen3-Max-Preview shows significant improvements over the Qwen2.5 series in overall general capabilities, including Chinese and English text understanding, complex instruction following, subjective open task capabilities, multilingual capabilities, and tool invocation abilities; the model has fewer knowledge hallucinations.
Just yesterday, the official Qwen X account teased the upcoming launch of the most powerful and intelligent member of the Qwen3 family. Today, the model has officially gone live, and its evaluation results have been released.
It is reported that Qwen3-Max-Preview surpassed Claude-Opus 4 (Non-Thinking), as well as Kimi-K2, DeepSeek-V3.1, and Alibaba’s previous best open-source model, Qwen3-235B-A22B-Instruct-2507, in evaluations of general knowledge (SuperGPQA), mathematical reasoning (AIME25), programming (LiveCodeBench v6), human preference alignment (Arena-Hard v2), and comprehensive capability assessment (LiveBench).
On the AI model aggregation platform OpenRouter, the introduction of Qwen3-Max mentions significant improvements in reasoning, instruction following, multilingual support, and coverage of long-tail knowledge; it also provides higher accuracy in mathematical, programming, logical, and scientific tasks. The model supports over 100 languages, has enhanced translation and common sense reasoning capabilities, and is optimized for retrieval-augmented generation (RAG) and tool invocation, but does not include a specialized “thinking” mode.
Zhiyuan Technology quickly experienced Qwen3-Max-Preview on the Tongyi Qianwen web platform and found that the model performed excellently in text understanding as well as in mathematical and programming abilities, with very fast response times.
First, we asked Qwen3-Max-Preview to generate a small ball collision simulator, and we input the prompt:
“There are two small balls inside a circle, one black and one white. The white ball falls randomly and bounces off the boundaries, while a second white ball is generated at a random position. The black ball also bounces off the boundaries and grows slightly when it collides with the white ball. Please simulate this.”
Qwen3-Max-Preview quickly produced this program, simulating the motion of the two types of balls, ultimately resulting in the black ball expanding to consume the white ball.
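For readers curious about what such a simulator involves, the core logic can be sketched in a few lines of Python. This is not the program Qwen3-Max-Preview generated; it is a minimal, headless illustration under our own assumptions. The names `Ball`, `step`, and `simulate`, the circle radius, the gravity value, and the growth amount are all hypothetical choices made for this sketch.

```python
import math
import random

R = 100.0  # radius of the enclosing circle (arbitrary units, assumed)

class Ball:
    def __init__(self, x, y, vx, vy, r):
        self.x, self.y, self.vx, self.vy, self.r = x, y, vx, vy, r

def step(ball, dt=0.1, gravity=0.0):
    """Advance one ball by dt and reflect it off the circular boundary."""
    ball.vy += gravity * dt  # +y is treated as "down" in this sketch
    ball.x += ball.vx * dt
    ball.y += ball.vy * dt
    dist = math.hypot(ball.x, ball.y)
    limit = R - ball.r  # farthest the ball's center may be from the origin
    if dist > limit:
        nx, ny = ball.x / dist, ball.y / dist  # outward unit normal
        dot = ball.vx * nx + ball.vy * ny
        if dot > 0:  # reflect only when moving outward: v' = v - 2(v.n)n
            ball.vx -= 2 * dot * nx
            ball.vy -= 2 * dot * ny
        ball.x, ball.y = nx * limit, ny * limit  # push back inside

def simulate(steps=2000, seed=0):
    """Run one black and one white ball; the black ball grows on contact."""
    rng = random.Random(seed)
    black = Ball(rng.uniform(-30, 30), rng.uniform(-30, 30), 20.0, -15.0, 10.0)
    white = Ball(rng.uniform(-30, 30), rng.uniform(-30, 30), -10.0, 0.0, 5.0)
    for _ in range(steps):
        gap = math.hypot(black.x - white.x, black.y - white.y)
        if gap <= black.r + white.r:
            # Grow slightly on contact, capped so the ball still fits.
            black.r = min(black.r + 0.05, R / 2)
        step(black)
        step(white, gravity=9.8)  # the white ball "falls"
    return black, white

if __name__ == "__main__":
    black, white = simulate()
    print(f"black radius after run: {black.r:.2f}")
```

The sketch leaves out rendering, the spawning of additional white balls, and ball-to-ball bouncing, all of which a full simulator would need.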
We then increased the difficulty, asking Qwen3-Max-Preview to simulate two competing populations, one built around strength and one around speed, and to keep refining the simulator over several rounds. We found that Qwen3-Max-Preview could produce rapid and accurate simulations, completing within seconds tasks that might take a seasoned programmer half a day.
We input the prompt: “There are two populations, population A focuses on developing strength, while population B focuses on developing speed. Please simulate the interaction between these two populations and provide an explanation.”
As shown in the figure below, even though the prompt was very vague, Qwen3-Max-Preview still understood my intention and provided a relatively accurate simulation.
In the above simulation, I noticed that the speed-focused population was eliminated too quickly, so I further hoped they could have an “escape” ability. I input the prompt: “The speed-focused population is eliminated too quickly; each individual should possess some ability to evade danger.”
Subsequently, Qwen3-Max-Preview output the following “Strength and Speed Population Simulation (Enhanced Version)” and accurately simulated small balls with evasion capabilities, resulting in a situation where “no one can eliminate anyone.”
A population that can only run away and never retaliate would eventually be eliminated. Thus, I requested that the speed-focused population have cooperative offensive capabilities, inputting the prompt:
“When the speed-focused population unites, they can eliminate individual strength-focused individuals; please add this ability and simulate again.”
Qwen3-Max-Preview was still able to implement this well, outputting the “Strength and Speed Population Simulation (Cooperative Version)” and simulating that the small green balls could resist the red balls after gaining cooperative abilities, but both sides remained in a stalemate.
As the simulation progressed, both populations dwindled, so we further asked Qwen3-Max-Preview to provide them with the ability to reproduce, inputting the prompt:
“After they eliminate individuals from the opposing side, they can accumulate nutrients and reproduce themselves; continue the simulation.”
Thus, Qwen3-Max-Preview output the “Strength and Speed Population Simulation (Resource and Reproduction Version)”; from the simulation, we can see that both types of balls began to split, and under these circumstances, the red balls could no longer compete with the green balls.
Then, I input:
“The strength-focused population is too weak; they cannot catch their opponents. Please provide them with team collaboration abilities to encircle speed-focused individuals.”
Qwen3-Max-Preview output the “Strength and Speed Population Simulation (Bidirectional Cooperative Version)”; the small green and red balls formed a tendency to cluster, resulting in a situation of “group brawling and siege” on both sides.
Through this interesting little experiment, we found that Qwen3-Max-Preview can smoothly understand user intent even when the prompt is quite vague.
In particular, expressions such as “evade danger,” “unity,” “cooperation,” and “reproduction” are relatively abstract, and their corresponding actual meanings are complex, involving many adjustable parameters. Yet, Qwen3-Max-Preview accurately understood the semantics and underlying logic within seconds and completed the programming for the simulation experiment, showcasing its outstanding capabilities in complex reasoning, instruction execution, mathematics, and programming.
According to the Bailian platform, in terms of pricing, Qwen3-Max-Preview supports 256k context and adopts a tiered pricing model based on the number of input tokens:
Input 0-32k tokens price: 0.006 yuan per thousand input tokens, 0.024 yuan per thousand output tokens.
Input 32k-128k tokens price: 0.01 yuan per thousand input tokens, 0.04 yuan per thousand output tokens.
Input 128k-252k tokens price: 0.015 yuan per thousand input tokens, 0.06 yuan per thousand output tokens.
In comparison, Qwen-Max-0919 is priced at 0.02 yuan per thousand input tokens and 0.06 yuan per thousand output tokens; Qwen3-Max-Preview offers a finer-grained tiered pricing structure, with higher performance at a more affordable price.
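To make the tiers concrete, the following Python sketch estimates the cost of a single request from the prices quoted above. It rests on several assumptions of ours rather than on confirmed billing rules: that "k" means 1,000 tokens, that the tier is selected by the request's input length and applied to the whole request, that tier upper bounds are inclusive, and that the third tier's output price is per thousand tokens like the other two. The function names are hypothetical.

```python
# Prices in yuan per 1,000 tokens, as quoted from the Bailian platform.
# Each entry: (assumed inclusive upper bound on input tokens,
#              input price, output price).
TIERS = [
    (32_000, 0.006, 0.024),
    (128_000, 0.01, 0.04),
    (252_000, 0.015, 0.06),
]

def qwen3_max_preview_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in yuan; the tier is chosen by input length (assumed)."""
    for upper, in_price, out_price in TIERS:
        if input_tokens <= upper:
            return (input_tokens * in_price + output_tokens * out_price) / 1000
    raise ValueError("input exceeds the largest listed tier")

def qwen_max_0919_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in yuan at the flat Qwen-Max-0919 prices quoted above."""
    return (input_tokens * 0.02 + output_tokens * 0.06) / 1000

if __name__ == "__main__":
    # Example: 10,000 input tokens and 2,000 output tokens.
    print(f"Qwen3-Max-Preview: {qwen3_max_preview_cost(10_000, 2_000):.3f} yuan")
    print(f"Qwen-Max-0919:     {qwen_max_0919_cost(10_000, 2_000):.3f} yuan")
```

Under these assumptions, a short request of this size costs roughly a third as much on Qwen3-Max-Preview as on Qwen-Max-0919, while requests in the highest tier pay the same output price as the older model.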
Experience address:
https://chat.qwen.ai
Alibaba Cloud Bailian API Service:
https://bailian.console.aliyun.com/?tab=model#/model-market
Conclusion: The Super Large Qwen3 Model Proves the Effect of Scaled Expansion
Breakthroughs at the model layer are becoming Alibaba’s first trump card in its AI transformation. In internal testing and early user evaluations, Qwen3-Max-Preview has demonstrated a broader knowledge base and superior conversational abilities, with stronger performance in Agent tasks and instruction following.
Tongyi Qianwen’s open-source and closed-source large models already represent a new high point for China’s large model technology. Qwen3-Max-Preview sets a new parameter-count record for Alibaba’s large models and, with its stronger performance, attempts to show that scaling works: larger models possess stronger capabilities.
Source: Bailian Platform, X Platform