The result? It achieved a gold medal-level score.
Thang, 38, is a senior researcher in the artificial intelligence research division at Google DeepMind. Over nearly a decade at the company he has contributed to several AI models, including Gemini.
Deep Think, an advanced reasoning system integrated into Gemini, was developed by a team comprising dozens of experts led by Thang. It is designed to solve complex problems like those in the IMO. Like human contestants, it completed two 4.5-hour tests without internet access or external tools. The system scored 35 out of 42 points, earning full marks on the first five of the six problems, as graded by IMO judges. At this year’s competition, only 67 out of 630 contestants reached 35 points or higher, the gold medal threshold.
-What motivated your group to build an AI capable of solving mathematics problems?
The idea came naturally. At the end of 2022 ChatGPT was launched. In early 2023 we released Bard after 100 days of development; it was renamed Gemini in 2024.
Both ChatGPT and Gemini excelled at natural language, like writing poetry or love letters, but they mainly imitated human expression. We then asked: why not give Gemini higher reasoning ability? Mathematics was the first area that came to mind.
I have loved mathematics since childhood and once competed at the national level in high school. In 2024 I founded a new team at Google called “Superintelligence Reasoning.” The aim was to enable AI to reason deeply and reduce “hallucinations” – when models generate incorrect but seemingly plausible information.
Mathematics was chosen to force AI to reason step by step with rigor. Google already had AlphaGeometry 2 performing at gold medalist level and AlphaProof at silver level. But both combined neural networks with logic systems, meaning humans still had to be involved at certain stages.
Deep Think, in contrast, handled nearly the entire process independently using parallel reasoning. Unlike conventional linear reasoning models, it could explore and combine multiple solution paths simultaneously before producing an answer. It was trained with new reinforcement learning techniques on datasets involving multi-step reasoning, problem-solving and theorem proving.
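As a rough illustration of the parallel-reasoning idea described above, and not of Deep Think's actual method, the Python sketch below runs several solution paths for a toy problem at the same time and keeps the answer most of them agree on. Every function name here is hypothetical, the toy problem (summing 1 to n) is chosen only for brevity, and majority voting is an assumed way of combining paths.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


# Hypothetical "solution paths" for a toy problem: compute 1 + 2 + ... + n.
def path_closed_form(n):
    # Closed-form formula n(n + 1) / 2.
    return n * (n + 1) // 2


def path_direct_sum(n):
    # Brute-force addition of every term.
    return sum(range(1, n + 1))


def path_pairing(n):
    # Pair k with n + 1 - k; add the middle term when n is odd.
    pairs, has_middle = divmod(n, 2)
    total = pairs * (n + 1)
    if has_middle:
        total += (n + 1) // 2
    return total


def path_off_by_one(n):
    # Deliberately flawed path: forgets the last term.
    return sum(range(1, n))


PATHS = [path_off_by_one, path_closed_form, path_direct_sum, path_pairing]


def solve_in_parallel(n, paths=PATHS):
    """Run every path concurrently and return the most common answer.

    Returns (answer, votes, candidates). Majority voting is an assumption
    made for this sketch, not a description of any real system.
    """
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        candidates = list(pool.map(lambda path: path(n), paths))
    answer, votes = Counter(candidates).most_common(1)[0]
    return answer, votes, candidates


def solve_linearly(n, paths=PATHS):
    # Linear baseline: commit to the first path without cross-checking.
    return paths[0](n)


if __name__ == "__main__":
    answer, votes, candidates = solve_in_parallel(100)
    print(answer, votes, candidates)
    print(solve_linearly(100))
```

In this toy setup the linear baseline commits to the flawed first path, while the parallel version outvotes it because the three sound paths agree.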
Luong Minh Thang, 38, is a senior researcher in the artificial intelligence research division at Google DeepMind. Photo by VnExpress/ Bao Lam
-What kind of data was used for Deep Think?
We gathered a very large dataset of problems with multi-step solutions, including ones from global math competitions. Earlier models were trained mainly on final answers, but that does not reflect rigorous reasoning.
The data volume cannot be disclosed, but it was vast. We also worked with about 30 former medalists from mathematics competitions worldwide, including individuals in Vietnam such as Tran Nam Dung, vice principal of the High School for the Gifted at the Vietnam National University in HCMC.
-What challenges or surprises did your team encounter as deadlines approached?
Just before IMO 2025 we went through what I call “four magical days” with Deep Think. We already had a strong system, but we wanted to make it even better.
That required changing configurations and using much more computing power, which Google then lacked. I persuaded three other major teams and senior leaders, including DeepMind CEO Demis Hassabis, by presenting evidence that Deep Think would deliver stronger performance. They immediately approved more resources.
Still, we had just one week to implement the new training algorithm. For four days in July we worked under extreme pressure, thinking “YOLO – You Only Live Once.” In earlier models, formulas were processed separately. In the new version, four core formulas were grouped together. If it had failed, everything could have collapsed.
But the results exceeded expectations. Deep Think not only solved IMO 2025 problems at gold level but also showed abilities in self-programming and advanced knowledge retrieval, capabilities that many other models either lack or have not achieved at comparable strength.
-OpenAI also claimed its AI reached gold medal level at IMO 2025 with a similar score. What was the difference?
OpenAI announced its result on July 19, and we announced ours two days later. In fact, Deep Think finished on July 14, but we waited until July 21 to respect the IMO 2025 competition, which had ended a day earlier in Australia. OpenAI said its model achieved gold level but did not publish the reasoning process, nor was its result graded by IMO judges.
Deep Think, however, displayed its entire reasoning step by step and was officially graded. More importantly, by then Google’s model had already been publicly available for two weeks.
The Google team behind the development of the Deep Think AI system. Photo courtesy of Luong Minh Thang
-If Deep Think were a student, how would you describe it?
Months ago it was like a diligent student, carefully checking every possibility before giving an answer. Now it is a creative student, coming up with new methods that surprise even us.
-Some worry that specialized models for writing or solving math may make students lazy. What is your view?
That is a real concern. Any technology can be a double-edged sword. If students rely on AI too much, they may lose critical thinking. But creative students can use AI to inspire ideas they might not think of on their own.
Vietnam’s education also needs change. In the AI era, knowledge becomes outdated within months. We must adopt flexible teaching approaches. I see young Vietnamese as unpolished gems. For example, many have strong mathematical foundations but lack long-term mentorship. When I first went to Singapore, I thought I was creative, but in reality, I was small. In the U.S., I felt even smaller and realized how important dreams, environments and mentors are.
To support young Vietnamese, I co-founded the nonprofit New Turing Institute with two Silicon Valley colleagues. Its goal is to nurture and inspire the next generation of AI talent in Southeast Asia through training, funding and competitions. Our motto is “Zero to Hero” – turning someone from knowing nothing into someone who can master knowledge and pass it on. We also want to bring leading global experts to Vietnam to train and inspire local youth.
-What are your predictions for AI and the fast-growing field of humanoid robots in the next decade, particularly for young people planning their careers?
The future is hard to predict because technology evolves so fast. I once thought it would take a long time for AI to solve IMO problems, but it has already happened.
One certainty is that AI will make major contributions to science, medicine and research. With advanced reasoning, AI can discover new mathematical laws, new physics theories and new medicines.
As for humanoid robots, I have some concerns because they are not just algorithms but physical machines. But I lean toward optimism.