Two AI systems developed by Google DeepMind made history this week, earning a silver medal-level score at the International Mathematical Olympiad (IMO). Working in tandem, the two systems, AlphaProof and AlphaGeometry 2, tackled the prestigious competition’s six challenging problems.
The combined system solved four of the six, earning 28 points out of a possible 42, just one point short of the gold threshold.
Notably, the AI achieved a perfect score on the most difficult problem in the competition, which only five human competitors were able to solve. This remarkable performance places DeepMind’s AI among the best young mathematical minds in the world.
“What it does is way beyond what a pure brute-force search would be capable of, so there’s clearly something interesting going on when it works,” said Professor Timothy Gowers, a Fields Medallist and former IMO gold medallist who assessed the AI’s answers.
Different approaches to problem solving
The two systems took distinct approaches. AlphaProof, which combines a language model with reinforcement learning, tackled two algebra problems and one number theory problem. It relies on “formal mathematics”, writing its proofs as programs in the Lean proof assistant so that every step can be machine-verified, which in turn allows the system to learn and improve.
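To give a sense of what “proofs as programs” means in practice, here is a toy theorem in Lean 4 (an illustrative sketch only, not a competition problem or DeepMind’s code). The proof is a term the compiler type-checks, so an incorrect proof simply fails to compile:

    -- Toy example of a formal, machine-checkable proof in Lean 4.
    -- The statement: addition of natural numbers is commutative.
    -- Nat.add_comm is a standard library lemma; supplying it as the
    -- proof term lets the compiler verify the theorem mechanically.
    theorem add_comm_example (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b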
AlphaGeometry 2, on the other hand, focused on the geometry question and solved it in a remarkable 16 seconds. Its solution involved a creative approach that surprised even human experts, highlighting the AI’s ability to think outside the box.
“There have been legendary examples of [computer-aided] proofs longer than Wikipedia. This was not that: we are talking about a very short, human-style proof,” added Professor Gowers.
Achievements and limitations
While the AI excelled in some areas, it struggled in others: on two of the six questions, the systems failed to make any progress. Solution times also varied widely, from a few minutes up to three days. For reference, human competitors work under a nine-hour time limit.
Professor Gowers, while acknowledging that this achievement goes “well beyond what automatic theorem provers could do before”, raised several important reservations.
“The main problem is that the program took much longer than the human competitors,” he said. “If the human competitors had been given that kind of time to solve a problem, they would undoubtedly have performed better.”
He also highlighted the human involvement required: the problems had to be manually translated into formal language before the AI could set to work.
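To give a rough sense of what that translation step involves (using a deliberately simple toy statement, nowhere near the difficulty of an actual IMO problem), an informal claim such as “the sum of two even numbers is even” might be rendered in Lean as:

    -- Informal statement: “The sum of two even natural numbers is even.”
    -- One possible formalization; real IMO problems are far harder
    -- to state, let alone prove, in this form.
    theorem even_add_even (a b : Nat)
        (ha : ∃ k, a = 2 * k) (hb : ∃ l, b = 2 * l) :
        ∃ m, a + b = 2 * m :=
      match ha, hb with
      | ⟨k, hk⟩, ⟨l, hl⟩ => ⟨k + l, by rw [hk, hl, Nat.mul_add]⟩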
“Are we close to the point where mathematicians will be redundant? It’s hard to say. I think we’re still a breakthrough or two away from that level,” Gowers concluded.
Future potential
Despite the limitations, Google DeepMind’s achievement represents a significant advance in AI’s mathematical reasoning capabilities.
Developing AI systems capable of solving complex mathematical problems could have far-reaching implications across fields from scientific research to education.