In a bold assertion that has sent ripples through the artificial intelligence community, Sebastien Bubeck, a prominent AI researcher who recently transitioned from Microsoft to OpenAI, claimed on X (formerly Twitter) that an advanced model dubbed GPT-5-pro is capable of generating novel mathematical proofs. In a thread posted on August 20, 2025, Bubeck described an experiment in which he fed the model an open problem from a convex optimization paper, and it produced an improved bound that he verified as correct. This demonstration, if replicable on a broader scale, could mark a significant milestone in AI’s evolution toward human-like reasoning.
Bubeck, whose background includes leading generative AI efforts at Microsoft and developing the Phi series of efficient language models, detailed how GPT-5-pro tackled a specific question from a research paper on gradient descent in smooth convex optimization. The original work established that for step sizes below 1/L (where L denotes the smoothness constant), the curve of function values along the iterates remains convex, but it left open whether the same holds for larger step sizes up to 1.75/L. According to Bubeck’s account, the model advanced the bound to 1.5/L, a non-trivial improvement that he suggested would warrant its own arXiv note.
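To make the claim concrete, the following sketch restates the property in standard gradient descent notation; it is a reconstruction of the usual setup, not an excerpt from Bubeck’s thread or the paper itself.

\[
x_{k+1} \;=\; x_k - \eta\,\nabla f(x_k), \qquad f \ \text{convex and } L\text{-smooth}, \quad \eta > 0,
\]
\[
k \mapsto f(x_k) \ \text{is a convex sequence} \quad\Longleftrightarrow\quad f(x_k) - f(x_{k+1}) \;\ge\; f(x_{k+1}) - f(x_{k+2}) \ \ \text{for all } k \ge 0.
\]

In this notation, the paper’s original guarantee covers step sizes η ≤ 1/L, the open range ran up to 1.75/L, and Bubeck reports that GPT-5-pro’s argument establishes the property for η ≤ 1.5/L.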
The Shift from Microsoft to OpenAI and Its Implications for AI Research
This revelation comes amid Bubeck’s high-profile move to OpenAI, reported by The Information in October 2024, after he departed his role as vice president of generative AI research at Microsoft. At OpenAI, Bubeck has evidently gained access to cutting-edge prototypes like GPT-5-pro, which appears to build on the company’s o1 and o3 models, known for excelling in reasoning benchmarks such as GPQA and MATH. His earlier posts on X highlight enthusiasm for these advancements, including o3’s near-90% performance on challenging tasks like the ARC-AGI benchmark.
Industry insiders view this as part of a broader push toward artificial general intelligence (AGI), with Bubeck previously musing on X about whether current scaling techniques could yield AGI by compressing vast datasets like the web into emergent “minds.” However, he tempered expectations in other threads, noting that AGI might not materialize solely through existing methods.
Verifying AI’s Mathematical Prowess and Ethical Considerations
Bubeck’s experiment wasn’t without caveats; he acknowledged that a subsequent version of the paper in question, available on arXiv, had already closed the gap entirely, reaching 1.75/L through human work. Still, the model’s independent progress underscores the potential for AI to assist in theoretical fields, echoing Bubeck’s own scholarly history in optimization and adversarial robustness, as detailed on his personal website, sbubeck.com.
Critics, however, question the conclusiveness of such anecdotes. Bubeck’s own posts on X emphasize that this was a targeted test, following an earlier attempt in which the model analyzed gradient flow trajectories without a breakthrough. As GeekWire noted in covering his move, Bubeck’s work on compact models like Phi-4, which rivals larger systems such as Llama 3.3 with fewer parameters, suggests a focus on efficiency that could democratize advanced AI.
Broader Impacts on Innovation and Competition in Tech
The thread has sparked debates about AI’s role in scientific discovery, with Bubeck sharing that GPT-5-pro’s output was novel enough to stand alone, though he opted not to publish it because of the human-led update. This aligns with his past innovations, such as the Phi-1 model, which achieved strong results on coding benchmarks with relatively few parameters, as he announced on X in 2023.
For tech giants, this highlights intensifying competition. OpenAI’s recruitment of researchers like Bubeck, per reports from The Decoder, positions it to lead in reasoning-focused AI, potentially accelerating progress in mathematics and beyond. Yet, as Bubeck himself has posted, the path to AGI remains uncertain, demanding rigorous validation beyond isolated successes.
Looking Ahead: Challenges and Opportunities in AI-Driven Proofs
If models like GPT-5-pro can consistently contribute to open problems, it could transform research workflows, reducing the time humans spend on incremental proofs. Bubeck’s LinkedIn profile underscores his foundational machine learning expertise, lending credibility to his claims.
Ultimately, this episode, detailed in Bubeck’s X thread at twitter.com/SebastienBubeck/status/1958198661139009862, serves as a tantalizing glimpse into AI’s potential, urging the industry to balance hype with empirical scrutiny as capabilities advance.