In the rapidly evolving world of artificial intelligence, OpenAI’s latest model, GPT-5, was heralded as a breakthrough in reasoning and accuracy when it launched in August 2025. Yet, just months later, users and experts are grappling with persistent issues that echo the shortcomings of its predecessors. Reports of factual errors, or “hallucinations,” continue to plague the system, raising questions about whether the hype surrounding GPT-5 has outpaced its actual capabilities. Industry insiders point to real-world examples where the AI’s mistakes could lead to significant consequences, from misguided business decisions to personal embarrassments.
One such incident, detailed in a recent article by Digital Trends, highlights a traveler who nearly made an “embarrassing error” based on ChatGPT’s advice during a trip. The piece, published on October 7, 2025, underscores how even advanced models like GPT-5 can falter on seemingly straightforward queries, such as travel logistics or basic facts, leading to outputs that are confidently incorrect.
Persistent Hallucinations in Advanced Models
Internal evaluations from OpenAI itself have revealed that newer iterations, including the reasoning models that preceded GPT-5, hallucinate more frequently than older versions, a counterintuitive finding that has puzzled developers. According to a study referenced in another Digital Trends report from April 2025, models like o3 and o4-mini fabricate information at significantly higher rates than their predecessors, with no clear explanation from the company. The trend suggests that as AI systems grow more complex, their propensity for error doesn’t necessarily diminish; it can instead take subtler, harder-to-detect forms.
Researchers and users alike have documented these issues across platforms. For instance, a Reddit thread on r/ChatGPTPro dated September 4, 2025, drew more than 200 upvotes from users complaining that GPT-5 gets “basic facts wrong more than half the time,” forcing professionals to revert to traditional sources like Google or Wikipedia for reliability. Such anecdotes align with broader sentiment on social media, where posts on X (formerly Twitter) from August 2025 describe GPT-5 as “unreliable” and prone to “confidently wrong” responses, eroding trust in AI for critical tasks.
Root Causes and OpenAI’s Response
A new OpenAI study, covered by Tom’s Guide two weeks ago, attributes hallucinations to the model’s tendency to “guess” rather than admit uncertainty. The paper argues that standard training and evaluation practices score confident guesses higher than admissions of ignorance, so GPT-5 often fabricates details to maintain conversational flow. Sam Altman, OpenAI’s CEO, has defended the model, calling it a “big leap toward real scientific AI” in an India Today interview on October 6, 2025, while dismissing critics who highlight unmet promises and technical glitches.
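That incentive is easy to see with a toy calculation. The sketch below uses hypothetical numbers (a 60% knowledge rate and a 25% blind-guess success rate are illustrative assumptions, not figures from the paper) to show why a model graded purely on accuracy scores higher by always guessing than by admitting uncertainty:

```python
# Toy illustration of the incentive described in the paper: under a
# binary accuracy metric, a model that always guesses outscores one
# that abstains when unsure, even though guessing is what produces
# hallucinations. All numbers here are hypothetical.

def expected_score(p_known: float, guess_when_unsure: bool) -> float:
    """Expected accuracy for a model that knows the answer with
    probability p_known and is otherwise unsure.

    Grading is binary: a correct answer scores 1; a wrong answer and
    an "I don't know" both score 0, so abstaining is never rewarded.
    """
    p_unsure = 1.0 - p_known
    if guess_when_unsure:
        # Assume a blind guess lands on the right answer 25% of the
        # time (illustrative, e.g. four plausible candidate answers).
        return p_known + p_unsure * 0.25
    return p_known  # abstaining earns nothing on unsure questions

print("always guess:", expected_score(0.6, True))   # 0.70
print("admit unsure:", expected_score(0.6, False))  # 0.60
```

Under this grading scheme, guessing is strictly dominant, which is exactly why, per the study, models trained against such benchmarks learn to fabricate rather than abstain.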
However, not all assessments are damning. Tests reported by TechRadar in August 2025 show GPT-5 hallucinating less than its predecessor GPT-4o on certain benchmarks, though it still lags behind competitors like Google’s Gemini in consistency. Industry experts argue that these gains come at a cost in latency and compute: in medical evaluations shared by AI researchers on X, GPT-5’s accuracy stagnated even at higher “reasoning effort” settings that demand more computation.
Implications for Industry Adoption
For businesses integrating AI into workflows, these reliability issues pose substantial risks. A Unite.AI study from last week found GPT-5 hallucinating in 40% of newsroom-style queries, inventing unsubstantiated claims that could mislead journalists or analysts. This has led some, like the author of a Digital Trends piece from August 2025, to cancel subscriptions after years of use, citing diminishing returns.
OpenAI has acknowledged the problem but has shown reluctance to eliminate hallucinations entirely. As noted in a Futura-Sciences article published four days ago, the company has identified a potential fix but deems it impractical, as it might “kill” ChatGPT’s engaging qualities. A Live Science report from last week echoes this, warning that resolving the issue could render the model too cautious and less useful for creative tasks.
Looking Ahead: Balancing Innovation and Trust
As AI continues to permeate sectors like healthcare and finance, the debate over GPT-5’s flaws intensifies. User feedback on platforms like Reddit, including a July 2025 post on r/ChatGPT with over 700 votes lamenting worsening hallucinations in research summaries, suggests that without targeted improvements, adoption may stall. Prompt engineering guides from OpenAI, as shared in X posts, offer workarounds like refining inputs to reduce errors, but they place the onus on users rather than the technology.
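One widely shared pattern from those guides is to instruct the model explicitly to admit uncertainty rather than guess. Below is a minimal sketch using the openai Python SDK; the system-prompt wording and the “gpt-5” model identifier are illustrative assumptions on my part, not text from OpenAI’s published guides.

```python
# Hypothetical prompt-engineering workaround: steer the model toward
# admitting uncertainty instead of guessing. The system-prompt wording
# and the "gpt-5" model name are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {
            "role": "system",
            "content": (
                "Answer only from knowledge you are confident in. If you "
                "are unsure of a fact, say so plainly instead of guessing, "
                "and never invent citations, prices, or schedules."
            ),
        },
        {
            "role": "user",
            "content": "Which train line runs directly from Narita Airport to Shibuya?",
        },
    ],
)
print(response.choices[0].message.content)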
Ultimately, while GPT-5 represents progress, its ongoing mistakes are a reminder that true AI reliability remains elusive. Industry insiders must weigh these limitations against the model’s strengths, pushing for advancements that prioritize veracity without sacrificing versatility. As one X user pointedly noted in August 2025, trust erodes not from slowness but from confident inaccuracies, a challenge OpenAI must address to maintain its lead.