Why does ChatGPT 'hallucinate'? OpenAI blames testing methods

OpenAI says AI “hallucinations” happen not because chatbots lie, but due to flawed testing methods that reward guessing over honesty. The company argues future benchmarks must penalize confident mistakes and reward uncertainty to make AI more reliable.

By Storyboard18 | Sep 8, 2025 12:19 PM

When ChatGPT or similar tools make up facts with confidence, it’s not because they’re “lying” but because of how they’ve been trained and tested, OpenAI has revealed. The company says fixing artificial intelligence hallucinations may require rethinking how AI performance is measured, not just how models are built.

Hallucinations, in AI terms, occur when a chatbot generates answers that sound convincing but are factually incorrect. In one example, researchers found the system invented details about a scientist’s PhD dissertation and even gave the wrong birthday. The problem, OpenAI argues, comes less from flawed memory and more from incentives baked into evaluation.

Most current benchmarks reward a correct answer but treat an “I don’t know” response as failure. This encourages models to “guess,” much like students taking a multiple-choice test. Over time, AI learns that sounding confident, even when wrong, is better than admitting uncertainty.
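To see why guessing wins under that kind of grading, consider a simple expected-score calculation. The sketch below is illustrative rather than OpenAI’s benchmark code; it assumes a scheme where a correct answer earns one point and both a wrong answer and an “I don’t know” earn zero.

```python
# Illustrative sketch (not OpenAI's benchmark code): expected score under
# accuracy-only grading, where a correct answer earns 1 point and a wrong
# answer or "I don't know" earns 0.

def expected_score_accuracy_only(p_correct: float) -> float:
    """Expected score if the model answers with probability p_correct of being right."""
    return p_correct * 1.0 + (1.0 - p_correct) * 0.0

ABSTAIN_SCORE = 0.0  # "I don't know" is graded the same as a wrong answer

for p in (0.1, 0.3, 0.5):
    print(f"confidence {p:.0%}: guess = {expected_score_accuracy_only(p):.2f}, "
          f"abstain = {ABSTAIN_SCORE:.2f}")

# Even a 10% long-shot guess scores better in expectation than abstaining,
# so a model optimized against this metric learns to guess confidently.
```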

“Instead of rewarding only accuracy, tests should penalize confident mistakes more than honest admissions of uncertainty,” OpenAI suggested in its latest research. In short, honesty should count more than bold but wrong answers.
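One way to encode that suggestion is to make a confident wrong answer cost points while an admission of uncertainty costs nothing. The point values below are assumptions chosen for illustration, not a scoring rule OpenAI has published.

```python
# Illustrative penalized scoring rule (point values are assumptions, not a
# scheme OpenAI has published): correct = +1, wrong = -1, "I don't know" = 0.

def expected_score_penalized(p_correct: float, wrong_penalty: float = 1.0) -> float:
    """Expected score if the model answers with probability p_correct of being right."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    score = expected_score_penalized(p)
    decision = "answer" if score > 0 else "say 'I don't know'"
    print(f"confidence {p:.0%}: expected score {score:+.2f} -> {decision}")

# Under this rule, answering only pays off when the model is more likely
# right than wrong, so admitting uncertainty becomes the rational choice
# for low-confidence questions.
```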

The way large models are trained also plays a role. They learn by predicting the “next word” in billions of sentences, which works well for grammar and common facts but breaks down for rare or specific details, such as birthdays or niche research topics.
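A toy next-word predictor built from raw counts shows why this objective handles frequent patterns well but struggles with one-off facts. The tiny corpus and the counting approach below are illustrative stand-ins; real models learn statistical patterns at vastly larger scale.

```python
from collections import Counter, defaultdict

# Toy next-word predictor built from bigram counts over a tiny, made-up corpus.
# An illustrative stand-in for next-word training at scale, not how GPT models
# are actually implemented.
corpus = (
    "the sky is blue . the grass is green . the sky is blue . "
    "the scientist was born on march third ."
).split()

next_word_counts: defaultdict[str, Counter] = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    next_word_counts[word][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequently observed word following `word`."""
    counts = next_word_counts[word]
    return counts.most_common(1)[0][0] if counts else "<no data>"

print(predict_next("sky"))        # "is"  -- a frequent pattern, easy to get right
print(predict_next("scientist"))  # "was" -- grammar is also well covered
print(predict_next("on"))         # "march" -- but the birthday appeared only once,
                                  # so this prediction rests on a single observation
```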

Interestingly, OpenAI noted that smaller models sometimes manage uncertainty better than their larger counterparts, avoiding risky guesses. This shows hallucinations are not an unfixable glitch but a matter of designing better guardrails.
