When ChatGPT or similar tools make up facts with confidence, it’s not because they’re “lying” but because of how they’ve been trained and tested, OpenAI has revealed. The company says fixing artificial intelligence hallucinations may require rethinking how AI performance is measured, not just how models are built.
Hallucinations, in AI terms, occur when a chatbot generates answers that sound convincing but are factually incorrect. In one example, researchers found the system invented details about a scientist’s PhD dissertation and even gave the wrong birthday. The problem, OpenAI argues, comes less from flawed memory and more from incentives baked into evaluation.
Most current benchmarks reward a correct answer but treat an “I don’t know” response as a failure. This encourages models to guess, much like students on a multiple-choice test where blank answers earn no credit. Over time, the AI learns that sounding confident, even when wrong, pays better than admitting uncertainty.
“Instead of rewarding only accuracy, tests should penalize confident mistakes more than honest admissions of uncertainty,” OpenAI suggested in its latest research. In short, honesty should count more than bold but wrong answers.
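To make that incentive concrete, here is a rough back-of-the-envelope sketch, not OpenAI’s actual benchmark code. The 30% confidence figure and the -1 penalty are assumed values chosen purely for illustration.

```python
# Illustrative sketch: expected benchmark score for a model that is only
# 30% sure of its best answer, under two grading schemes.

def expected_score(p_correct: float, reward_correct: float,
                   penalty_wrong: float, score_abstain: float) -> dict:
    """Compare the expected score of guessing vs. abstaining ("I don't know")."""
    guess = p_correct * reward_correct + (1 - p_correct) * penalty_wrong
    return {"guess": guess, "abstain": score_abstain}

p = 0.3  # assumed: the model is only 30% confident in its best answer

# Accuracy-only grading: a wrong answer and an abstention both score 0,
# so guessing always has the higher expected score -> the model learns to bluff.
print(expected_score(p, reward_correct=1.0, penalty_wrong=0.0, score_abstain=0.0))
# guess ~ 0.3, abstain = 0.0

# Grading that penalizes confident mistakes: a wrong answer costs -1,
# so abstaining now beats a low-confidence guess -> honesty is rewarded.
print(expected_score(p, reward_correct=1.0, penalty_wrong=-1.0, score_abstain=0.0))
# guess ~ -0.4, abstain = 0.0
```

Under the first scheme, guessing is always the rational strategy for the model; under the second, it only pays off when the model is genuinely likely to be right.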
The way large models are trained also plays a role. They learn by predicting the “next word” in billions of sentences, which works well for grammar and common facts but breaks down for rare or specific details, such as birthdays or niche research topics.
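A toy example helps show why. The sketch below is a bigram-style next-word predictor over a tiny invented corpus; it is not how OpenAI’s models are built, and the names and sentences are made up for illustration.

```python
# Toy next-word predictor: counts which word follows which in a tiny corpus,
# then always emits the most common continuation.
from collections import Counter, defaultdict

corpus = (
    "the scientist was born on january 1 . "
    "the author was born on march 5 . "
    "the musician was born on january 9 ."
).split()

# Count word-to-next-word transitions.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the continuation seen most often in training."""
    return following[word].most_common(1)[0][0]

# Grammar-like patterns are learned reliably: "born" is always followed by "on".
print(predict_next("born"))  # 'on'

# But a specific fact (whose birthday?) is just the statistically common
# continuation: the predictor confidently outputs "january" for anyone,
# the kind of plausible-but-unverified detail that becomes a hallucination.
print(predict_next("on"))    # 'january'
```

The toy model never learns who was born when; it only learns what usually comes next, which is enough for fluent grammar but not for rare, specific facts.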
Interestingly, OpenAI noted that smaller models sometimes manage uncertainty better, avoiding risky guesses compared to their larger counterparts. This suggests hallucinations are not an unfixable glitch but a matter of designing better incentives and guardrails.