Here are the top 10 AI models in a ranking that puts ChatGPT eighth, and the results are unexpected

Prolific did not offer specific explanations for ChatGPT’s comparatively lower ranking.

By Storyboard18 | Nov 25, 2025 5:19 PM

OpenAI’s ChatGPT, the chatbot that ignited the global boom in generative AI when it launched in late 2022, has long dominated public awareness despite the rise of strong rivals including Google’s Gemini suite, xAI’s Grok, Anthropic’s Claude, Qwen, DeepSeek and Mistral. But a new study suggests that the landscape has shifted considerably.

A benchmarking assessment conducted by UK-based company Prolific has ranked ChatGPT-4.1 only eighth among leading AI models. The study uses a proprietary benchmark known as “Humaine”, which the company describes as a framework designed to evaluate AI systems through the lens of natural human interaction, rather than the highly technical datasets and reasoning tasks favoured by researchers.

According to Prolific, traditional evaluation methods often fail to reflect real-world user needs. In a blog post, the company argued that this mismatch has created a “disconnect between what gets optimised for and what people actually value.” It added that current evaluation is heavily skewed towards metrics that are meaningful to researchers but opaque to everyday users.

According to a Mint report, the company also criticised other preference-based rankings, saying that platforms relying on open voting can suffer from sample bias and disproportionately attract tech-savvy users. To counter this, Humaine incorporates automated quality monitoring to ensure participants provide thoughtful, consistent assessments.

Top 10 AI Models According to the Humaine Benchmark

Prolific’s study, published in September, produced the following ranking:

1. Gemini 2.5 Pro (Google)

2. DeepSeek v3 (DeepSeek)

3. Magistral Medium (Mistral)

4. Grok 4 (xAI)

5. Grok 3 (xAI)

6. Gemini 2.5 Flash (Google)

7. DeepSeek R1 (DeepSeek)

8. ChatGPT-4.1 (OpenAI)

9. Gemma (Google)

10. Gemini 2.0 Flash (Google)

The timing of the study is notable: it predates the release of Google’s Gemini 3 Pro and xAI’s Grok 4.1 and Grok 4.1 Thinking, meaning the leaderboard may look different if reassessed today.

What the Results Suggest

Gemini 2.5 Pro topping the list is unsurprising, given its strong performance across multiple benchmarks since launch. However, the absence of an OpenAI model from the top five — and ChatGPT ranking below competitors such as DeepSeek, Grok and Mistral — marks a striking shift in perceived capability.

Prolific did not offer specific explanations for ChatGPT’s comparatively lower ranking. However, it emphasised that Gemini 2.5 Pro consistently emerged as the strongest system across the “Overall Winner” metric, a key indicator in its evaluation framework.

As AI competition accelerates, the Humaine rankings reflect a rapidly evolving market in which user-centric performance — rather than technical supremacy alone — may increasingly shape which models lead the field.
