Why Reddit is suing Perplexity and others over industrial-scale content scraping

Content scraping is the process of using automated tools to extract data from websites, often in bulk.

By Storyboard18 | Oct 23, 2025 1:36 PM

Reddit has taken legal action against artificial intelligence startup Perplexity AI and three other companies (Oxylabs, AWMProxy, and SerpApi), accusing them of unauthorised data scraping to train AI systems. Filed in a New York federal court, the lawsuit alleges that these companies bypassed Reddit’s protections to extract vast amounts of user-generated content from thousands of subreddit communities, conduct that Reddit describes as part of an “industrial-scale data laundering economy.” The case highlights the tension between AI innovation and content ownership.

Reddit argues that its content is highly sought after by AI developers because it provides high-quality human-generated text, which is critical for training AI systems like Perplexity’s answer engine. According to Reddit’s chief legal officer, Ben Lee, this “arms race” for content has fuelled a market where data is harvested on a massive scale, often without the consent of content creators.

This case is the latest in a growing wave of disputes between content owners and AI developers. Reddit had previously filed a similar lawsuit against Anthropic, which remains ongoing. Perplexity, in its defence, says its methods are principled and responsible. The startup rejects the allegations of unlawful scraping and asserts its commitment to openness and factual AI responses.

What is content scraping?

Content scraping is the process of using automated tools to extract data from websites, often in bulk. While it can have legitimate applications — such as price comparison or search indexing — it can also be exploited to steal content, violate copyrights, or harvest personal information. For platforms like Reddit, scraping can divert traffic, harm server performance, and undermine the rights of creators whose posts are used without permission.

Reddit’s lawsuit claims that Perplexity, despite receiving a cease-and-desist notice last year, increased its use of Reddit content, citing it forty times more frequently in AI-generated responses. Reddit is seeking monetary damages and a court order preventing Perplexity from using its data. The case raises important questions about how AI companies can responsibly source human-generated content.

As AI firms race to improve models, platforms like Reddit are increasingly taking a stand to protect intellectual property and the communities that create the data powering these technologies. The outcome of this lawsuit could set a precedent for how AI developers access publicly available content and respect copyright in the AI era.
