Wednesday, January 14, 2026
Reddit sues Perplexity over alleged data scraping for AI training


Reddit has initiated legal proceedings against AI company Perplexity in New York federal court, citing unauthorised scraping and use of user-generated content from its platform for AI model training.

The lawsuit also names Oxylabs, AWMProxy, and SerpApi, alleging that these companies aided Perplexity’s data collection by concealing their identities and using techniques designed to mimic typical user behaviour.


Perplexity, which has built an AI-driven search service, has rejected the allegations.

In a statement posted on Reddit, the AI company asserted that it does not train models on the social media platform’s content but instead provides summaries and citations of public discussions. The company added that it is “impossible” to sign a licence agreement for this reason.

The statement further read: “A year ago, after explaining this, Reddit insisted we pay anyway, despite lawfully accessing Reddit data. Bowing to strong arm tactics just isn’t how we do business.”

Perplexity characterised the lawsuit as “a show of force in Reddit’s training data negotiations with Google and OpenAI.”

SerpApi told CNBC that it “strongly disagrees” with Reddit’s claims and will defend itself in court. CNBC did not receive responses from Oxylabs or AWMProxy.

Reddit’s legal action is part of a wider industry conflict over the use of publicly available content in training large language models.

The company previously filed a similar suit against Anthropic in June 2025. According to the complaint, Perplexity increased its referencing of Reddit content 40-fold after receiving a cease-and-desist letter from Reddit.

Reddit claims posts from its platform are a frequent source for AI-generated responses on Perplexity’s service.

With more than 100,000 communities, Reddit is a major source of publicly available user conversations.

Researchers have previously noted that Reddit’s volume and moderation make it a valuable training dataset for producing more conversational AI outputs.

The social media company has pursued data licensing strategies with enterprises such as OpenAI and Alphabet, limiting AI-related access to its data to those that have paid for specific agreements.

Earlier in 2025, Reddit COO Jen Wong told Adweek that AI licensing deals with Google and OpenAI account for close to 10% of the firm’s revenue.

In a statement provided to CNBC, Reddit chief legal officer Ben Lee said: “AI companies are locked in an arms race for quality human content,” describing the process as fuelling an “industrial-scale ‘data laundering’ economy.”



