Reddit sues AI companies as it monetizes data with $60M Google deal

Reddit has emerged as a central battleground in the AI industry’s data acquisition wars, with the platform simultaneously monetizing its content through licensing deals while pursuing legal action against unauthorized scraping. The social media company secured a $60 million agreement with Google for AI training data and has filed lawsuits against Anthropic and data-scraping companies allegedly feeding content to Perplexity AI, highlighting the growing tension over digital content ownership in the age of artificial intelligence.

What you should know: Reddit is aggressively defending its data while selectively selling access to major tech companies.

  • Google paid $60 million for legitimate access to Reddit’s content for training large language models, establishing a precedent for paid data licensing.
  • Reddit sued Anthropic, a leading AI company, this summer for training on its content without permission, demonstrating the platform’s willingness to take legal action over unauthorized AI training.
  • The company filed another lawsuit this week against data-scraping companies allegedly stealing content and selling it to Perplexity AI, an AI-powered search engine.

The legal complexity: Reddit’s latest lawsuit reveals the intricate web of data flow in AI training, focusing on indirect scraping rather than direct platform access.

  • The lawsuit doesn’t allege that Reddit itself was scraped directly by the defendants.
  • Instead, the content allegedly came from Google search results that included short summaries of Reddit posts, which then found their way into Perplexity’s search results.
  • Perplexity denies any wrongdoing in the matter.

Why this matters: These cases could establish crucial precedents for how AI companies access and use internet content for training and search applications.

  • Reddit must prove that Perplexity circumvented copyright protections by purchasing scraped content and that Reddit was harmed in the process.
  • The platform faces the challenge of explaining why Google’s free use of Reddit summaries in search results is acceptable, but Perplexity’s use is harmful.
  • Web scraping itself isn’t illegal, but using it to violate copyright protection could make scrapers liable.

The big picture: AI development is fundamentally disrupting established internet norms around content usage and fair dealing.

  • “Like all of these AI copyright lawsuits, a lot of it comes down to vibes,” the analysis notes.
  • “The internet agreed on behavioral norms. AI is now taking a sledgehammer to them.”
  • If Reddit’s case proceeds to discovery, the list of defendants could expand, potentially revealing more companies involved in unauthorized data acquisition.
Reddit’s data becomes a battleground in the AI gold rush
