Uh-oh, Google trains search AI using web content despite opt-outs

Google’s latest court testimony reveals a significant loophole in its AI training opt-out system, potentially undermining publisher control over how their content is used. This disclosure highlights growing tensions between tech giants and content creators as AI systems increasingly rely on web content for training while offering inconsistent protections for publishers trying to maintain rights over their intellectual property.

The big picture: Google’s AI training controls allow publishers to opt out of having their content used for AI development, but this protection only applies to Google DeepMind’s work, not other AI products within the company.

Key details: Eli Collins, a Google DeepMind vice president, testified in court that Google can train its search-specific AI products like AI Overviews using content from publishers who have explicitly opted out of AI training.

  • The testimony clarifies that Google’s opt-out controls only cover work done by Google DeepMind, the company’s AI research lab.
  • This creates a significant distinction between how different Google divisions handle publisher content preferences.

Why this matters: This revelation exposes a critical gap between what publishers believe they’re protecting when they opt out of AI training and what’s actually protected under Google’s current system.

  • Publishers seeking to prevent their content from being used to train AI may not realize their work could still be incorporated into Google’s search-specific AI products.
  • The distinction between different Google divisions creates a complex landscape for content creators trying to control how their intellectual property is used.

Reading between the lines: Google’s internal organizational boundaries are creating policy inconsistencies that could undermine trust with publishers and potentially attract regulatory scrutiny.

  • By maintaining separate policies for different AI initiatives within the company, Google effectively sidesteps publisher preferences because of how it is organized, not because of any technical limitation.
Google Can Train Search AI With Web Content Even After Opt-Out
