AI search tools fail on 60% of news queries, Perplexity best performer

Imagine a vivid dream…of a fake URL.

Generative AI search tools are proving alarmingly unreliable for news queries, according to new research from Columbia Journalism Review’s Tow Center. With roughly 25% of Americans now turning to AI models instead of traditional search engines, the finding that these tools deliver incorrect information more than 60% of the time raises significant concerns about public access to accurate information, with unintended consequences for both news publishers and information consumers.

The big picture: A new Columbia Journalism Review study found generative AI search tools incorrectly answer over 60 percent of news-related queries, raising serious concerns as Americans increasingly adopt these tools as search engine alternatives.

Key details: Researchers tested eight AI-driven search tools by giving them direct excerpts from news articles and asking them to identify each article’s headline, publisher, publication date, and URL.

  • The study included 1,600 queries across eight different generative search platforms.
  • Instead of declining to respond when information was unavailable, most tools confabulated plausible-sounding but incorrect answers.

Important stats: Error rates varied dramatically among the platforms tested in the research.

  • Perplexity provided incorrect information in 37 percent of queries.
  • ChatGPT Search incorrectly identified 67 percent (134 out of 200) of articles.
  • Grok 3 demonstrated the worst performance, with a 94 percent error rate.
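
The percentages above follow from simple counts, and the arithmetic can be checked with a minimal sketch. It assumes the 1,600 queries were split evenly across the eight tools, which is consistent with the "134 out of 200" figure reported for ChatGPT Search but is not stated outright in this summary. The function name `error_rate` is illustrative.

```python
def error_rate(incorrect: int, total: int) -> float:
    """Return the share of incorrect answers as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * incorrect / total


# Figures reported in the study summary.
TOTAL_QUERIES = 1600
TOOLS_TESTED = 8

# Assumption: queries were divided evenly among the tools.
queries_per_tool = TOTAL_QUERIES // TOOLS_TESTED

# ChatGPT Search: 134 incorrect answers out of 200 queries.
chatgpt_search_rate = error_rate(134, queries_per_tool)

print(queries_per_tool)
print(chatgpt_search_rate)
```

Under that assumption, each tool received 200 queries, and 134 incorrect answers works out to the 67 percent cited for ChatGPT Search.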

Behind the numbers: Surprisingly, premium paid versions of these AI search tools often performed worse than their free counterparts.

  • Perplexity Pro ($20/month) and Grok 3’s premium service ($40/month) delivered incorrect responses more confidently than free versions.
  • More than half of citations from Google’s Gemini and Grok 3 led to fabricated or broken URLs.

Why this matters: The research identified technical failures that could harm both publishers and information consumers.

  • Evidence suggests some AI tools ignored Robots Exclusion Protocol (robots.txt) settings, which publishers use to tell crawlers not to access their content.
  • URL fabrication was common, creating an illusion of credibility through citations that don’t actually exist.
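
For readers unfamiliar with the Robots Exclusion Protocol, the sketch below shows how a well-behaved crawler could check a publisher's rules before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt content, the crawler name `ExampleAIBot`, and the `news.example.com` URL are hypothetical and do not come from the study.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one AI crawler is blocked, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


article_url = "https://news.example.com/story"
print(is_allowed(ROBOTS_TXT, "ExampleAIBot", article_url))
print(is_allowed(ROBOTS_TXT, "OtherBot", article_url))
```

The protocol is advisory: nothing technically stops a crawler from skipping this check, which is why the study's finding that some tools appeared to ignore these settings matters to publishers.
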
AI search engines give incorrect answers at an alarming 60% rate, study says
