Wikipedia faces AI bot bandwidth crisis as scraping costs threaten site stability

Wikipedia is experiencing a bandwidth crisis driven by AI bot activity: automated scraping is sharply increasing infrastructure costs and threatening site stability. The situation highlights the growing tension between open knowledge resources and AI companies’ data-gathering practices, and it raises questions about sustainability and responsible access to publicly available information in the era of large AI models.

The big picture: Wikipedia’s infrastructure is buckling under unprecedented traffic from AI bots scraping content, with the nonprofit Wikimedia Foundation warning that automated requests have “grown exponentially.”

  • The foundation revealed that since January 2024, bandwidth used for downloading multimedia content has surged by 50%, primarily from automated programs harvesting openly licensed images for AI model training.
  • While human traffic spikes during high-interest events are manageable, the scale of bot scraping presents growing risks to site stability and significantly increases data center costs.

By the numbers: Bot traffic is disproportionately consuming Wikipedia’s resources compared to human visitors.

  • At least 65% of resource-consuming traffic comes from bots, despite bots representing only about 35% of total pageviews.
  • The scraping activity often targets less popular articles and even hits developer infrastructure, including code review platforms and bug trackers.
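To make the disproportion concrete, here is a rough back-of-the-envelope calculation. It assumes, purely for illustration, that the 65% and 35% figures can be compared directly as shares of the same request population; the function name `relative_load` is hypothetical and does not come from the foundation.

```python
def relative_load(bot_expensive_share: float, bot_pageview_share: float) -> float:
    """Estimate how much more expensive traffic a bot pageview generates
    than a human pageview, given the two shares quoted above.

    Assumption: both shares describe comparable units of traffic.
    """
    # Expensive traffic generated per unit of pageview share, for each group.
    bot_intensity = bot_expensive_share / bot_pageview_share
    human_intensity = (1 - bot_expensive_share) / (1 - bot_pageview_share)
    return bot_intensity / human_intensity


print(round(relative_load(0.65, 0.35), 2))  # prints 3.45
```

Under that assumption, a bot pageview would account for roughly three and a half times as much resource-intensive traffic as a human pageview.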

Steps being taken: The Wikimedia Foundation is implementing both immediate and long-term measures to address the unsustainable situation.

  • Site managers have already imposed case-by-case rate limiting for problematic AI crawlers, with some bots facing outright bans.
  • The foundation is developing a “Responsible Use of Infrastructure” plan that will likely require bot operators to authenticate for high-volume scraping and API use.
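The article does not describe how Wikimedia implements these limits. As a general illustration only, per-crawler rate limiting is often built on a token bucket: each client gets a budget that refills at a fixed rate, authenticated clients can be given a larger budget, and banned clients are rejected outright. The sketch below is a minimal version under those assumptions; the class names, default limits, and client identifiers are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class TokenBucket:
    rate: float      # tokens refilled per second
    capacity: float  # maximum burst size
    tokens: Optional[float] = None
    updated: Optional[float] = None

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        if self.updated is None:
            # First request: start with a full bucket.
            self.tokens, self.updated = self.capacity, now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class CrawlerLimiter:
    """Hypothetical per-client limiter with case-by-case overrides and bans."""

    def __init__(self, default_rate: float = 1.0, default_burst: float = 5.0):
        self.default = (default_rate, default_burst)
        self.overrides: Dict[str, Tuple[float, float]] = {}  # e.g. authenticated bots
        self.banned: Set[str] = set()
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        if client_id in self.banned:
            return False
        if client_id not in self.buckets:
            rate, burst = self.overrides.get(client_id, self.default)
            self.buckets[client_id] = TokenBucket(rate, burst)
        return self.buckets[client_id].allow(now)
```

In this sketch, an operator would add an entry to `overrides` to grant an authenticated high-volume client a larger budget, and add a client to `banned` to block it entirely; a rejected request would typically be answered with HTTP 429 (Too Many Requests).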

What they’re saying: The Wikimedia Foundation emphasized the financial reality of maintaining its services amid the AI scraping surge.

  • “Our content is free, our infrastructure is not: We need to act now to re-establish a healthy balance,” the foundation stated.

Looking ahead: Wikipedia is seeking community feedback on methods to identify AI bot traffic and filter access appropriately, indicating a shift toward more controlled access for automated systems.

Wikipedia Faces Flood of AI Bots That Are Eating Bandwidth, Raising Costs
