Google DeepMind launches Gemini 2.5 Computer Use to control web browsers

Google DeepMind has launched Gemini 2.5 Computer Use, an AI model that can autonomously navigate web browsers by clicking, typing, and scrolling through websites like a human user. The model joins similar offerings from OpenAI and Anthropic in the emerging field of AI agents capable of performing web-based tasks with minimal human oversight.

What you should know: Gemini 2.5 Computer Use operates through natural language prompts and can execute complex multi-step web tasks independently.

  • Users simply provide instructions like “Open Wikipedia, search for ‘Atlantis,’ and summarize the history of the myth in Western thought,” and the model handles the entire process autonomously.
  • The AI takes screenshots of web pages to analyze user interfaces, then performs requested actions step-by-step while explaining its reasoning in a visible text box.
  • For sensitive tasks like making purchases, the model will ask for user confirmation before proceeding.

How it works: The model runs in an iterative loop, carrying forward context from its previous actions within a particular interface.

  • As it performs more tasks on a specific website, it accumulates more contextual understanding, leading to increasingly seamless functionality.
  • Google demonstrated the technology through sped-up videos showing the model updating customer relationship management systems and rearranging notes on Google’s discontinued Jamboard platform.
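
The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Google's implementation or the Gemini API: `Action`, `run_agent`, `scripted_model`, and the callback parameters are all hypothetical names, and the "model" here is a scripted stand-in rather than a real call to any service.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One UI step proposed by the model (hypothetical shape)."""
    kind: str            # e.g. "click", "type", "scroll", or "done"
    target: str = ""     # element the action applies to
    text: str = ""       # text to type, if any
    reasoning: str = ""  # explanation shown to the user


def run_agent(
    goal: str,
    take_screenshot: Callable[[], bytes],
    propose_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Screenshot -> propose -> execute, repeated until the model says "done"."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        # The model sees the goal, the current screen, and everything it
        # has done so far; this is the accumulated context.
        action = propose_action(goal, screenshot, history)
        history.append(action)
        if action.kind == "done":
            break
        execute(action)
    return history


def scripted_model(goal: str, screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for a real model: replays a fixed script, one step per call."""
    script = [
        Action("click", target="search box", reasoning="Focus the search field."),
        Action("type", target="search box", text="Atlantis", reasoning="Enter the query."),
        Action("done", reasoning="The results page is loaded."),
    ]
    return script[len(history)]


executed: List[Action] = []
history = run_agent(
    goal="Open Wikipedia and search for 'Atlantis'",
    take_screenshot=lambda: b"",   # a real agent would capture the browser here
    propose_action=scripted_model,
    execute=executed.append,       # a real agent would drive the browser here
)
for step in history:
    print(step.kind, "-", step.reasoning)
```

Passing the full `history` back on every step is what lets context build up over a session. The `max_steps` cap is a design choice of this sketch, added so a confused agent cannot loop forever.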

Performance benchmarks: Google claims Gemini 2.5 Computer Use outperformed competing tools from Anthropic and OpenAI across multiple evaluation metrics.

  • Google says the model delivered higher accuracy and lower latency on “multiple web and mobile control benchmarks,” including Online-Mind2Web, an evaluation framework specifically designed for testing web-browsing agents.
  • While primarily designed for web browsers, Google noted the model also shows “strong promise” on mobile platforms.

Availability and access: The model is currently available through multiple channels for developers and researchers.

  • Access is provided through the Gemini API in Google AI Studio and through Vertex AI for enterprise users.
  • A demo version is also accessible via Browserbase for those wanting to test the technology.

Safety considerations: Google has implemented multiple safeguards to prevent misuse and unintended consequences.

  • Developers can configure safety controls to prevent the model from bypassing CAPTCHAs, compromising data security, or gaining control of medical devices.
  • The system can be programmed to request user confirmation before performing specified sensitive actions.
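
A confirmation gate of this kind might look like the following sketch. It is an assumption-laden illustration, not the Gemini API's actual safety configuration: `SENSITIVE_KINDS`, `guarded_execute`, and the action names are hypothetical.

```python
from typing import Callable, FrozenSet

# Hypothetical set of action kinds a developer has marked as sensitive.
SENSITIVE_KINDS: FrozenSet[str] = frozenset({"purchase", "submit_payment"})


def guarded_execute(
    kind: str,
    execute: Callable[[str], None],
    confirm: Callable[[str], bool],
    sensitive: FrozenSet[str] = SENSITIVE_KINDS,
) -> bool:
    """Run an action, asking the user first if its kind is marked sensitive.

    Returns True if the action ran, False if the user declined it.
    """
    if kind in sensitive and not confirm(kind):
        return False
    execute(kind)
    return True


ran = []
always_decline = lambda kind: False

guarded_execute("click", ran.append, always_decline)     # not sensitive, runs
guarded_execute("purchase", ran.append, always_decline)  # sensitive, declined
print(ran)
```

In an interactive tool, `confirm` would prompt the user. It is passed in as a callback here so the gate can be tested without a browser or a human in the loop.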

Known limitations: Google acknowledges the model inherits fundamental weaknesses from its underlying Gemini 2.5 Pro foundation.

  • The company’s system card notes the model “may exhibit some of the general limitations of foundation models…such as hallucinations, and limitations around causal understanding, complex logical deduction, and counterfactual reasoning.”
  • These limitations reflect broader challenges across frontier AI models, as highlighted by recent Anthropic research showing AI systems often misinterpret harmless information as potentially unethical or illegal.

Competitive landscape: This launch positions Google alongside other major AI companies developing autonomous web navigation capabilities.

  • The release follows similar computer use models from OpenAI and Anthropic, indicating growing industry focus on AI agents that can operate independently across digital environments.
  • Google previously experimented with Project Mariner, a Chrome extension with similar web automation capabilities, suggesting sustained investment in this technology area.