Do you see what I see? Google’s Gemini adds screen-aware AI to transform Android experience

Google's Gemini is advancing the Android user experience with screen-aware AI capabilities that turn the assistant into an interactive visual companion. Scheduled to roll out to Gemini Advanced subscribers later this month, these features represent a significant shift in how users interact with their devices, moving beyond simple voice commands to contextual visual understanding. This evolution positions Gemini as a more intuitive assistant that can respond to what users see rather than just what they say.

The big picture: Google is enhancing Gemini with screen-sharing functionality that allows users to ask questions about content visible on their Android devices, mirroring capabilities already available on desktop versions.

  • The feature enables contextual interactions, such as asking which shoes would pair well with a jacket pictured on screen, creating a more natural assistance experience.
  • These capabilities are part of Google’s Project Astra, a broader initiative to develop multimodal AI that better perceives and understands its environment.

Key features: The upcoming Gemini update focuses on two major capabilities that expand how users can leverage AI assistance across applications.

  • Users can share their screens with Gemini to ask questions about displayed content, whether browsing websites, viewing images, or reading documents.
  • Real-time video interactions enable users to engage with Gemini about their surroundings by activating the camera within the app, similar to ChatGPT's Voice and Vision functionality.

Practical applications: Gemini’s new capabilities will integrate with popular apps to provide contextual assistance without disrupting the user experience.

  • While watching YouTube videos, users can activate Gemini to ask specific questions about content, such as inquiring about exercise techniques during fitness tutorials.
  • When viewing PDFs, the “Ask about this PDF” option will allow users to request summaries or clarifications, streamlining research and information processing on mobile devices.

Why this matters: By enabling Gemini to interpret and respond to visual inputs, Google is fundamentally changing how AI assistants function, creating more immersive and context-aware digital experiences.

  • The screen-aware capabilities transform passive viewing into interactive experiences, potentially setting new benchmarks for AI assistant functionality.
  • As these features reach Android users, they could significantly reduce the cognitive load of information processing by allowing the AI to assist with understanding and contextualizing on-screen content.