
Browser Agents Are Here: What Google’s ‘Computer Use’ Gemini Means for Enterprise Workflows

How browser-native, on-screen agents change where AI can add real value – and the risks teams must plan for


Introduction

Google’s unveiling of Gemini’s “Computer Use” – an agent that performs tasks by interacting with web pages rather than calling APIs – marks a practical inflection point for agentic AI. Instead of waiting for every app to add a model-backed integration, agents can operate inside browsers and automate multi-step workflows across legacy and modern web apps.

That capability is powerful: it means automation for the billions of enterprise workflows that live only in GUIs. But it also surfaces new security, reliability, and governance challenges that product managers, security teams, and IT leaders must address up front.

This post breaks down where browser-native agents add immediate value, the primary risks teams must mitigate (including recent “CometJacking” concerns), and an actionable checklist for deploying these agents responsibly.

Why browser-native agents matter: the productivity case

APIs and apps are getting smarter, but most enterprise work still happens across web UIs, spreadsheets, and legacy portals. Browser agents unlock value in three broad scenarios:

  • Cross-app orchestration without APIs – e.g., copy data from a legacy booking portal, reconcile in a spreadsheet, and submit a ticket in a modern ITSM tool.
  • Complex form completion and exception handling – agents can handle conditional navigation, field mappings, and retries when forms reject input.
  • Context-aware research and summarization – agents that browse multiple sources, extract relevant snippets, and assemble structured briefings for humans.

Practical characteristics of high-impact use cases:

  • Repetitive, rule-based steps with limited ambiguity
  • Stable UI patterns (pages that don’t change layout every week)
  • Clear success/failure criteria so automation can be monitored

For enterprises, this often translates to back-office tasks (procure-to-pay drudgery), customer ops workflows, and HR onboarding bottlenecks.

The new attack surface: CometJacking and hidden prompt risks

Agentic browsers don’t just introduce convenience – they introduce novel risks. Recent reporting (and a patched incident in an AI browser) highlighted how web content can attempt to manipulate agent behavior through hidden or obfuscated UI elements and prompts.

Key threat types:

  • UI-level prompt injection: malicious pages craft elements that agents interpret as instructions.
  • Hidden-instruction attacks (e.g., the reported ‘CometJacking’ scenario): crafted links or page content smuggle instructions that trigger unintended agent actions.
  • Data exfiltration through chained browsing: agents that fill forms or copy data can be tricked into leaking sensitive fields across domains.

Because these attacks operate at the presentation layer, traditional API-based security controls (rate limits, API keys) aren’t sufficient. Defenses must consider the browser agent’s view and decision model.

Governance and engineering controls

A layered approach works best:

  • Technical controls
    • Sandboxing: run agents in constrained browser contexts with strict domain allowlists (see the sketch after this list).
    • Provenance & auditing: immutable logs of agent actions, inputs, and outputs (who approved, which model, which browser session).
    • Human-in-the-loop gates: require confirmation for high-risk actions (fund transfers, exporting PII).
    • Prompt sanitation & UI validation: filter and validate inputs derived from web pages before acting.

  • Operational controls
    • Supplier risk reviews: evaluate third-party agent providers for transparency on training data, update cadence, and incident response.
    • Use-case gating: pilot on low-risk workflows, measure ROI and failure modes, then expand.
    • Incident playbooks: exercise scenarios where agents misinterpret pages or exfiltrate data.

  • Policy & compliance
    • Data-handling rules: map which fields agents may read/write and how long transient copies persist.
    • Access control: tie agent capabilities to role-based approvals and least privilege.
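
To make the first two technical controls concrete, here is a minimal sketch of a domain allowlist combined with a human-in-the-loop gate. The domain names, action labels, and approver callback are illustrative assumptions, not any vendor’s API.

  from urllib.parse import urlparse

  ALLOWED_DOMAINS = {"itsm.example.com", "booking.example.com"}  # assumed allowlist
  HIGH_RISK_ACTIONS = {"transfer_funds", "export_pii"}           # assumed action labels

  def url_allowed(url: str) -> bool:
      """Reject navigation to any domain outside the allowlist."""
      return urlparse(url).hostname in ALLOWED_DOMAINS

  def execute_action(action: str, target_url: str, approver=None):
      if not url_allowed(target_url):
          raise PermissionError(f"Blocked: {target_url} is not on the allowlist")
      if action in HIGH_RISK_ACTIONS:
          # Human-in-the-loop gate: require explicit confirmation before acting.
          if approver is None or not approver(action, target_url):
              raise PermissionError(f"Action '{action}' requires human approval")
      # Hand the approved action to the browser-agent runtime here.
      return {"action": action, "url": target_url, "status": "dispatched"}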

Where agents will (and won’t) win in 2026

Short-term winners:

  • Internal automation teams focused on cost savings from manual web processes.
  • Customer support triage that extracts case facts from multiple dashboards.
  • Sales ops where CRM, quoting tools, and contract portals lack integrated APIs.

Low-probability wins (for now):

  • High-risk decisions requiring nuanced judgment – these still need humans.
  • Highly volatile UI contexts where frequent front-end updates will break automations faster than they can be maintained.

Gartner and other analysts warn of an agentic AI supply/demand imbalance; a pragmatic posture – small, measurable pilots with tight governance – will separate durable wins from agent-washing.

Practical rollout checklist for product, security, and IT leaders

  1. Start with a 30–60 day pilot on a clearly scoped workflow (measure time saved, error rate).
  2. Implement a browser-level allowlist and sandbox for agent sessions.
  3. Require explicit human approval for actions touching money, PII, or legal documents.
  4. Enable detailed, tamper-evident action logs and regular audits (see the sketch after this checklist).
  5. Threat-model the agent’s UI exposure: simulate prompt-injection and overlay attacks.
  6. Build a rollback/kill-switch integrated with your SIEM/incident processes.
  7. Reassess vendor risk and clarify contractual SLAs for model changes and security responsibilities.
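
Item 4 above calls for tamper-evident logs. One lightweight pattern is a hash-chained, append-only log in which each entry commits to its predecessor, so any edit breaks verification. A minimal sketch with illustrative field names; a production system would also need secure key management and an externally anchored chain head.

  import hashlib, json, time

  class ActionLog:
      def __init__(self):
          self.entries = []
          self.prev_hash = "0" * 64  # genesis value

      def append(self, actor: str, action: str, detail: dict) -> dict:
          entry = {"ts": time.time(), "actor": actor, "action": action,
                   "detail": detail, "prev_hash": self.prev_hash}
          entry["hash"] = hashlib.sha256(
              json.dumps(entry, sort_keys=True).encode()).hexdigest()
          self.prev_hash = entry["hash"]
          self.entries.append(entry)
          return entry

      def verify(self) -> bool:
          """Recompute the chain; editing any entry invalidates all later hashes."""
          prev = "0" * 64
          for e in self.entries:
              body = {k: v for k, v in e.items() if k != "hash"}
              recomputed = hashlib.sha256(
                  json.dumps(body, sort_keys=True).encode()).hexdigest()
              if body["prev_hash"] != prev or recomputed != e["hash"]:
                  return False
              prev = e["hash"]
          return True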

Conclusion

Browser-native agents like Gemini’s Computer Use make a practical promise: automation that reaches workflows APIs never touched. That promise is real – but it brings new, browser-specific risks that teams must address before widespread adoption.

Treat this era like past platform shifts: pilot conservatively, bake in technical and operational guardrails, and prioritize use cases where predictable, multi-step UI tasks yield clear ROI. Do that, and browser agents can unlock substantial productivity across enterprise workflows – safely.

Key Takeaways
– Browser-native agentic AI can automate long-tail web workflows that lack APIs.
– Agentic browsers introduce new security risks (hidden prompt attacks, UI manipulation) that require browser-aware defenses.
– Prioritize predictable, rule-driven workflows for pilots and combine technical sandboxing with human-in-the-loop controls.

AgentKit and the Rise of Agentic AI: What Developers Need to Know

How OpenAI’s new tooling turns chat models into task-performing agents – products, pipelines, and pitfalls


Introduction

Agentic AI – systems that act on users’ behalf to accomplish multi-step tasks – moved from research demos to mainstream product strategies in 2025. OpenAI’s recent launches, especially AgentKit and the introduction of Apps inside ChatGPT, formalize a path for developers to ship these agent experiences quickly. This post breaks down what AgentKit is, what problems it solves, how teams can use it, and the trade-offs you should plan for.

What is AgentKit (at a glance)?

AgentKit bundles opinionated tools for building, testing, and deploying AI agents. Instead of wiring together models, orchestration, webhooks, and UIs from scratch, AgentKit provides:

  • An agent builder/authoring layer to define goals, steps, and tool integrations.
  • SDKs and runtime components for running agents reliably and at scale.
  • Prebuilt connectors (and patterns) for common tools: calendars, file stores, browsing, enterprise apps.
  • Local testing and simulation features so you can validate behaviors before exposing agents to users.

Combined with ChatGPT’s new Apps model, AgentKit lets developers ship “chat-native” apps that operate as first-class integrations inside conversational surfaces.

Why this matters now

A few trends converged to push agentic tooling forward:

  • Models are better at planning, tool use, and long-form orchestration than a year earlier.
  • Product teams want automation that feels conversational – not just a form with macros.
  • Enterprises need repeatable patterns for safety, logging, and access control when agents touch internal systems.

AgentKit is an attempt to capture those patterns, lowering the friction from prototype to production.

How developers will likely use AgentKit

  1. Define capabilities, not just prompts

Instead of maintaining monolithic prompt templates, teams define agent capabilities (e.g., “book travel”, “submit expense”) and the sequence of tools and checks required. That makes behavior more auditable and modular.
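
A hypothetical sketch of what a declarative capability might look like; the field and tool names are assumptions for illustration, not the actual AgentKit API.

  from dataclasses import dataclass, field

  @dataclass
  class Step:
      tool: str                  # e.g. "ocr.extract_receipt" (assumed connector name)
      requires_approval: bool = False

  @dataclass
  class Capability:
      name: str
      goal: str
      steps: list = field(default_factory=list)
      success_check: str = ""    # how the runtime verifies completion

  submit_expense = Capability(
      name="submit_expense",
      goal="File an expense report from a receipt image",
      steps=[
          Step(tool="ocr.extract_receipt"),
          Step(tool="policy.validate_amount"),
          Step(tool="erp.create_expense", requires_approval=True),
      ],
      success_check="ERP returns an expense ID",
  )

Because the steps and checks are data rather than free-form prompt text, they can be diffed, reviewed, and audited like any other configuration.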

  2. Plug in connectors for real systems

The value in agents is access: calendar APIs, CRMs, payment processors, file stores. AgentKit aims to provide reference connectors and safe patterns for calling them.

  3. Test with simulated users and failover logic

Agents must handle partial failures. Built-in simulation and step-level retry/compensating transactions are essential for reliability.
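
For instance, a step runner can wrap each tool call with retries and a compensating action for cleanup; the sketch below assumes your integration exposes plain Python callables.

  import time

  def run_step(step, compensate=None, retries=3, backoff=1.0):
      """Run a step with retries; on final failure, undo earlier side effects."""
      for attempt in range(1, retries + 1):
          try:
              return step()
          except Exception:
              if attempt == retries:
                  if compensate:
                      compensate()   # e.g. release a held reservation
                  raise
              time.sleep(backoff * attempt)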

  4. Ship as Apps inside chat interfaces

With ChatGPT Apps, agents can be surfaced inside a conversational UI where users can hand off tasks and check progress without switching context.

Product implications: UX and business models

  • New UI primitives: “delegate to an agent”, progress timelines, and intervenable automations replace simple one-shot chat replies.
  • Reduced friction for complex tasks could increase conversion for vertical apps (travel, recruiting, HR, procurement) by simplifying multi-step flows.
  • Distribution shifts: chat platforms can become the primary surface for third-party apps – changing how discovery and monetization work.

Safety, privacy and compliance – what to watch

Agentic systems intensify known risks:

  • Data surface expansion: agents access more internal data (calendars, emails, repos). That increases exposure and requires robust access controls, encryption, and audit trails.
  • Confident-but-wrong behavior: agents that act autonomously can amplify hallucinations into real-world actions. Design explicit human-in-the-loop gates for high-impact tasks.
  • Logging and retention: for debugging and compliance, you need detailed logs – but logs themselves are sensitive. Policy and engineering must balance observability with minimization.
  • Regional regulation: depending on where users or data live, agent behavior and data handling may need regional configs (EU AI Act, data residency rules).

Infrastructure and costs

Running agentic experiences often raises compute and latency needs because agents:

  • Perform multiple model calls per task (planning, verification, tool use).
  • May require stateful runtimes to track long-running jobs and user approvals.

Plan for higher inference costs, observability for chain-of-thought and tool calls, and backpressure handling when downstream APIs are slow.

Practical checklist for teams considering AgentKit

  • Start with a narrow, high-value workflow where mistakes are reversible.
  • Instrument every tool call and decision point for auditability (see the sketch after this checklist).
  • Build explicit confirmation steps for actions that move money or change access.
  • Rate-limit and sandbox connectors during early rollout.
  • Maintain an off-ramp: a clear way for users to opt out and for operators to revoke agent capabilities.
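
As one way to instrument tool calls, a small decorator can record every invocation, outcome, and latency. The logger sink and the connector name are assumptions to adapt to your own audit store.

  import functools, json, logging, time

  audit_logger = logging.getLogger("agent.audit")

  def audited(tool_name: str):
      """Record every tool invocation, its outcome, and its latency."""
      def wrap(fn):
          @functools.wraps(fn)
          def inner(*args, **kwargs):
              start = time.time()
              status = "ok"
              try:
                  return fn(*args, **kwargs)
              except Exception:
                  status = "error"
                  raise
              finally:
                  audit_logger.info(json.dumps({
                      "tool": tool_name,
                      "status": status,
                      "latency_s": round(time.time() - start, 3),
                  }))
          return inner
      return wrap

  @audited("crm.update_contact")      # hypothetical connector
  def update_contact(contact_id: str, fields: dict):
      ...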

Conclusion

AgentKit and the move to chat-native Apps lower the technical bar for delivering agentic AI, turning prototypes into products faster. That creates exciting possibilities for automation, but also concentrates responsibility: product, security, and infra teams must design for reliability, privacy, and regulatory compliance from day one.

Key Takeaways
– AgentKit lowers the friction for building agentic workflows by packaging orchestration, connectors, and developer UX into an opinionated toolkit.
– Agentic apps promise new product possibilities (chat-native automation, background assistants) but introduce fresh safety, privacy, and infra responsibilities.

When AI Becomes Your Shopping Assistant: The Rise of Agentic Commerce

How agentic AI — shopping agents that act on your behalf — will reshape retail, platforms, and product strategy


Introduction

Agentic AI — autonomous agents that can search, negotiate, and execute tasks for users — is no longer a thought experiment. Recent product moves and model upgrades have put shopping agents within reach: systems that can compare prices across stores, apply coupons, select delivery windows, or even negotiate terms with sellers. For product teams, founders, and policymakers, that raises a pressing question: what happens when purchases are made by agents, not people?

This post outlines why agentic commerce matters, what business models and risks emerge, and practical steps companies should take now to remain relevant and trustworthy.

The shift: from product pages to agent ecosystems

Today, much of commerce is optimized for human attention: search listings, category pages, reviews, and checkout flows. Shopping agents change the unit of value from a product listing to an agent action. The implications are broad:

  • Discovery changes: agents will prioritize merchant attributes (price, speed, returns, sustainability) based on user preferences rather than page rank.
  • Attribution changes: conversion becomes an agent log entry — who recommended what and why — complicating analytics and ad pricing.
  • Competition changes: platforms that aggregate agent actions can lock users into agent ecosystems unless open standards or portability exist.

Practical consequences for teams:

  • Merchants must expose machine-friendly metadata (structured specs, price history, inventory) and APIs for real-time queries (see the example after this list).
  • Product managers should design for agent‑first interactions: signals about warranties, returns, and trust become as important as marketing copy.
  • Marketers need new metrics: agent engagement, win rate, and per‑agent lifetime value.
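
As an example of the metadata point above, a merchant might publish schema.org-style JSON-LD alongside (or instead of) marketing copy; the product values below are illustrative.

  import json

  product = {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "Trail Running Shoe",
      "sku": "TRS-1042",
      "offers": {
          "@type": "Offer",
          "price": "89.00",
          "priceCurrency": "EUR",
          "availability": "https://schema.org/InStock",
      },
      "hasMerchantReturnPolicy": {
          "@type": "MerchantReturnPolicy",
          "merchantReturnDays": 30,
      },
  }

  print(json.dumps(product, indent=2))  # embed in pages or serve via a product API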

Business models, protocols, and power

There are three broad business models emerging around agentic commerce:

  1. Platform‑centric agents: Big platforms host agents that prefer their own ecosystems (high margins, high lock‑in).
  2. Open‑agent marketplaces: Neutral agents operate across stores via open protocols and standardized APIs (low friction, more competition).
  3. Merchant‑provided agents: Brands build agents that advocate for their catalog (better margins for incumbents, more direct control).

Which model wins matters for competition and consumer welfare. Open protocols (agentic commerce specs) can prevent dominance by any single player, but they require agreement on attribution, payment flows, and safety. Without standards, agentic marketplaces risk recreating walled gardens – but with even more leverage, because agents can automatically shift spending.

Safety, trust, and regulation

Agentic commerce compounds familiar AI concerns:

  • Fraud and misrepresentation: agents acting without clear provenance can impersonate buyers or manipulate seller terms.
  • Privacy leakage: agents need purchase history and preferences; poor controls can expose sensitive data.
  • Consumer choice erosion: agents optimizing for fees or commissions may prioritize partner merchants over the user’s best option.

Policy signals from Europe’s push for sovereign AI and labeling requirements suggest regulators will pay close attention. Product teams should bake transparency into agent decisions (explainability, logs) and provide user controls to inspect and override agent actions.

What product teams should do this quarter

  • Publish machine‑readable product metadata and build or expose lightweight APIs for inventory and pricing updates.
  • Instrument agent-level analytics: track agent recommendations, acceptance rates, and dispute frequency (sketched after this list).
  • Design clear consent flows and a visible audit trail so users can review and revoke agent permissions.
  • Experiment with agent economics: consider revenue share, subscription, or value‑based pricing rather than purely commission models.
  • Engage with standards bodies and industry groups to help shape open agent protocols.
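
For the analytics item above, a minimal event record might look like the following; the field names are assumptions to adapt to your own pipeline.

  from dataclasses import dataclass, asdict
  from datetime import datetime, timezone

  @dataclass
  class AgentEvent:
      agent_id: str            # which shopping agent surfaced the recommendation
      merchant_sku: str
      recommended_price: float
      accepted: bool
      disputed: bool = False
      ts: str = ""

  event = AgentEvent(agent_id="open-agent-7", merchant_sku="TRS-1042",
                     recommended_price=89.00, accepted=True,
                     ts=datetime.now(timezone.utc).isoformat())
  # send asdict(event) to your analytics pipeline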

Conclusion

Agentic commerce is an inflection point: it promises better personalization and automation, but also concentrates power in whoever controls agents and their standards. Companies that move quickly to make their catalogs agent‑friendly, insist on transparency, and participate in open protocols will avoid being treated as commodities by third‑party agents. For policymakers, the goal should be enabling competition and protecting consumers without stifling innovation.

Key Takeaways
– Agentic AI (shopping agents) shifts value from product pages to agent ecosystems — businesses must rethink distribution, pricing, and trust.
– Open protocols, strong attribution, and safety guardrails are essential to avoid platform lock‑in, fraud, and degraded consumer choice.

OpenAI’s Platform Moment: DevDay, the AMD Pact, and What Sora 2 Signals for Product Teams

How recent moves – an app marketplace, a multi‑year chip pact, and media‑grade video models – are reshaping AI product strategy, supply chains, and go‑to‑market tactics


Introduction

In the span of a few news cycles, OpenAI’s public posture shifted from model research leader to deliberate platform builder and industrial buyer. Announcements around an app‑style marketplace and SDK at DevDay, a multi‑year chip supply pact with AMD, and commercial use cases for its Sora 2 video model together point to a more vertically integrated – and commercialized – AI future.

This piece walks through what those moves mean for product managers, engineering leaders, and startup founders, and suggests practical next steps for teams that either build on OpenAI’s stack or compete in adjacent markets.

What DevDay’s “apps inside ChatGPT” really means

  • The mechanics: OpenAI introduced an apps directory and developer SDK to let third‑party functionality plug directly into ChatGPT. That’s a distribution channel (and discovery layer) that bypasses traditional app stores and websites.
  • Product implications:
    • Distribution: Getting inside a popular conversational surface can massively shrink acquisition friction for conversational experiences and micro‑apps.
    • Monetization: Built‑in billing and exposure from the platform can accelerate business models for small teams, but also centralizes take‑rates and platform policy risk.
    • Expectations: Users will expect low latency, safe defaults, and consistent UX across “apps” – a higher bar than standalone chatbots historically faced.
  • For teams: Start by prototyping a minimal, high‑value integration (e.g., scheduling, data lookup, vertical workflows) and measure retention via platform metrics. Treat the SDK pathway as both product distribution and feature gating – be ready to iterate on safety and privacy constraints imposed by the platform.

The AMD supply pact: compute is a strategic asset

  • Why it matters: Long‑term, high‑volume chip and memory agreements are a hedge against capacity shortages and price volatility. Companies that secure deterministic access to silicon gain predictability for training and inference roadmaps.
  • Market effects:
    • Capital allocation: Deals like this can shift where model training happens (partner data centers vs. cloud regions) and tilt economics in favor of players who can lock capacity earlier.
    • Competitive dynamics: When platform providers secure supply and optional equity/warrants, it increases barriers to entry for smaller model builders and reshapes supplier bargaining power.
  • For engineering leaders: Factor potential spot market volatility into your capacity planning. If you rely on cloud GPUs, build flexible job queues, fallbacks to cheaper instance types for non‑critical workloads, and batch strategies for training to optimize usable throughput.
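
A rough sketch of that playbook: try capacity pools in cost order and reserve the cheapest fallbacks for non-critical work. The pool names and the has_capacity probe are assumptions about your own infrastructure.

  INSTANCE_POOLS = ["reserved-a100", "spot-a100", "spot-l4"]  # assumed pools, costliest first

  def place_job(job_id: str, has_capacity, critical: bool) -> str:
      """Return the first pool with capacity; critical jobs skip the cheapest tier."""
      pools = INSTANCE_POOLS[:2] if critical else INSTANCE_POOLS
      for pool in pools:
          if has_capacity(pool):
              return pool
      raise RuntimeError(f"No capacity for job {job_id}; queue it and retry with backoff")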

Sora 2 and the rapid productization of media AI

  • Sora 2 and similar video‑capable models are turning cinematic/creative capabilities from research demos into product features accessible to non‑creatives.
  • Product opportunities:
    • New verticals: E‑commerce, toys, marketing creative, app studios, and in‑product demos can embed model‑generated video as a differentiator.
    • Workflow integration: For teams focused on content pipelines, the key is not only generation but editability, style consistency, and rights management.
  • Risks: Quality expectations, hallucinations in generated content, and IP or safety gaps are magnified in media outputs. Companies integrating video generation need clear review workflows and provenance tracking.

Strategic themes to watch

  • Platformization: Conversational layers are becoming app platforms. That’s good for discoverability, but raises questions about governance, revenue share, and competitive neutrality.
  • Vertical integration of supply: Control of compute and memory is now part of product strategy, not just ops. Expect more long‑term supply agreements and financial instruments tied to hardware.
  • Faster commercialization: Models are crossing from lab to product faster than ever – which rewards tight product feedback loops, domain expertise, and strong safety tooling.

Practical next steps for teams

  • Product managers: Identify 1–2 high‑value “micro‑apps” that could live inside a conversational surface. Define success metrics (activation, retention, conversion) and run a small pilot via the SDK.
  • Engineering: Create a capacity playbook – spot vs. reserved vs. partner provisioned – and build autoscaling and batching to smooth costs.
  • Legal & compliance: Draft content provenance and review policies for any generated media. Ensure contractual clarity on data sharing when integrating platform SDKs.
  • Startup founders: Evaluate whether building on the platform accelerates go‑to‑market or risks strategic dependence. Consider hybrid approaches: platform presence for acquisition and standalone product for control.

Conclusion

OpenAI’s recent moves – platform features that turn ChatGPT into an app surface, long‑term hardware arrangements, and richer media models – are a compact case study in how AI is maturing from research projects into industrialized product ecosystems. For product, engineering, and leadership teams, the practical implication is clear: productize fast, plan compute strategically, and bake governance into every integration.
