Browser Login
Let users log into websites through a live browser in the sandbox.
Use this mode for services that have no API and no OAuth: the user logs in through a browser, and the agent then operates that browser with the resulting session. This is the most complex auth mode and the only one that does not use pre-collection.
How it's different
The other three auth modes extract a credential (keys, tokens, files), store it in the vault, and inject it into the sandbox. Browser login doesn't extract anything. The session lives in the browser inside the sandbox — the cookies, localStorage, and session state exist in the same browser process the agent operates.
This means the sandbox must be running before the user logs in. The agent starts, encounters a login wall, asks the user to take over the browser, and resumes after the user has logged in.
Manifest declaration
{
  "connections": [
    {
      "id": "google_account",
      "display_name": "Google Account",
      "auth_mode": "browser_login",
      "login_url": "https://accounts.google.com"
    }
  ]
}
The login_url is informational: it is shown to the user in the consent prompt so they know what site they'll be logging into. The agent's code decides when and where to navigate.
Sandbox requirement
Browser login agents must use the desktop sandbox template, which provides a full Linux desktop with a browser:
{
  "runtime": {
    "language": "python",
    "sandbox_template": "desktop",
    "memory_mb": 4096,
    "timeout_seconds": 600,
    "cpu": 2
  }
}
Desktop sandboxes are heavier than standard sandboxes. This is reflected in the agent's pricing.
The SDK call: ctx.request_takeover()
The developer's only coordination point with the platform is a single SDK call:
async def run(ctx: AgentContext):
    task = await ctx.task()
    browser = ctx.browser()
    await browser.goto("https://maps.google.com")
    # Developer's own logic to detect login is needed
    if await browser.query_selector(".sign-in-button"):
        await ctx.request_takeover(
            connection_id="google_account",
            reason="Please log into your Google account to create a Maps list"
        )
        # This call blocks until the user finishes
        # When it returns, the browser is logged in
    # Agent continues with the authenticated browser
    await browser.click(".create-list-button")
    # ...
ctx.request_takeover() is a blocking async call. The agent's process is alive but suspended on this await. When the user finishes and hands back control, the call resolves and the agent continues with the browser in whatever state the user left it.
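
Because the browser comes back in whatever state the user left it, it may be worth re-checking for the login wall after the call returns. The following is a minimal sketch, not part of the SDK: `ensure_logged_in`, `max_attempts`, and the default selector are hypothetical, and the `_Fake*` classes are stand-ins that exist only so the helper can be run outside a sandbox. It assumes `ctx.request_takeover()` and `browser.query_selector()` behave as in the example above.

```python
import asyncio


async def ensure_logged_in(ctx, browser, connection_id, reason,
                           login_selector=".sign-in-button", max_attempts=2):
    """Request takeover until the login marker disappears or attempts run out.

    Hypothetical helper: only request_takeover() and query_selector() come
    from the example above; everything else is illustrative.
    """
    for _ in range(max_attempts):
        # No sign-in button means the session already looks authenticated.
        if not await browser.query_selector(login_selector):
            return True
        # Blocks until the user hands control back.
        await ctx.request_takeover(connection_id=connection_id, reason=reason)
    # Final check after the last takeover.
    return not await browser.query_selector(login_selector)


class _FakeBrowser:
    """Stand-in for the sandbox browser, used only to exercise the helper."""

    def __init__(self):
        self.logged_in = False

    async def query_selector(self, selector):
        return None if self.logged_in else object()


class _FakeContext:
    """Stand-in for AgentContext; the 'user' logs in on the Nth takeover."""

    def __init__(self, browser, succeed_on=1):
        self.browser = browser
        self.succeed_on = succeed_on
        self.calls = 0

    async def request_takeover(self, connection_id, reason):
        self.calls += 1
        if self.calls >= self.succeed_on:
            self.browser.logged_in = True


def demo(succeed_on=1):
    """Run the helper against the fakes; returns (logged_in, takeover_count)."""
    browser = _FakeBrowser()
    ctx = _FakeContext(browser, succeed_on)
    ok = asyncio.run(
        ensure_logged_in(ctx, browser, "google_account", "Please log in")
    )
    return ok, ctx.calls
```

Capping the number of attempts keeps the agent from prompting the user indefinitely when the login never succeeds; the caller can then fail the run with a clear message instead.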
What happens under the hood
Agent calls ctx.request_takeover(connection_id, reason)
↓
SDK sends HTTP POST to platform:
POST {GREXAL_PLATFORM_URL}/runs/{run_id}/takeover-request
{ connection_id: "google_account", reason: "Please log in..." }
↓
Platform updates run status to blocked_on_browser_takeover
↓
User sees in the chat or workflow dashboard:
┌───────────────────────────────────────────────────────┐
│ Maps List Creator needs you to log in. │
│ "Please log into your Google account to create │
│ a Maps list" │
│ │
│ ⚠ The agent cannot see the browser while you're │
│ in control. │
│ │
│ [Take over browser] │
└───────────────────────────────────────────────────────┘
↓
User clicks "Take over browser"
↓
Grexal UI connects to the desktop sandbox's VNC stream
↓
User sees the live browser, logs in (password, 2FA, CAPTCHA — all handled)
↓
User clicks "Finish" in the Grexal UI
↓
ctx.request_takeover() resolves → agent continues
Privacy during takeover
While the user is controlling the browser:
- The agent's process is suspended. It cannot execute code, take screenshots, or observe the browser.
- The platform does not record the screen. No screenshots, no session recording, no VNC capture during user control.
- Only the user's browser tab sees the VNC stream. It is not stored, logged, or accessible to anyone else.
Takeover timeout
The user has 5 minutes to complete the takeover. If the user doesn't click "Take over browser" or "Finish" within 5 minutes:
- The sandbox is killed
- The run is marked as cancelled with reason takeover_timeout
- The user is notified: "The agent timed out waiting for you to log in."
The sandbox remains alive and billing continues during the takeover. The 5-minute timeout prevents zombie sandboxes.
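
The timeout rule can be modelled as a deadline on a "user finished" signal. The sketch below is only an illustration of the semantics described above, not the platform's implementation, which this page does not specify. `wait_for_finish` and `simulate` are hypothetical names, and the `running` status returned on success is an assumption; only `cancelled` and `takeover_timeout` come from the text.

```python
import asyncio

TAKEOVER_TIMEOUT_SECONDS = 300  # the 5-minute limit described above


async def wait_for_finish(user_finished, timeout=TAKEOVER_TIMEOUT_SECONDS):
    """Return the resulting run state for one takeover (illustrative model)."""
    try:
        # Wait for the user's "Finish" signal, but no longer than the deadline.
        await asyncio.wait_for(user_finished.wait(), timeout=timeout)
    except asyncio.TimeoutError:
        # Deadline passed: the run is cancelled with the documented reason.
        return {"status": "cancelled", "reason": "takeover_timeout"}
    # "running" is an assumed status name for a resumed run.
    return {"status": "running", "reason": None}


async def _demo(finish_after, timeout):
    finished = asyncio.Event()
    # Simulate the user clicking "Finish" after `finish_after` seconds.
    asyncio.get_running_loop().call_later(finish_after, finished.set)
    return await wait_for_finish(finished, timeout=timeout)


def simulate(finish_after, timeout):
    """Run the model with short, test-friendly durations (in seconds)."""
    return asyncio.run(_demo(finish_after, timeout))
```

For example, `simulate(0.01, 1.0)` models a user who finishes in time, while `simulate(1.0, 0.01)` models one who does not.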
Takeover in workflows
When a browser login agent is part of a scheduled workflow and the agent requests takeover:
- The workflow node enters blocked_on_browser_takeover
- The user receives a notification (email + in-app): "Your workflow needs you to log in"
- The user opens the notification, sees the takeover prompt, and completes the login
- The workflow resumes
If the user doesn't respond within 5 minutes, the sandbox is killed and the workflow's error handling applies.
Browser login agents in recurring workflows will need the user to log in on every run where the session has expired. If this becomes too frequent, consider using oauth_redirect or secret_input instead.
Multiple takeover requests
An agent may request takeover multiple times in a single run — for example, if it needs the user to log in to two different services:
await ctx.request_takeover(
    connection_id="google_account",
    reason="Log into Google Maps"
)
# User logs into Google, clicks Finish
await ctx.request_takeover(
    connection_id="yelp_account",
    reason="Log into Yelp to cross-reference reviews"
)
# User logs into Yelp, clicks Finish
Trust and safety
Agents that use browser_login have direct access to the user's logged-in session. Mitigations include:
- Mandatory AI security review on every deployment, with additional scrutiny for browser login agents
- Verified developer accounts: identity verification is required for all developers
- Marketplace reputation: reviews, success rate, and failure rate give users a trust signal
Field reference
| Field | Type | Required | Description |
|---|---|---|---|
| login_url | string | No | The URL the user will log into (HTTPS). Informational: shown in the consent prompt. |
runtime.sandbox_template must be "desktop" when using browser_login.
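
The two constraints above (an HTTPS login_url and a desktop sandbox) are easy to check before deploying. The sketch below is a hypothetical local check, not a platform tool; it assumes the connections and runtime fragments shown earlier live in the same manifest object, and `validate_browser_login` is an illustrative name.

```python
from urllib.parse import urlparse


def validate_browser_login(manifest):
    """Return a list of problems; an empty list means the checks passed.

    Hypothetical pre-deploy check based only on the rules stated on this page.
    """
    problems = []
    connections = manifest.get("connections", [])
    browser_conns = [
        c for c in connections if c.get("auth_mode") == "browser_login"
    ]

    for conn in browser_conns:
        login_url = conn.get("login_url")
        # login_url is optional, but must be HTTPS when present.
        if login_url is not None and urlparse(login_url).scheme != "https":
            problems.append(f"{conn.get('id')}: login_url must use HTTPS")

    # The desktop template is only required when browser_login is in use.
    template = manifest.get("runtime", {}).get("sandbox_template")
    if browser_conns and template != "desktop":
        problems.append(
            'runtime.sandbox_template must be "desktop" for browser_login'
        )
    return problems
```

Returning a list rather than raising on the first error lets a developer see every problem in one pass.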