How Prophet's Agent Works
A technical look at how our AI agent sees web pages, makes decisions, and automates browser tasks. Built with the Anthropic API with tool use and a custom agent loop.
The Agent Loop
Prophet's agent follows a continuous loop to understand and interact with web pages:
Observe
Take accessibility tree snapshot
Think
Claude analyzes page & task
Act
Execute tool calls on browser
Repeat
Continue until task complete
The agent can execute up to 10 tool calls per conversation to prevent runaway behavior. Each action is logged in the chat for full transparency.
How the Agent "Sees" Pages
Instead of raw HTML, Prophet uses the browser's accessibility tree - a structured representation of interactive elements that's perfect for AI understanding.
• Focuses only on interactive elements (buttons, links, inputs)
• Filters out visual noise and decorative elements
• Provides semantic roles and names
• More stable than CSS selectors across page changes
• Same data structure screen readers use
• Each element gets an 8-character UID
• UIDs injected as data-prophet-nodeid attributes
• Stable across snapshots for the same elements
• Agent uses UIDs to target specific elements
• UIDs are internal-only (never exposed to users)
Chrome DevTools Protocol
Prophet uses the Chrome DevTools Protocol (CDP) - the same technology that powers Chrome DevTools. This provides low-level browser control.
- Direct DOM access and manipulation
- Mouse and keyboard event simulation
- Page navigation control
- Network request interception
- Accessibility tree inspection
- Tab management
Agent Tools (19 Available)
Claude has access to 19 specialized tools for browser automation. Each tool is designed for a specific type of interaction:
Observation
take_snapshot
Captures the accessibility tree - how the agent "sees" the page with all interactive elements.
get_page_content
Extracts the cleaned text content of the current page.
search_snapshot
Searches the accessibility tree for specific elements by text.
get_page_info
Gets metadata about the current page (URL, title, viewport).
Interaction
click_element_by_uid
Clicks buttons, links, checkboxes using unique identifiers from the snapshot.
fill_element_by_uid
Types into text inputs, textareas, and form fields.
hover_element_by_uid
Hovers over elements to reveal dropdowns and tooltips.
Navigation
navigate
Navigates the browser to a specific URL.
scroll_page
Scrolls the page in any direction to reveal more content.
go_back
Navigates back in browser history.
go_forward
Navigates forward in browser history.
reload_page
Reloads the current page.
Wait
wait_for_selector
Waits for dynamic content to load (for SPAs like React/Vue).
wait_for_navigation
Waits for page navigation to complete before proceeding.
wait_for_timeout
Pauses execution for a specified duration.
Tabs
list_tabs
Lists all open browser tabs.
switch_tab
Switches focus to a specific tab.
close_tab
Closes a specific tab.
open_new_tab
Opens a URL in a new tab.
Decision Making with Claude
Prophet uses Anthropic Claude as the reasoning engine. Users can select from three Claude 4.5 models:
Haiku 4.5
Fast & efficient for simple tasks
Sonnet 4.5
Balanced performance & capability
Opus 4.5
Most capable for complex tasks
Here's how Claude makes decisions:
Architecture Overview
Prophet uses a custom browser automation architecture built on the Anthropic API with tool use. The system has three main components:
Runs in your browser, manages the agent loop, and executes tools locally using Chrome DevTools Protocol.
Handles authentication, rate limiting, and billing. Streams Claude's responses to the extension.
Receives page context and returns intelligent actions (clicks, typing, navigation) for the browser.
Unlike screenshot-based approaches (Computer Use, Claude in Chrome), Prophet uses the accessibility tree:
• Fast - No image processing or vision models
• Deterministic - UIDs target exact elements
• Efficient - Less tokens than screenshots
• Reliable - Same approach as Playwright MCP
Prophet implements its own agent loop instead of using Claude Agent SDK:
• Browser Context - Tools run in your logged-in session
• No Dependencies - No Claude Code CLI required
• Full Control - Custom tool execution via CDP
• Security - Tool execution isolated from backend
Why Client-Side Tool Execution?
Prophet executes tools inside your browser (client-side) rather than on a server. This is a critical design choice that enables browser automation.
Browser automation tools need access to the Chrome DevTools Protocol (CDP) to:
• Control the browser (click, type, scroll)
• Read page state (accessibility tree, element properties)
• Manage tabs and navigation
CDP is only available in Chrome extensions - not on backend servers.
Running tools in your browser means:
• Your session, your control - Automation happens in your logged-in browser, not a separate instance
• Security - Backend never sees what you're browsing
• Privacy - Page content stays local to your machine
• No dependencies - No separate browser instances needed
This architecture choice is what makes Prophet different from server-side tools like web scrapers or coding agents. For more details on when to use client-side vs server-side tool execution, see our Architecture Guide.
Prophet vs Claude in Chrome
Anthropic offers Claude in Chrome, their official browser extension. Here's how Prophet's approach differs:
| Feature | Claude in Chrome | Prophet |
|---|---|---|
| How it "sees" pages | Screenshots (vision model) | Accessibility tree (structured data) |
| Speed | "Noticeably slower" - screenshot/analyze cycle | Fast - direct element targeting |
| Vision model | Required | Not needed |
| Element targeting | Coordinate-based (probabilistic) | UID-based (deterministic) |
| Token usage | High (images are expensive) | Low (structured text) |
| Infrastructure | Anthropic's servers | Your own backend (full control) |
| Billing | Claude subscription ($20-200/mo) | Pay-per-use credits |
Key insight: Prophet's accessibility tree approach is the same method used by Playwright MCP, which states: "Rather than relying on screenshots, it generates structured accessibility snapshots... making interactions more deterministic and efficient."
Learn More
Anthropic API - Tool Use
Official documentation on how Claude processes and executes tools.
platform.claude.com →Playwright MCP
Microsoft's MCP server using the same accessibility tree approach.
github.com/microsoft/playwright-mcp →Claude in Chrome
Anthropic's official browser extension using screenshot-based approach.
anthropic.com →Chrome DevTools Protocol
The low-level protocol Prophet uses for browser automation.
chromedevtools.github.io →