🌐 Browser Automation

👤 Who it's for: Developers / integrators
⏱️ Read time: ~6 minutes
💡 In one line: Let YingClaw auto-open pages, click, fill forms, screenshot, and scrape — no GUI needed.

YingClaw includes built-in headless browser automation, implemented natively in Rust with a Node.js fallback, so it can operate on web pages in environments without a graphical interface.

Capabilities

YingClaw's browser automation supports:

🧭 Page Navigation: Open URLs, navigate forward/backward
🖱️ Element Interaction: Click buttons, fill input fields, select dropdowns
📸 Screenshots: Capture full-page or specific regions
📄 Content Extraction: Extract page text, tables, structured data
🔐 Login Flows: Auto-fill forms, handle authentication

Usage: agent-browser Skill

Browser automation is triggered through the agent-browser skill. YingClaw automatically loads it when it detects web-related tasks in conversation.

Key Commands

Command	Function	Example
`navigate`	Navigate to URL	`navigate url="https://example.com"`
`click`	Click a page element	`click selector="#submit-btn"`
`type`	Type into an input	`type selector="#search" text="YingClaw"`
`snapshot`	Get page accessibility snapshot	`snapshot`
`screenshot`	Capture a screenshot	`screenshot filename="result.png"`
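
The examples in the table share one shape: a command name followed by `key="value"` pairs. As a minimal sketch of how such a line could be split into a name and its arguments, the Python below uses the standard library's `shlex`. The `parse_command` helper is hypothetical, and the grammar is inferred only from the examples above; it is not YingClaw's actual parser.

```python
import shlex


def parse_command(line: str) -> tuple[str, dict[str, str]]:
    """Split 'name key="value" ...' into (name, {key: value}).

    Hypothetical helper: the syntax is inferred from the table's examples.
    """
    tokens = shlex.split(line)  # honours double quotes, drops them from values
    if not tokens:
        raise ValueError("empty command")
    name, args = tokens[0], {}
    for token in tokens[1:]:
        # Split on the first '=' only, so URLs with query strings stay intact.
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        args[key] = value
    return name, args


name, args = parse_command('type selector="#search" text="YingClaw"')
print(name, args)
```

A command with no arguments, such as `snapshot`, parses to the name and an empty dictionary.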

Use Cases

Scenario	Description
🌐 Web scraping	Auto-extract page content, tables, lists
🧪 Automated testing	Simulate user workflows, verify page functionality
📝 Form filling	Auto-fill and submit online forms
🔐 Login flows	Handle authentication-required website operations
📊 Data monitoring	Periodically check web page content changes

Examples

Typical Workflow

> "Open Baidu, search for YingClaw, and screenshot the results"

YingClaw executes:
1. navigate → Open Baidu homepage
2. type → Enter "YingClaw" in the search box
3. click → Click the search button
4. screenshot → Capture the results page
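
Written out in the command syntax from Key Commands, the four steps above might look like the sketch below. The `render` helper is hypothetical, and the selectors `#kw` and `#su` are assumptions about Baidu's search box and search button; confirm them with `snapshot` before relying on them.

```python
def render(name: str, **args: str) -> str:
    """Format a command as 'name key="value" ...' (hypothetical helper)."""
    return " ".join([name] + [f'{key}="{value}"' for key, value in args.items()])


steps = [
    render("navigate", url="https://www.baidu.com"),
    render("type", selector="#kw", text="YingClaw"),  # assumed selector
    render("click", selector="#su"),                  # assumed selector
    render("screenshot", filename="result.png"),
]

for number, step in enumerate(steps, start=1):
    print(f"{number}. {step}")
```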

Data Extraction

> "Extract all product names and prices from the table on this page"

YingClaw executes:
1. navigate → Open the target page
2. snapshot → Get page structure
3. Extract table data and return formatted results
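
The format of the `snapshot` output is not described here, so the extraction step cannot be shown exactly. As an illustration of the general idea only, the sketch below assumes the page's raw HTML is available and pulls table rows out of it with Python's standard `html.parser`. The `TableExtractor` class, the `extract_products` function, and the sample markup are all hypothetical and not part of YingClaw.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_products(html: str) -> list[dict[str, str]]:
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample markup, for illustration only.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""

products = extract_products(SAMPLE)
print(products)
```

Keying each row by the header text keeps the result readable when a table has more columns than the two requested.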

Notes

Note	Description
⏱️ Network timeout	Default 30 seconds for page load; increase for complex pages
🔐 Login state	Browser context persists during the session, maintaining login state
🧩 CAPTCHAs	Image CAPTCHA auto-handling is limited; manual intervention may be needed
📦 Resource loading	Images and videos are not loaded by default in headless mode for speed
🔒 Security limits	Some sites may detect and block headless browsers

Best Practices

Snapshot before action: Take a page snapshot to confirm element selectors before interacting
Add waits: On dynamically loaded pages, wait for elements to be ready before acting on them
Error retry: Automatically retry failed navigation on unstable networks
Session reuse: Login state persists across pages within the same session
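
The waiting and retry practices can be sketched as two small helpers. This is an illustrative Python sketch under stated assumptions, not YingClaw's built-in behavior: `retry` and `wait_until` are hypothetical names, the backoff values are arbitrary, and the 30-second default simply mirrors the page-load timeout listed in Notes.

```python
import time


def retry(action, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call action(); on an exception, back off exponentially and try again.

    Re-raises the last exception once all attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...


def wait_until(predicate, timeout=30.0, interval=0.25,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns True or the timeout elapses.

    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller might wrap a navigation step as `retry(lambda: open_page(url))` and then call `wait_until(lambda: element_ready("#submit-btn"))` before clicking, where `open_page` and `element_ready` stand in for whatever functions the integration provides. The `sleep` and `clock` parameters exist so the helpers can be exercised without real delays.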

Next Steps

Browser automation connects YingClaw to the web. Next, learn about 🤖 Multi-Agent Orchestration for complex task handling.
