Browser Use: The Open Source Framework Making Websites Accessible to AI Agents

TL;DR for AI Agents

Browser Use is a Python library (77K+ GitHub stars) that enables AI agents to interact with websites. Key capabilities:

Natural language browsing: Agent receives task → navigates web → completes actions
Multi-LLM support: Works with OpenAI, Anthropic, Google, Ollama (local), and the project's own ChatBrowserUse
Built on Chromium automation: Earlier releases drove the browser through Playwright; newer ones talk to Chrome directly over the DevTools Protocol (CDP)
CLI included: Quick commands for browser control without code
Production-ready: Cloud offering with stealth browsers and proxy rotation

Install: uv add browser-use | Docs: https://docs.browser-use.com | GitHub: https://github.com/browser-use/browser-use

Web automation has always been a pain point for developers. Selenium scripts break constantly, CSS selectors become outdated, and maintaining automation code is a full-time job. Browser Use takes a radically different approach: instead of writing brittle automation scripts, you simply tell an AI agent what you want done, and it figures out how to do it.

What Makes Browser Use Different

Traditional web automation looks like this:

# Old way: Fragile selectors that break constantly
driver.find_element(By.CSS_SELECTOR, "#login-form input[name='email']").send_keys(email)
driver.find_element(By.CSS_SELECTOR, "#login-form input[name='password']").send_keys(password)
driver.find_element(By.CSS_SELECTOR, "#login-form button[type='submit']").click()

Browser Use automation looks like this:

# New way: Natural language instructions
agent = Agent(
    task="Log into my account with email user@example.com and password secret123",
    llm=llm,
    browser=browser,
)
await agent.run()

The AI agent sees the page, understands the context, and works out how to accomplish the task. When the website changes its layout, the agent can usually adapt without any code changes, because the task never depended on a specific selector.

Getting Started in 5 Minutes

Browser Use uses uv for package management. Here’s the quickest path to your first automation:

Step 1: Set Up Environment

# Create new project
uv init my-browser-agent
cd my-browser-agent

# Install Browser Use
uv add browser-use
uv sync

# Install Chromium browser
uvx browser-use install

Step 2: Get Your API Key

You’ll need an LLM to power the agent. The easiest option is Browser Use’s own model, optimized for browser tasks:

Go to cloud.browser-use.com/new-api-key
Sign up (new accounts get $10 free credits)
Add to your .env file:

BROWSER_USE_API_KEY=your-key-here
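
The key has to be in the process environment before the agent starts. Here is a minimal sketch of that step using only the standard library, assuming a simple KEY=value .env format. The helper names parse_env_text and require_key are hypothetical and not part of Browser Use; many projects use the python-dotenv package for this instead.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines; skips blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_key(env: dict[str, str], name: str = "BROWSER_USE_API_KEY") -> str:
    """Return the key, or fail early with a readable message."""
    value = env.get(name, "")
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Sample file contents; in a real project you would read the .env file instead.
sample = "# local secrets\nBROWSER_USE_API_KEY=your-key-here\n"
print(require_key(parse_env_text(sample)))
```

Failing early like this gives you a clear error at startup rather than an authentication failure halfway through a browser session.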

Step 3: Run Your First Agent

from browser_use import Agent, Browser, ChatBrowserUse
import asyncio

async def example():
    browser = Browser()
    llm = ChatBrowserUse()
    
    agent = Agent(
        task="Go to Hacker News and find the top story title",
        llm=llm,
        browser=browser,
    )
    
    history = await agent.run()
    # run() returns a history object; final_result() holds the agent's answer
    print(f"Agent completed. Final result: {history.final_result()}")
    return history

if __name__ == "__main__":
    asyncio.run(example())

That’s it. The agent opens a browser, navigates to Hacker News, reads the page, and returns the top story.

Using Your Preferred LLM

Browser Use isn’t locked to its own cloud model. You can use any major LLM provider:

OpenAI

# Recent releases ship their own chat wrappers (older ones used langchain_openai)
from browser_use import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")
agent = Agent(task="Your task", llm=llm, browser=browser)

Anthropic Claude

# Recent releases ship their own chat wrappers (older ones used langchain_anthropic)
from browser_use import ChatAnthropic

llm = ChatAnthropic(model="claude-3-5-sonnet-20241022")
agent = Agent(task="Your task", llm=llm, browser=browser)

Local Models with Ollama

For privacy-conscious setups or offline use:

# Recent releases ship their own chat wrappers (older ones used langchain_ollama)
from browser_use import ChatOllama

llm = ChatOllama(model="llama3.2")
agent = Agent(task="Your task", llm=llm, browser=browser)

The CLI: Browser Automation Without Code

Browser Use includes a CLI for quick, interactive automation:

# Open a browser and navigate
browser-use open https://github.com

# See all clickable elements with indices
browser-use state

# Click element by index
browser-use click 5

# Type text
browser-use type "browser-use"

# Take a screenshot
browser-use screenshot current-page.png

# Close the browser
browser-use close

The browser stays open between commands, making it perfect for exploration and debugging. Chain commands together to build quick automations without writing Python.
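
If you do want to script a chain of CLI commands from Python, one option is a thin subprocess wrapper. This is a sketch under assumptions: it uses only the commands listed above, the URL and element index are placeholders, and run_steps is a hypothetical helper. It defaults to a dry run, so nothing is executed unless you pass dry_run=False.

```python
import shlex
import subprocess

# Commands taken from the CLI examples above; the URL and index are placeholders.
STEPS = [
    ["browser-use", "open", "https://github.com"],
    ["browser-use", "state"],
    ["browser-use", "click", "5"],
    ["browser-use", "screenshot", "current-page.png"],
    ["browser-use", "close"],
]


def run_steps(steps, dry_run=True):
    """Run each CLI step in order and return the rendered command lines.

    With dry_run=True nothing is executed. Otherwise the first failing
    command raises subprocess.CalledProcessError and stops the chain.
    """
    rendered = []
    for argv in steps:
        rendered.append(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
    return rendered


for line in run_steps(STEPS):
    print(line)
```

Passing each command as an argument list rather than a single shell string avoids quoting problems when the text you type contains spaces.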

Real-World Use Cases

1. Job Application Automation

agent = Agent(
    task="""
    Fill out the job application on this page using my resume:
    - Name: John Smith
    - Email: user@example.com
    - Phone: 555-0123
    - Upload resume from ~/Documents/resume.pdf
    - Write a cover letter emphasizing my Python experience
    """,
    llm=llm,
    browser=browser,
)

2. E-commerce Price Monitoring

agent = Agent(
    task="""
    Go to amazon.com, search for "mechanical keyboard",
    find the top 5 results, and record their names and prices.
    Return the data as JSON.
    """,
    llm=llm,
    browser=browser,
)
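
Asking for JSON in the task is only half the job: the agent returns text, and models sometimes wrap the JSON in extra prose. Below is a small sketch of how you might parse and validate that output. The "name" and "price" keys and the sample output string are assumptions for illustration, not a format Browser Use guarantees.

```python
import json


def parse_products(raw: str) -> list[dict]:
    """Extract a JSON array from agent output and normalize its items."""
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in agent output")
    items = json.loads(raw[start : end + 1])
    products = []
    for item in items:
        # "name" and "price" are assumed keys; adjust them to match your prompt.
        name = str(item["name"]).strip()
        price = float(str(item["price"]).replace("$", "").replace(",", ""))
        products.append({"name": name, "price": price})
    return products


# Hypothetical agent output, including the kind of wrapper text models may add.
raw_output = (
    "Here are the results:\n"
    '[{"name": "Keyboard A", "price": "$89.99"}, {"name": "Keyboard B", "price": 1249}]'
)
print(parse_products(raw_output))
```

Normalizing prices to floats up front makes it straightforward to compare runs over time, which is the point of price monitoring.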

3. Social Media Monitoring

agent = Agent(
    task="""
    Log into Twitter, go to my notifications,
    and summarize any mentions from the last 24 hours.
    """,
    llm=llm,
    browser=browser,
)

Custom Tools: Extending the Agent

You can add custom capabilities to your agent:

from browser_use import Tools

tools = Tools()

@tools.action(description='Save data to a local file')
def save_to_file(filename: str, content: str) -> str:
    with open(filename, 'w') as f:
        f.write(content)
    return f"Saved to {filename}"

@tools.action(description='Send a Slack notification')
def notify_slack(message: str) -> str:
    # Your Slack webhook logic here
    return "Notification sent"

agent = Agent(
    task="Scrape product prices and save to prices.json, then notify Slack",
    llm=llm,
    browser=browser,
    tools=tools,
)
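
The Slack tool above leaves the webhook logic as a placeholder. One way to fill it in, using only the standard library, is sketched here. It assumes a Slack incoming webhook, which accepts a JSON body with a "text" field; the function names and the URL are placeholders, and the demo only builds the request without sending anything.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, message: str) -> urllib.request.Request:
    """Build the POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_slack_message(webhook_url: str, message: str, timeout: float = 10.0) -> str:
    """Send the message; urllib raises URLError or HTTPError on failure."""
    request = build_slack_request(webhook_url, message)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return f"Notification sent (HTTP {resp.status})"


# Placeholder URL: the request is only constructed here, not sent.
req = build_slack_request("https://hooks.slack.com/services/EXAMPLE", "Prices saved")
print(req.get_method(), json.loads(req.data))
```

Inside notify_slack you would call send_slack_message with your real webhook URL, ideally read from an environment variable rather than hard-coded.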

Handling Authentication

For sites requiring login, you have options:

Use Your Existing Browser Profile

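browser = Browser(
    # Path to the Chrome binary (macOS example; adjust for your OS)
    executable_path="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
    # Directory that holds your Chrome profiles, and the profile to use
    user_data_dir="/path/to/your/chrome/profile",
    profile_directory="Default",
)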

This reuses your existing Chrome profile with all saved logins, cookies, and sessions.
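
Finding where Chrome keeps that profile is usually the fiddly part. The sketch below returns Chrome's default user data directory per operating system. These are the standard Google Chrome defaults; Chromium, other channels, and custom installs use different locations, and default_chrome_user_data_dir is a hypothetical helper name.

```python
from pathlib import Path


def default_chrome_user_data_dir(system: str, home: Path, local_app_data: str = "") -> Path:
    """Return Google Chrome's default user data directory for an OS name.

    `system` is the value platform.system() reports: Darwin, Windows or Linux.
    """
    if system == "Darwin":
        return home / "Library" / "Application Support" / "Google" / "Chrome"
    if system == "Windows":
        # %LOCALAPPDATA% normally points at <home>\AppData\Local.
        base = Path(local_app_data) if local_app_data else home / "AppData" / "Local"
        return base / "Google" / "Chrome" / "User Data"
    if system == "Linux":
        return home / ".config" / "google-chrome"
    raise ValueError(f"unsupported platform: {system}")


for system in ("Darwin", "Windows", "Linux"):
    print(system, "->", default_chrome_user_data_dir(system, Path.home()))
```

In practice you would pass platform.system() as the first argument and check that the returned path exists before handing it to the browser. Close Chrome first, since a running instance can hold a lock on its profile directory.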

Cloud Profiles for Production

# Sync your local profile to Browser Use Cloud
curl -fsSL https://browser-use.com/profile.sh | BROWSER_USE_API_KEY=your-key sh

Production Deployment

For production workloads, Browser Use offers cloud infrastructure:

browser = Browser(
    use_cloud=True,  # Use stealth browsers in the cloud
)

The cloud offering provides:

Stealth browsers: Avoid detection and CAPTCHA challenges
Proxy rotation: Prevent IP blocks
Scalable infrastructure: Run many agents in parallel
Memory management: Chrome is memory-hungry; the cloud service handles it for you
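
Running many agents in parallel still benefits from a concurrency cap on your side, whether for rate limits or cost control. Here is a minimal asyncio sketch of that pattern. The fake_agent coroutine is a stand-in for creating an Agent and awaiting agent.run(); run_all and its parameters are hypothetical names, not part of Browser Use.

```python
import asyncio


async def run_all(tasks, worker, max_concurrent=3):
    """Run worker(task) for every task, at most max_concurrent at a time."""
    gate = asyncio.Semaphore(max_concurrent)

    async def guarded(task):
        async with gate:
            return await worker(task)

    # return_exceptions=True collects failures as values instead of raising the first one.
    return await asyncio.gather(*(guarded(t) for t in tasks), return_exceptions=True)


async def fake_agent(task: str) -> str:
    """Stand-in for building an Agent and awaiting agent.run()."""
    await asyncio.sleep(0.01)
    return f"done: {task}"


results = asyncio.run(
    run_all(["task A", "task B", "task C", "task D"], fake_agent, max_concurrent=2)
)
print(results)
```

Results come back in the same order as the input tasks, so you can zip them together to see which task produced which result or exception.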

Sandbox Deployment

For isolated, production-ready runs:

import asyncio

from browser_use import Browser, sandbox, ChatBrowserUse
from browser_use.agent.service import Agent

@sandbox()
async def scrape_task(browser: Browser):
    agent = Agent(
        task="Find the current Bitcoin price on CoinGecko",
        browser=browser,
        llm=ChatBrowserUse()
    )
    return await agent.run()

# Runs in isolated environment
result = asyncio.run(scrape_task())

Integration with AI Coding Tools

Claude Code Skill

Browser Use provides a skill file for Claude Code users:

mkdir -p ~/.claude/skills/browser-use
curl -o ~/.claude/skills/browser-use/SKILL.md \
  https://raw.githubusercontent.com/browser-use/browser-use/main/skills/browser-use/SKILL.md

Now you can ask Claude Code to “open the browser and check my email” and it knows how.

LLM Context File

For AI agents that support context files:

https://docs.browser-use.com/llms-full.txt

Point your coding agent (Cursor, Continue, etc.) to this URL for full Browser Use documentation in context.

Performance and Pricing

ChatBrowserUse (Their Optimized Model)

Completes tasks 3-5x faster than general-purpose models, according to the project
Optimized specifically for browser automation
Pricing per 1M tokens:
- Input: $0.20
- Cached input: $0.02
- Output: $2.00
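
To turn those rates into a per-run figure, multiply each token count by its rate and divide by one million. The sketch below does exactly that, assuming cached input tokens are billed at the cached rate instead of the regular input rate. The token counts in the example are made up.

```python
# USD per 1M tokens, copied from the price list above.
PRICE_PER_MILLION = {"input": 0.20, "cached_input": 0.02, "output": 2.00}


def estimate_cost(input_tokens: int, cached_input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one run from its token counts."""
    total = (
        input_tokens * PRICE_PER_MILLION["input"]
        + cached_input_tokens * PRICE_PER_MILLION["cached_input"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


# Hypothetical run: 150k fresh input, 50k cached input, 10k output tokens.
print(f"${estimate_cost(150_000, 50_000, 10_000):.4f}")
```

For that hypothetical run the parts are $0.030 for input, $0.001 for cached input, and $0.020 for output, or about $0.051 in total. Output tokens cost ten times as much as input tokens, so tasks that make the agent write a lot drive the bill more than tasks that make it read a lot.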

Using Your Own LLM

Completely free for the Browser Use library itself. You pay only for:

Your chosen LLM provider’s API costs
Cloud infrastructure if you use their hosted browsers

Quick Reference

Task	Command/Code
Install	`uv add browser-use`
Install browser	`uvx browser-use install`
Open URL (CLI)	`browser-use open https://example.com`
See elements	`browser-use state`
Click element	`browser-use click <index>`
Screenshot	`browser-use screenshot output.png`
Generate template	`uvx browser-use init --template default`

The Bottom Line

Browser Use represents a fundamental shift in how we approach web automation. Instead of brittle scripts that break with every UI change, you describe what you want in natural language and let an AI agent figure out the implementation details.

At 77,000+ GitHub stars and growing, it’s clearly resonating with developers tired of maintaining Selenium scripts. Whether you’re automating job applications, monitoring prices, or building AI assistants that can interact with the web, Browser Use removes the friction between intent and execution.

The combination of open-source flexibility (use any LLM) with production-ready cloud infrastructure (when you need it) makes it suitable for everything from weekend projects to enterprise automation pipelines.

Links: