TL;DR for AI Agents
Browser Use is a Python library (77K+ GitHub stars) that enables AI agents to interact with websites. Key capabilities:
- Natural language browsing: Agent receives task → navigates web → completes actions
- Multi-LLM support: Works with OpenAI, Anthropic, Google, Ollama (local), and Browser Use's own ChatBrowserUse model
- Built on Playwright: Reliable browser automation under the hood
- CLI included: Quick commands for browser control without code
- Production-ready: Cloud offering with stealth browsers and proxy rotation
Install: uv add browser-use | Docs: https://docs.browser-use.com | GitHub: https://github.com/browser-use/browser-use
Web automation has always been a pain point for developers. Selenium scripts break constantly, CSS selectors become outdated, and maintaining automation code is a full-time job. Browser Use takes a radically different approach: instead of writing brittle automation scripts, you simply tell an AI agent what you want done, and it figures out how to do it.
What Makes Browser Use Different
Traditional web automation looks like this:
# Old way: Fragile selectors that break constantly
driver.find_element(By.CSS_SELECTOR, "#login-form input[name='email']").send_keys(email)
driver.find_element(By.CSS_SELECTOR, "#login-form input[name='password']").send_keys(password)
driver.find_element(By.CSS_SELECTOR, "#login-form button[type='submit']").click()
Browser Use automation looks like this:
# New way: Natural language instructions
agent = Agent(
    task="Log into my account with email user@example.com and password secret123",
    llm=llm,
    browser=browser,
)
await agent.run()
The AI agent sees the page, understands the context, and works out how to accomplish the task. When the website changes its layout, the agent re-reads the page on the next run instead of relying on a hard-coded selector, so most layout changes need no code changes.
Getting Started in 5 Minutes
Browser Use uses uv for package management. Here’s the quickest path to your first automation:
Step 1: Set Up Environment
# Create new project
uv init my-browser-agent
cd my-browser-agent
# Install Browser Use
uv add browser-use
uv sync
# Install Chromium browser
uvx browser-use install
Step 2: Get Your API Key
You’ll need an LLM to power the agent. The easiest option is Browser Use’s own model, optimized for browser tasks:
- Go to cloud.browser-use.com/new-api-key
- Sign up (new accounts get $10 free credits)
- Add to your .env file:
BROWSER_USE_API_KEY=your-key-here
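Before running an agent, it can help to confirm that the key is actually visible to your script. The sketch below is one way to check using only the standard library. The helper names `read_env_file` and `require_key` are hypothetical and not part of Browser Use, and the parser only handles simple KEY=value lines.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def require_key(name: str = "BROWSER_USE_API_KEY", path: str = ".env") -> str:
    """Return the key from the environment or the .env file, or raise."""
    value = os.environ.get(name) or read_env_file(path).get(name)
    if not value:
        raise RuntimeError(f"{name} is not set in the environment or in {path}")
    return value
```

Calling `require_key()` at the top of a script fails fast with a readable message instead of a provider error halfway through a run.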
Step 3: Run Your First Agent
from browser_use import Agent, Browser, ChatBrowserUse
import asyncio

async def example():
    browser = Browser()
    llm = ChatBrowserUse()
    agent = Agent(
        task="Go to Hacker News and find the top story title",
        llm=llm,
        browser=browser,
    )
    # run() returns the run history; final_result() extracts the agent's final answer
    history = await agent.run()
    print(f"Agent completed. Final result: {history.final_result()}")
    return history

if __name__ == "__main__":
    asyncio.run(example())
That’s it. The agent opens a browser, navigates to Hacker News, reads the page, and returns the top story.
Using Your Preferred LLM
Browser Use isn’t locked to its own cloud model. You can use any major LLM provider:
OpenAI
from browser_use import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")
agent = Agent(task="Your task", llm=llm, browser=browser)
Anthropic Claude
from browser_use import ChatAnthropic
llm = ChatAnthropic(model="claude-3-5-sonnet-20241022")
agent = Agent(task="Your task", llm=llm, browser=browser)
Local Models with Ollama
For privacy-conscious setups or offline use:
from browser_use import ChatOllama
llm = ChatOllama(model="llama3.2")
agent = Agent(task="Your task", llm=llm, browser=browser)
The CLI: Browser Automation Without Code
Browser Use includes a CLI for quick, interactive automation:
# Open a browser and navigate
browser-use open https://github.com
# See all clickable elements with indices
browser-use state
# Click element by index
browser-use click 5
# Type text
browser-use type "browser-use"
# Take a screenshot
browser-use screenshot current-page.png
# Close the browser
browser-use close
The browser stays open between commands, making it perfect for exploration and debugging. Chain commands together to build quick automations without writing Python.
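If you do want to script such a chain, one option is to drive the same CLI from Python. The sketch below uses only the subcommands listed above; the helper names `build_commands` and `run_commands` are hypothetical, and in practice you would read the element index from the `browser-use state` output rather than hard-coding it.

```python
import subprocess


def build_commands(url: str, click_index: int, text: str, screenshot_path: str) -> list[list[str]]:
    """Return argv lists for one session: open, inspect, click, type, capture, close."""
    return [
        ["browser-use", "open", url],
        ["browser-use", "state"],
        ["browser-use", "click", str(click_index)],
        ["browser-use", "type", text],
        ["browser-use", "screenshot", screenshot_path],
        ["browser-use", "close"],
    ]


def run_commands(commands: list[list[str]]) -> None:
    """Run each command in order and stop at the first non-zero exit code."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Example (requires the browser-use CLI on PATH):
# run_commands(build_commands("https://github.com", 5, "browser-use", "current-page.png"))
```

Passing each command as an argument list avoids shell quoting problems when the typed text contains spaces or quotes.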
Real-World Use Cases
1. Job Application Automation
agent = Agent(
    task="""
    Fill out the job application on this page using my resume:
    - Name: John Smith
    - Email: john.smith@example.com
    - Phone: 555-0123
    - Upload resume from ~/Documents/resume.pdf
    - Write a cover letter emphasizing my Python experience
    """,
    llm=llm,
    browser=browser,
)
2. E-commerce Price Monitoring
agent = Agent(
    task="""
    Go to amazon.com, search for "mechanical keyboard",
    find the top 5 results, and record their names and prices.
    Return the data as JSON.
    """,
    llm=llm,
    browser=browser,
)
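Because the agent's answer comes back as text, it is worth validating the JSON before storing it. The sketch below assumes you already have the final answer as a string and that the agent used `name` and `price` keys. That schema and the names `Product` and `parse_products` are illustrative assumptions, not something Browser Use guarantees.

```python
import json
from dataclasses import dataclass

FENCE = "`" * 3  # Markdown code fence that models sometimes wrap JSON in


@dataclass
class Product:
    name: str
    price: str


def parse_products(raw: str) -> list[Product]:
    """Parse a JSON array of {"name", "price"} objects, tolerating a Markdown fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with optional language tag) and the closing fence.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    items = json.loads(text)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of products")
    return [Product(name=str(item["name"]), price=str(item["price"])) for item in items]
```

Prices are kept as strings here because currency symbols and separators vary by locale; convert them in a separate step once you know the format.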
3. Social Media Management
agent = Agent(
    task="""
    Log into Twitter, go to my notifications,
    and summarize any mentions from the last 24 hours.
    """,
    llm=llm,
    browser=browser,
)
Custom Tools: Extending the Agent
You can add custom capabilities to your agent:
from browser_use import Tools

tools = Tools()

@tools.action(description='Save data to a local file')
def save_to_file(filename: str, content: str) -> str:
    with open(filename, 'w') as f:
        f.write(content)
    return f"Saved to {filename}"

@tools.action(description='Send a Slack notification')
def notify_slack(message: str) -> str:
    # Your Slack webhook logic here
    return "Notification sent"

agent = Agent(
    task="Scrape product prices and save to prices.json, then notify Slack",
    llm=llm,
    browser=browser,
    tools=tools,
)
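The Slack placeholder above could be filled in along these lines. This is a minimal sketch assuming a Slack incoming webhook, which accepts a JSON body with a `text` field. The helper `build_slack_request` and the extra `webhook_url` parameter are illustrative additions, and the URL in the usage note is a dummy value.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, message: str) -> urllib.request.Request:
    """Build the POST request for a Slack incoming webhook (JSON body with a "text" field)."""
    body = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify_slack(message: str, webhook_url: str) -> str:
    """Send the message and report the HTTP status returned by Slack."""
    request = build_slack_request(webhook_url, message)
    with urllib.request.urlopen(request, timeout=10) as response:
        return f"Notification sent (HTTP {response.status})"
```

When registering this as a tool, you would likely read the webhook URL from an environment variable inside the function, so the agent only has to supply `message`. For example, `notify_slack("Prices saved", "https://hooks.slack.com/services/XXX/YYY/ZZZ")` posts one message to the channel tied to that webhook.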
Handling Authentication
For sites requiring login, you have options:
Use Your Existing Browser Profile
browser = Browser(
    user_data_dir="/path/to/your/chrome/profile",  # Chrome user data directory
    profile_directory="Default",  # profile folder inside that directory
)
This reuses your existing Chrome profile with all saved logins, cookies, and sessions.
Cloud Profiles for Production
# Sync your local profile to Browser Use Cloud
curl -fsSL https://browser-use.com/profile.sh | BROWSER_USE_API_KEY=your-key sh
Production Deployment
For production workloads, Browser Use offers cloud infrastructure:
browser = Browser(
    use_cloud=True,  # Use stealth browsers in the cloud
)
The cloud offering provides:
- Stealth browsers: Avoid detection and CAPTCHA challenges
- Proxy rotation: Prevent IP blocks
- Scalable infrastructure: Run many agents in parallel
- Memory management: Chrome is memory-hungry; the cloud service handles it for you
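Running many agents in parallel usually means capping concurrency so you do not exhaust rate limits or memory. The sketch below shows one common pattern using an `asyncio.Semaphore`. The names `run_in_parallel` and `fake_run_task` are hypothetical; in real use, `run_task` would build an `Agent` for the given task and await `agent.run()`.

```python
import asyncio
from typing import Awaitable, Callable


async def run_in_parallel(
    tasks: list[str],
    run_task: Callable[[str], Awaitable[str]],
    max_concurrent: int = 3,
) -> list[str]:
    """Run run_task for every task description, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(task: str) -> str:
        async with semaphore:
            return await run_task(task)

    # gather() returns results in the same order as the input tasks.
    return await asyncio.gather(*(guarded(task) for task in tasks))


async def fake_run_task(task: str) -> str:
    """Stand-in for creating an Agent and awaiting agent.run()."""
    await asyncio.sleep(0.01)
    return f"done: {task}"
```

With the stand-in, `asyncio.run(run_in_parallel(["task A", "task B"], fake_run_task))` returns the two results in input order; swapping in a real agent runner keeps the same concurrency cap.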
Sandbox Deployment
For isolated, production-ready runs:
from browser_use import Browser, sandbox, ChatBrowserUse
from browser_use.agent.service import Agent
import asyncio

@sandbox()
async def scrape_task(browser: Browser):
    agent = Agent(
        task="Find the current Bitcoin price on CoinGecko",
        browser=browser,
        llm=ChatBrowserUse(),
    )
    return await agent.run()

# Runs in isolated environment
result = asyncio.run(scrape_task())
Integration with AI Coding Tools
Claude Code Skill
Browser Use provides a skill file for Claude Code users:
mkdir -p ~/.claude/skills/browser-use
curl -o ~/.claude/skills/browser-use/SKILL.md \
https://raw.githubusercontent.com/browser-use/browser-use/main/skills/browser-use/SKILL.md
Now you can ask Claude Code to “open the browser and check my email” and it knows how.
LLM Context File
For AI agents that support context files:
https://docs.browser-use.com/llms-full.txt
Point your coding agent (Cursor, Continue, etc.) to this URL for full Browser Use documentation in context.
Performance and Pricing
ChatBrowserUse (Their Optimized Model)
- According to Browser Use, completes tasks 3-5x faster than general-purpose models
- Optimized specifically for browser automation
- Pricing per 1M tokens:
  - Input: $0.20
  - Cached input: $0.02
  - Output: $2.00
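To turn these rates into a per-run estimate, multiply each token count by its rate and divide by one million. The sketch below does that arithmetic. It assumes cached input tokens are billed separately from regular input tokens, and the function name `estimate_cost` is illustrative.

```python
# Listed ChatBrowserUse rates in USD per 1M tokens.
INPUT_PER_MILLION = 0.20
CACHED_INPUT_PER_MILLION = 0.02
OUTPUT_PER_MILLION = 2.00


def estimate_cost(input_tokens: int, cached_input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one run at the listed rates."""
    return (
        input_tokens * INPUT_PER_MILLION
        + cached_input_tokens * CACHED_INPUT_PER_MILLION
        + output_tokens * OUTPUT_PER_MILLION
    ) / 1_000_000
```

For example, a run with 200,000 input tokens, 300,000 cached input tokens, and 10,000 output tokens comes to about $0.066, with input costing roughly twice as much as output despite the tenfold price gap.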
Using Your Own LLM
The Browser Use library itself is free and open source. You pay only for:
- Your chosen LLM provider’s API costs
- Cloud infrastructure if you use their hosted browsers
Quick Reference
| Task | Command/Code |
|---|---|
| Install | uv add browser-use |
| Install browser | uvx browser-use install |
| Open URL (CLI) | browser-use open https://example.com |
| See elements | browser-use state |
| Click element | browser-use click <index> |
| Screenshot | browser-use screenshot output.png |
| Generate template | uvx browser-use init --template default |
The Bottom Line
Browser Use represents a fundamental shift in how we approach web automation. Instead of brittle scripts that break with every UI change, you describe what you want in natural language and let an AI agent figure out the implementation details.
At 77,000+ GitHub stars and growing, it’s clearly resonating with developers tired of maintaining Selenium scripts. Whether you’re automating job applications, monitoring prices, or building AI assistants that can interact with the web, Browser Use removes the friction between intent and execution.
The combination of open-source flexibility (use any LLM) with production-ready cloud infrastructure (when you need it) makes it suitable for everything from weekend projects to enterprise automation pipelines.
Links:
- GitHub: browser-use/browser-use
- Documentation: docs.browser-use.com
- Cloud: cloud.browser-use.com
- Examples: docs.browser-use.com/examples