Build Your Own Web-Browsing AI Agent with MCP and OpenAI gpt-oss
A hands-on guide for junior developers, content creators, and curious minds
Table of Contents
- Why This Guide Exists
- What You Will Build
- Background: The MCP Ecosystem
- Prerequisites: Tools & Accounts
- Project 1: Local Browser Agent
- Project 2: Hugging Face MCP Hub
- Frequently Asked Questions
- Next Steps & Roadmap
Why This Guide Exists
If you have ever wished for an assistant that can open web pages, grab the latest AI model rankings, and even create images for your blog—all without you touching a browser—this tutorial is for you.
We will use MCP (Model Context Protocol) and OpenAI gpt-oss-120b through Fireworks AI. All code is already inside the `/browser-agent` and `/hf-mcp-server` directories of the repository, so you can copy-paste and start experimenting immediately.
What You Will Build
Project | Core Skill | End Result |
---|---|---|
Local Browser Agent | Web automation | An agent that browses, searches, screenshots, and reports back in plain English |
Hugging Face MCP Hub | AI-space orchestration | An agent that can call thousands of Hugging Face Spaces (text-to-image, text-to-video, etc.) |
No GPU is required; everything runs either locally or on managed services.
Background: The MCP Ecosystem
1. What is MCP?
MCP is a lightweight protocol that lets any language model use external tools through a standardized interface.
Think of it as a USB-C port for AI: one plug, many devices.
2. Three Moving Parts
Part | Real-World Analogy | Responsibility |
---|---|---|
MCP Server | Browser plug-in | Exposes one specific capability (e.g., open Chrome, call an API) |
MCP Client | Browser itself | Decides which plug-ins to load, passes user requests, returns results |
Agent | End user | Writes or speaks a request in natural language |
3. Why not use plain Function Calling?
Function Calling is tied to a single provider. MCP is provider-agnostic: today you run gpt-oss, tomorrow you swap to another model—no code change required.
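To make "no code change" concrete: switching models means editing two fields in `agent.json` and nothing else. The model and provider below are illustrative placeholders, not a recommendation:

```json
{
  "model": "Qwen/Qwen2.5-72B-Instruct",
  "provider": "nebius",
  "servers": [
    {
      "type": "stdio",
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  ]
}
```

The `servers` block is untouched: the MCP tools neither know nor care which model is calling them.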
Prerequisites: Tools & Accounts
Item | Version | Purpose |
---|---|---|
Node.js | 18 or higher | Runs the Playwright MCP Server |
Python | 3.9 or higher | Runs the Tiny Agents client (optional) |
Git | any | Clone the repository |
Hugging Face Token | free | Authenticates you for both demos |
Get Your Hugging Face Token
- Visit https://huggingface.co/settings/tokens
- Create a token with Write permission
- In your terminal, run `huggingface-cli login` and paste the token when prompted.
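If you would rather not go through the interactive login, the Hugging Face tooling also reads the `HF_TOKEN` environment variable. A quick sketch (the token value is a placeholder, and this assumes a bash-like shell):

```shell
# Export the token for the current shell session only.
# Replace the placeholder with your real token from the settings page.
export HF_TOKEN="hf_xxx_your_token_here"

# Print a short prefix to confirm it is set (never echo the full token).
echo "Token set: ${HF_TOKEN:0:6}..."
```

An exported variable disappears when you close the terminal, which some people prefer to a credential saved on disk.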
Project 1: Local Browser Agent
Goal: In under 15 minutes, have an agent that can open websites, search, and take screenshots for you.
Step 0: Quick Checklist
- [ ] Node 18+ installed
- [ ] Hugging Face token saved
- [ ] Terminal open in the project root
Step 1: Explore the Provided Files
All required files live in `/browser-agent`:

- `agent.json` – tells the Tiny Agents client which model and tools to use
- `PROMPT.md` – optional system prompt that tells the AI to plan, reflect, and never guess
Step 2: Understand agent.json
Open `browser-agent/agent.json`:
```json
{
  "model": "openai/gpt-oss-120b",
  "provider": "fireworks-ai",
  "servers": [
    {
      "type": "stdio",
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  ]
}
```
Key | Meaning |
---|---|
`model` | The exact model string expected by Fireworks AI |
`provider` | Where the inference actually happens |
`servers` | A single entry that launches the Playwright MCP Server via Node |
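To make the structure concrete, here is a small, self-contained Python sketch that loads an `agent.json` and sanity-checks the keys from the table above. This validator is our own illustration, not part of Tiny Agents:

```python
import json

# Top-level keys a browser-agent config needs (per the table above).
REQUIRED_KEYS = {"model", "provider", "servers"}

def validate_agent_config(text: str) -> dict:
    """Parse agent.json text and check that the required keys are present."""
    config = json.loads(text)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"agent.json is missing keys: {sorted(missing)}")
    return config

example = """{
  "model": "openai/gpt-oss-120b",
  "provider": "fireworks-ai",
  "servers": [
    {"type": "stdio", "command": "npx", "args": ["@playwright/mcp@latest"]}
  ]
}"""

config = validate_agent_config(example)
print(config["model"])                   # openai/gpt-oss-120b
print(config["servers"][0]["command"])   # npx
```

Running a check like this before launching the agent turns a cryptic startup failure into a clear error message.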
The Playwright MCP Server exposes browser automation actions such as:
- `navigate(url)`
- `screenshot()`
- `click(selector)`
- `type(selector, text)`
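Under the hood, the client invokes each of these actions with a standard MCP `tools/call` message over the stdio connection. A representative JSON-RPC request looks like this (exact tool names and argument schemas depend on the Playwright MCP version you install):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "navigate",
    "arguments": { "url": "https://huggingface.co/models" }
  }
}
```

You never write these messages yourself; the Tiny Agents client generates them from the model's tool-call output.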
Step 3: Install the Python Tiny Agents Client
If you prefer Python:

```shell
pip install -U "huggingface_hub[mcp]>=0.32.0"
```

If you prefer Node:

```shell
npm install -g @huggingface/tiny-agents
```
Step 4: Run the Agent
```shell
# Python
tiny-agents run ./browser-agent

# Node
npx @huggingface/tiny-agents run ./browser-agent
```
The first run downloads Chromium through Playwright; grab a coffee.
Step 5: Talk to Your Agent
Example prompt:
“Please open https://huggingface.co/models, sort by ‘Most Downloads this week’, grab the top 10 model names, and save a screenshot of the list.”
The agent will:
- Launch a headless browser
- Navigate to the URL
- Parse the table
- Return a neat list plus a screenshot saved locally
Project 2: Hugging Face MCP Hub
Goal: Let your agent tap into thousands of AI Spaces on Hugging Face (text-to-image, text-to-video, audio, etc.).
Step 1: Register Spaces on Hugging Face
- Visit https://hf.co/mcp
- Click “Add” next to any Space you like. Popular picks:
  - `evalstate/FLUX.1-Krea-dev` – high-quality text-to-image
  - `evalstate/ltx-video-distilled` – text-to-video
- Note your User Access Token (same as before).
Step 2: Create a New Folder
Create a folder named `hf-mcp-server` and place this `agent.json` inside:
```json
{
  "model": "openai/gpt-oss-120b",
  "provider": "fireworks-ai",
  "inputs": [
    {
      "type": "promptString",
      "id": "hf-token",
      "description": "Your Hugging Face Token",
      "password": true
    }
  ],
  "servers": [
    {
      "type": "http",
      "url": "https://huggingface.co/mcp",
      "headers": {
        "Authorization": "Bearer ${input:hf-token}"
      }
    }
  ]
}
```
New Element | Purpose |
---|---|
`inputs` | Prompts you for the token at runtime—safer than hard-coding |
`servers` | Points to Hugging Face’s MCP gateway, which forwards calls to the Spaces you registered |
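The `${input:hf-token}` placeholder is substituted by the client before the header is ever sent. A rough, illustrative Python sketch of that substitution (our own helper, not the client's actual code):

```python
def resolve_inputs(headers: dict, input_values: dict) -> dict:
    """Replace ${input:<id>} placeholders with the values the user typed."""
    resolved = {}
    for key, value in headers.items():
        for input_id, secret in input_values.items():
            value = value.replace("${input:" + input_id + "}", secret)
        resolved[key] = value
    return resolved

headers = {"Authorization": "Bearer ${input:hf-token}"}
print(resolve_inputs(headers, {"hf-token": "hf_example_token"}))
# {'Authorization': 'Bearer hf_example_token'}
```

Because the secret is injected at runtime, the `agent.json` you commit to Git never contains a real token.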
Step 3: Run It
```shell
# Python
tiny-agents run ./hf-mcp-server

# Node
npx @huggingface/tiny-agents run ./hf-mcp-server
```
Step 4: Prompt Examples
“Using FLUX.1, generate a 1024×1024 image of an astronaut eating ramen on the moon. Use cinematic lighting.”
The agent will:
- Ask for confirmation
- Call the Space
- Return a shareable URL to the generated image
Frequently Asked Questions
Q1: Do I need a GPU?
No. Inference happens on Fireworks AI’s cloud or Hugging Face’s cloud.
Q2: Is my token safe?
Mostly, yes. The token you enter at the runtime prompt (Project 2) lives only in memory and is masked when typed. Note that `huggingface-cli login` (Project 1) does save the token to your local Hugging Face cache, as any stored CLI credential would be.
Q3: Can I run both projects at the same time?
Absolutely. Each project has its own `agent.json` and can be started in separate terminals.
Q4: How do I add a custom tool?
Write a small MCP Server in any language (templates are available in the MCP docs), then add one more entry under `servers`.
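For instance, assuming you wrote a local weather server started with `node weather-server.js` (both names hypothetical), the `servers` array in `agent.json` would grow to:

```json
"servers": [
  {
    "type": "stdio",
    "command": "npx",
    "args": ["@playwright/mcp@latest"]
  },
  {
    "type": "stdio",
    "command": "node",
    "args": ["./weather-server.js"]
  }
]
```

The client launches every listed server and merges their tools into one toolbox for the model.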
Q5: What if I want to use a different model?
Change the `model` and `provider` fields in `agent.json`. As long as the new model supports tool calling and the provider is supported by your client, no further code changes are needed.
Q6: Can I run this on Windows?
Yes. Both Node and Python clients are cross-platform.
Next Steps & Roadmap
Time Investment | Action | Outcome |
---|---|---|
30 min | Re-run both demos with your own prompts | Solid muscle memory |
1 day | Wrap your company’s REST API as an MCP Server | Internal AI assistant |
1 week | Chain multiple agents | Browser agent gathers data, Hugging Face agent creates visuals, database agent stores results |
Closing Thoughts
MCP turns the chaotic world of AI tools into tidy building blocks.
Today you connected two blocks—browser automation and Hugging Face Spaces—without changing a single line of server code.
Tomorrow you can snap in a database block, an email block, or a custom analytics block, and your agent will keep working exactly as before.
Clone the repo, run the commands, and start experimenting. The only limit is your imagination.