Mobile-Use: Let Your Phone Work for You—A Plain-English Global Guide
“Open Gmail, find the first three unread messages, and list the sender and subject line in JSON.”
Say it. Watch it happen.
1. What Exactly Is Mobile-Use?
Mobile-use is an open-source AI agent that drives your Android or iOS device with nothing more than natural language. You speak or type a request, and the program:
- understands what you want
- interacts with the user interface exactly like a human would
- returns the result in the exact format you asked for: JSON, plain text, CSV, or even Markdown
No code, no macros, no complex scripting. Just words.
2. Why You Might Care
| Everyday Pain | Old Way | Mobile-Use Way |
|---|---|---|
| Exporting unread emails daily | Screenshot → OCR → spreadsheet | One sentence → JSON file |
| Tracking 5 apps’ daily active users | Open each app → scroll → copy numbers | One sentence → consolidated table |
| Helping parents use smartphones | Video call instructions | Read the sentence aloud → phone does it |
3. Core Capabilities in Plain English
- Natural Language Control: Speak or type in any major language. The agent figures out the rest.
- UI-Aware Automation: It sees buttons, icons, and text the same way you do, so layout changes do not break the flow.
- Structured Data Extraction: Anything visible on screen can be scraped and delivered as JSON, CSV, Markdown, or plain text.
- Swappable AI Brain: Use OpenAI by default or swap in Claude, Gemini, or a local model by editing a single JSON file.
4. Benchmark Snapshot
Mobile-use is #1 on the open-source pass@1 leaderboard of the AndroidWorld benchmark.
Full leaderboard: Google Sheets link
5. Quick-Start Checklist
| Task | Android Physical | Android Emulator | iOS Simulator |
|---|---|---|---|
| Enable debugging | Settings → Developer Options → USB Debugging | Built-in | macOS + Xcode |
| Required tool | ADB | ADB | Xcode |
| First-time connection | USB cable + on-device prompt | None | None |
| Network | Same Wi-Fi as computer | Same subnet | Localhost |
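Before moving on, it can help to confirm that ADB actually sees your device. The sketch below parses the text that `adb devices` prints. The helper names `parse_adb_devices` and `connected_devices` are illustrative and not part of mobile-use, and the serials in the sample are made up.

```python
import shutil
import subprocess


def parse_adb_devices(output: str) -> dict[str, str]:
    """Map each serial in `adb devices` output to its state (e.g. 'device')."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the "List of devices attached" header, and daemon notices.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def connected_devices() -> dict[str, str]:
    """Run `adb devices` and parse the result; raises if adb is not on PATH."""
    if shutil.which("adb") is None:
        raise RuntimeError("adb not found on PATH; install Android platform-tools")
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


# Invented sample output, so the parser can be tried without a device attached.
sample = "List of devices attached\nemulator-5554\tdevice\nR58M123ABC\tunauthorized\n"
print(parse_adb_devices(sample))
```

A state of `unauthorized` usually means the on-device USB debugging prompt has not been accepted yet.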
6. Two Ways to Install
Route A: One-Line Docker (Beginner-Friendly)
Prerequisites
- Docker installed
- Physical Android device or emulator on the same Wi-Fi network as your computer
Run the Script
macOS / Linux
```bash
chmod +x mobile-use.sh
./mobile-use.sh \
  "Open Gmail, find the first three unread emails, and list their sender and subject line" \
  --output-description "A JSON list of objects, each with 'sender' and 'subject' keys"
```
Windows (PowerShell)
```powershell
powershell.exe -ExecutionPolicy Bypass -File mobile-use.ps1 `
  "Open Gmail, find the first three unread emails, and list their sender and subject line" `
  --output-description "A JSON list of objects, each with 'sender' and 'subject' keys"
```
The terminal may pause and ask whether Maestro can collect anonymous usage data. Type Y or n and press Enter.
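Because the result comes from a language model, it is worth checking that it really matches the shape you described before feeding it to another tool. The following is a minimal sketch, assuming you have captured the agent's result as a string. `parse_unread_json` is a hypothetical helper and the sample data is invented.

```python
import json

REQUIRED_KEYS = {"sender", "subject"}


def parse_unread_json(text: str) -> list[dict]:
    """Parse the result and check it is a list of objects with the expected keys."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list")
    for index, item in enumerate(data):
        if not isinstance(item, dict) or not REQUIRED_KEYS.issubset(item):
            raise ValueError(f"item {index} lacks 'sender' or 'subject'")
    return data


# Invented sample in the shape requested by --output-description.
sample = (
    '[{"sender": "alice@example.com", "subject": "Q3 report"},'
    ' {"sender": "bob@example.com", "subject": "Lunch?"}]'
)
for mail in parse_unread_json(sample):
    print(f"{mail['sender']}: {mail['subject']}")
```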
Common Hiccups
| Error | Meaning | Quick Fix |
|---|---|---|
| Could not get device IP | Wi-Fi interface name is unusual | Run `adb shell ip addr show up`, find the interface, then add `--interface <NAME>` |
| Failed to connect to <DEVICE_IP>:5555 | Firewall blocked the port | Temporarily disable the firewall or open port 5555 |
| unauthorized: authentication required (Docker) | Old ghcr.io token | Run `docker logout ghcr.io`, then rerun |
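To choose a value for `--interface`, you need the name of the interface that carries the phone's Wi-Fi address. Here is a minimal sketch, assuming the usual iproute2 layout of `ip addr` output. `ipv4_by_interface` is a hypothetical helper, and the sample text is invented for illustration.

```python
import re

# "34: wlan0: <BROADCAST,...>" -> interface name "wlan0"
HEADER = re.compile(r"^\d+:\s+([^:@\s]+)")
# "    inet 192.168.1.42/24 ..." -> IPv4 address (inet6 lines do not match)
INET = re.compile(r"^\s+inet\s+(\d{1,3}(?:\.\d{1,3}){3})/")


def ipv4_by_interface(ip_addr_output: str) -> dict[str, str]:
    """Return {interface: first non-loopback IPv4 address} from `ip addr` text."""
    result = {}
    current = None
    for line in ip_addr_output.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        inet = INET.match(line)
        if inet and current and not inet.group(1).startswith("127."):
            result.setdefault(current, inet.group(1))
    return result


# Invented sample of what `adb shell ip addr show up` might print.
sample = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    inet 127.0.0.1/8 scope host lo
34: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether aa:bb:cc:dd:ee:ff brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.42/24 brd 192.168.1.255 scope global wlan0
    inet6 fe80::1/64 scope link
"""
print(ipv4_by_interface(sample))
```

The interface whose address sits on the same subnet as your computer is the one to pass as `<NAME>`.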
Route B: Manual Dev Setup (Full Control)
1. Clone
```bash
git clone https://github.com/minitap-ai/mobile-use.git
cd mobile-use
```
2. Environment Variables
```bash
cp .env.example .env
# Edit .env and add at least OPENAI_API_KEY
```
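A missing or empty key is an easy mistake to make at this step. The sketch below checks `.env` text for required variables. It assumes simple `KEY=value` lines, and `parse_env` and `missing_keys` are illustrative helpers rather than anything shipped with the project.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        # Drop surrounding quotes so KEY="abc" and KEY=abc are treated alike.
        values[key] = value.strip().strip("'\"")
    return values


def missing_keys(text: str, required=("OPENAI_API_KEY",)) -> list[str]:
    """Return the required keys that are absent or empty."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]


print(missing_keys("# my settings\nOPENAI_API_KEY=\n"))
```

To check a real file, pass its contents, for example `missing_keys(open(".env").read())`.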
3. Virtual Environment
```bash
uv venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
uv sync
```
4. First Command
```bash
python ./src/mobile_use/main.py "Open Settings and tell me my current battery level"
```
7. Practical Walk-Throughs
Walk-Through 1: Battery Check
```bash
python ./src/mobile_use/main.py "Show me my battery percentage" \
  --output-description "Plain text percentage only"
```
Sample output:
85%
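If you want to act on that value, for example to warn when the battery is low, it needs to become a number first. The sketch below is one way to do that. `parse_percentage` is a hypothetical helper, and it tolerates extra words around the number in case the model adds any.

```python
import re

# One to three digits, not preceded by another digit, followed by "%".
PERCENT = re.compile(r"(?<!\d)(\d{1,3})\s*%")


def parse_percentage(text: str) -> int:
    """Extract a 0-100 percentage from text such as '85%'."""
    match = PERCENT.search(text)
    if match is None:
        raise ValueError(f"no percentage found in {text!r}")
    value = int(match.group(1))
    if value > 100:
        raise ValueError(f"percentage out of range: {value}")
    return value


level = parse_percentage("85%")
if level < 20:
    print("Battery is low")
else:
    print(f"Battery OK at {level}%")
```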
Walk-Through 2: Email Export
Goal: create a nightly CSV of unread Gmail.
```bash
python ./src/mobile_use/main.py \
  "Open Gmail, collect all unread emails, extract sender and subject" \
  --output-description "CSV with columns sender,subject"
```
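For a nightly job you will probably want to validate the CSV and give each export a dated name. The following is a minimal sketch. It assumes the result starts with a `sender,subject` header row, which the output description asks for but the model is not guaranteed to produce. The helper names and the `unread-YYYY-MM-DD.csv` naming scheme are illustrative choices.

```python
import csv
import io
from datetime import date

EXPECTED_COLUMNS = ["sender", "subject"]


def parse_unread_csv(text: str) -> list[dict[str, str]]:
    """Parse the CSV result and verify its header row."""
    reader = csv.DictReader(io.StringIO(text.strip()))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return list(reader)


def nightly_filename(day: date) -> str:
    """Build a dated file name so each night's export is kept separately."""
    return f"unread-{day.isoformat()}.csv"


# Invented sample; the quoted field shows that commas in subjects survive.
sample = 'sender,subject\nalice@example.com,"Hello, world"\nbob@example.com,Invoice\n'
rows = parse_unread_csv(sample)
print(len(rows), "rows ->", nightly_filename(date(2025, 1, 31)))
```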
Walk-Through 3: Multi-App Workflow
Scenario: Daily report that pulls yesterday’s steps from Google Fit and sleep hours from Samsung Health.
```bash
python ./src/mobile_use/main.py \
  "Open Google Fit, note yesterday’s steps; then open Samsung Health, note yesterday’s sleep hours; return JSON with keys steps,sleep_hours"
```
8. Swapping the AI Brain (LLM)
- Copy the template: `cp llm-config.override.template.jsonc llm-config.override.jsonc`
- Edit `llm-config.override.jsonc`:
  - Change `"provider": "openai"` to `"provider": "claude"` or any other supported backend
  - Add the new API key
  - Save and exit; no restart required
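The `.jsonc` extension means the file may contain comments, which Python's `json` module rejects. The sketch below strips comments and lists every `"provider"` value so you can confirm your edit took effect. The layout in the sample (the `planner`, `executor`, and `endpoint` keys) is purely hypothetical; check the real template for the actual structure. Trailing commas, which some JSONC files allow, are not handled here.

```python
import json


def strip_jsonc_comments(text: str) -> str:
    """Remove // and /* */ comments, leaving string contents untouched."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])  # keep the escaped character as-is
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def find_providers(node, path=""):
    """Yield (path, value) for every "provider" key in the parsed config."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key == "provider" and isinstance(value, str):
                yield child, value
            else:
                yield from find_providers(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_providers(item, f"{path}[{index}]")


# Hypothetical layout, for illustration only.
sample = """{
  // which model plans the steps
  "planner": {"provider": "openai", "endpoint": "https://example.com/v1"},
  /* which model taps the screen */
  "executor": {"provider": "openai"}
}"""
config = json.loads(strip_jsonc_comments(sample))
print(list(find_providers(config)))
```

This sketch only reads the file. Writing the config back with `json.dumps` would discard the comments, so hand-editing remains the safer route.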
9. Global FAQ
Q1: Does it work on a physical iPhone?
A: Not today. The README clearly states “Physical iOS devices are not yet supported.” iOS Simulator on macOS is the only option.
Q2: Is my data safe?
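A: The entire codebase is MIT-licensed and open source, so you can audit it. The agent itself runs on your machine, but your request and the screen content it reads are sent to whichever LLM provider you configure (see Q4), unless you point it at a local model.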
Q3: How accurate is Chinese or non-English text?
A: As long as the app uses standard system fonts, text recognition accuracy is high. Icons without text are handled through contextual reasoning.
Q4: Can I run it offline?
A: No. The agent needs to call the large language model over the internet unless you host a local LLM and point the config file to it.
Q5: Multiple commands in one go?
A: One sentence per run. For batch jobs, wrap the calls in a shell script or cron job.
10. Contributing in Three Steps
- Open an Issue describing the bug or feature.
- Fork the repo and follow the guidelines in CONTRIBUTING.md.
- Submit a Pull Request; the maintainers review quickly.
11. Next Moves
By now you should be able to:
- Explain what mobile-use does in one sentence
- Choose Docker or manual setup
- Run your first natural-language command
- Swap the underlying AI model
- Contribute back to the project
Pick one repetitive task you do on your phone every day, write it as a plain-English sentence, and let mobile-use handle it. You may never tap through the same sequence again.
References
- Project repository: https://github.com/minitap-ai/mobile-use
- AndroidWorld benchmark leaderboard: Google Sheets

