Documentation

Overview

Proberun is a local-first MCP server that lets AI coding agents — Claude Code, Cursor, Codex — drive your iOS Simulator and Android emulator the way Playwright drives a browser. You describe a test in plain language; the agent reads an indexed accessibility tree, calls the right tools, waits between screens, and reports what broke.

Everything runs on your machine. Your source, screenshots, and traces never leave it. The local tier is free and open source; a hosted tier (cloud runs on our simulators and real devices) is opt-in and metered per minute.

New here? Jump to Install, then Quickstart. You'll have an AI testing your app in about five minutes.

Requirements

What | Why | Notes
macOS 14+ | iOS Simulator + idb require it | Apple Silicon or Intel
Xcode + Command Line Tools | Builds your app, runs the Simulator | xcode-select --install
Node 20+ | Runs the MCP server | any LTS or current
Python 3.12 | fb-idb (the iOS bridge) needs ≤3.12 | not 3.13/3.14 yet
Android SDK (optional) | Android emulator support | adb + emulator auto-detected

Android is optional — if you only test iOS you can skip the SDK. The server auto-detects both toolchains and exposes whichever is present.

Install

Three system tools, then the server, then register it with your AI editor.

terminal
# 1. iOS automation bridge
$ brew install facebook/fb/idb-companion
$ pipx install fb-idb --python /opt/homebrew/bin/python3.12
# 2. ffmpeg (for video recording / tracecast)
$ brew install ffmpeg
# 3. the Proberun MCP server
$ npm i -g proberun-cli
# 4. register with Claude Code (or Cursor/Codex)
$ claude mcp add proberun -- proberun
✓ Added stdio MCP server proberun (68 tools)

Verify the toolchain any time with proberun doctor — it checks xcrun, simctl, idb, idb_companion, ffmpeg, and reports booted simulators and data dirs.
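
If you want a quick pre-flight check in your own scripts, the PATH lookup part of that idea can be sketched in a few lines of Python. This is only an illustration, not how proberun doctor is implemented: the function names are made up here, and it only checks that each executable resolves on PATH (simctl is reached through xcrun, so it is not looked up separately).

```python
import shutil

# Executables the text above says `proberun doctor` checks.
# simctl is invoked as `xcrun simctl`, so only xcrun is looked up for it.
TOOLS = ["xcrun", "idb", "idb_companion", "ffmpeg"]


def check_toolchain(tools=TOOLS, which=shutil.which):
    """Map each tool name to its resolved path, or None when it is not on PATH."""
    return {name: which(name) for name in tools}


def missing_tools(report):
    """Return the names whose lookup failed, sorted alphabetically."""
    return sorted(name for name, path in report.items() if path is None)


if __name__ == "__main__":
    report = check_toolchain()
    for name, path in report.items():
        print(f"{'ok' if path else 'MISSING':8} {name} {path or ''}")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use you call check_toolchain() with no arguments.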

On Android, set ANDROID_HOME (or install to the default ~/Library/Android/sdk) and the android_* tools light up automatically.
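
The lookup order described above (ANDROID_HOME first, then the default location) might look roughly like this. It is a sketch under assumptions: resolve_android_sdk is a hypothetical helper, and the server's real detection logic may check more than whether the directory exists.

```python
import os
from pathlib import Path

# Default SDK location named in the text above.
DEFAULT_SDK = Path.home() / "Library" / "Android" / "sdk"


def resolve_android_sdk(env=None, default=DEFAULT_SDK):
    """Return the SDK root to use, or None when no candidate directory exists.

    ANDROID_HOME wins when it is set and points at a directory;
    otherwise the default location is tried.
    """
    env = os.environ if env is None else env
    candidates = []
    if env.get("ANDROID_HOME"):
        candidates.append(Path(env["ANDROID_HOME"]))
    candidates.append(Path(default))
    for root in candidates:
        if root.is_dir():
            return root
    return None
```

A None result corresponds to the iOS-only case: no SDK found, so no android_* tools.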

Quickstart — your first test

Open your AI editor with Proberun connected and describe the test. For example:

Claude Code
you ▸ Use proberun. Boot iPhone 17, build my app at ./MyApp.xcodeproj scheme MyApp, install and launch it, then walk the signup flow and confirm a bad email shows an error.

Behind the scenes the agent calls these tools in sequence:

proberun
$ boot_simulator "iPhone 17"
✓ Booted iPhone 17
$ build_app ./MyApp.xcodeproj MyApp
✓ Built MyApp.app (42s)
$ install_app && launch_app com.you.MyApp
$ ui_snapshot
[3] Button "Sign Up" [7] TextField "Email" value=""
$ tap_index 3 → type_text → tap_text "Continue"
$ wait_for_text "Enter a valid email"
✓ Found in 0.9s — validation works

No selectors, no test code, no maintenance. The agent reasons over the live UI tree and adapts when the screen changes.
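
For the curious, each of those steps travels to the server as an MCP tools/call request over JSON-RPC 2.0. The sketch below builds such requests in Python. The envelope (method, params.name, params.arguments) follows the MCP convention, but the argument keys (device, project, scheme, bundle_id, text) are hypothetical placeholders; check the schemas the server advertises for the real ones.

```python
import itertools
import json

_next_id = itertools.count(1)


def tool_call(tool, /, **arguments):
    """Build one MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Argument keys below are hypothetical, chosen only for illustration.
steps = [
    tool_call("boot_simulator", device="iPhone 17"),
    tool_call("build_app", project="./MyApp.xcodeproj", scheme="MyApp"),
    tool_call("launch_app", bundle_id="com.you.MyApp"),
    tool_call("ui_snapshot"),
    tool_call("wait_for_text", text="Enter a valid email"),
]

for step in steps:
    print(json.dumps(step))
```

You never write these by hand; the agent picks the tool and fills in the arguments from your plain-language prompt.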

Core concepts

Indexed snapshot

ui_snapshot returns a compact, numbered accessibility tree — [3] Button "Sign In" frame=(32,400,320,44). Indices are stable until the next snapshot; the agent taps by index or by text, never by raw pixels. Cheap in tokens, robust to layout shifts.
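
To make the format concrete, here is a small Python sketch that parses snapshot text of the shape shown above and looks an element up by text. It assumes the frame is (x, y, width, height), which the example suggests but the text does not state, and the Element class and helper names are invented for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Matches entries such as: [3] Button "Sign In" frame=(32,400,320,44)
ENTRY_RE = re.compile(
    r'\[(?P<index>\d+)\]\s+(?P<role>\w+)\s+"(?P<label>[^"]*)"'
    r'(?:\s+frame=\((?P<frame>[^)]*)\))?'
)


@dataclass
class Element:
    index: int
    role: str
    label: str
    frame: Optional[Tuple[float, ...]] = None  # assumed (x, y, width, height)

    def center(self):
        """Midpoint of the frame, or None when the entry carried no frame."""
        if self.frame is None:
            return None
        x, y, w, h = self.frame
        return (x + w / 2, y + h / 2)


def parse_snapshot(text):
    """Extract every indexed element found in the snapshot text."""
    elements = []
    for m in ENTRY_RE.finditer(text):
        frame = None
        if m.group("frame"):
            frame = tuple(float(p) for p in m.group("frame").split(","))
        elements.append(
            Element(int(m.group("index")), m.group("role"), m.group("label"), frame)
        )
    return elements


def find_by_text(elements, text):
    """Case-insensitive substring match, a crude stand-in for fuzzy text matching."""
    needle = text.lower()
    return next((e for e in elements if needle in e.label.lower()), None)
```

Because the agent addresses elements by index or label, a layout shift that moves the button changes only the frame, not the way the agent refers to it.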

App Atlas

atlas_build autonomously walks your whole app, fingerprinting each screen and recording transitions into a graph stored at ~/.proberun/atlas/<app>.json. Tests then navigate by name (atlas_path_to "SettingsScreen") instead of re-discovering the UI — roughly 80% fewer tokens per test, and the key to cheap cloud runs (see below).
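
Navigating by name boils down to a shortest-path search over the recorded graph. The sketch below uses breadth-first search over a list of transitions. The record shape ("from", "action", "to") is an assumption made for this example; the actual schema of the atlas JSON file is not documented here.

```python
from collections import deque


def shortest_path(transitions, start, goal):
    """Return the shortest list of actions leading from start to goal, or None.

    transitions: iterable of dicts with hypothetical keys "from", "action", "to".
    """
    graph = {}
    for t in transitions:
        graph.setdefault(t["from"], []).append((t["action"], t["to"]))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        screen, actions = queue.popleft()
        if screen == goal:
            return actions
        for action, nxt in graph.get(screen, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [action]))
    return None


# A tiny hand-written atlas for illustration.
atlas = [
    {"from": "Home", "action": "tap_text Profile", "to": "ProfileScreen"},
    {"from": "ProfileScreen", "action": "tap_text Settings", "to": "SettingsScreen"},
    {"from": "Home", "action": "tap_text Help", "to": "HelpScreen"},
]
print(shortest_path(atlas, "Home", "SettingsScreen"))
```

Replaying a known path like this costs a handful of tool calls, which is where the token savings come from compared with rediscovering the UI on every run.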

State snapshot / restore

save_state clones the simulator (via simctl clone) so you can restore_state a logged-in starting point in 2–5s instead of a 30–60s cold boot + login. Playwright contexts, for iOS.

Backend observability

start_log_capture and start_network_capture record the app's logs and full HTTPS traffic during a test, auto-classifying Firebase / Supabase / Stripe / Sentry errors and HTTP 4xx/5xx. When a flow breaks, the agent knows whether it was a 401 from your auth backend or a UI bug, not just "the button didn't work."
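
As a rough picture of what classification means here, the Python sketch below tags captured flows by vendor and error class. Everything in it is illustrative: the host suffixes, the category names, and the flow record shape ("host", "status") are assumptions for this example, not Proberun's actual rules or output format.

```python
# Illustrative host suffixes only; the real classification rules may differ.
VENDOR_SUFFIXES = {
    "firebaseio.com": "firebase",
    "supabase.co": "supabase",
    "stripe.com": "stripe",
    "sentry.io": "sentry",
}


def vendor_for(host):
    """Guess the backend vendor from the request host."""
    host = host.lower()
    for suffix, vendor in VENDOR_SUFFIXES.items():
        if host == suffix or host.endswith("." + suffix):
            return vendor
    return "other"


def classify_status(status):
    """Bucket an HTTP status code into a coarse error class."""
    if status in (401, 403):
        return "auth_error"
    if 400 <= status < 500:
        return "client_error"
    if 500 <= status < 600:
        return "server_error"
    return "ok"


def failing_flows(flows):
    """Keep only failing flows, each tagged with its vendor and error class."""
    tagged = []
    for flow in flows:
        kind = classify_status(flow["status"])
        if kind != "ok":
            tagged.append({**flow, "vendor": vendor_for(flow["host"]), "kind": kind})
    return tagged
```

With tags like these, a failed login can be reported as an auth_error from the auth backend rather than as a tap that did nothing.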

Vision fallback

When the accessibility tree is sparse (React Native, Flutter, Unity, canvas), vision_ocr (Apple Vision, free + local) or vision_describe_llm (bring-your-own Anthropic key) reads the screen so those apps still work.

Tool reference

68 tools. Pass an optional udid/serial to any; omit it to reuse the last/only device.

Lifecycle (iOS)

list_simulators | List sims with UDID, name, state, runtime
boot_simulator | Boot by name or UDID; opens Simulator.app
build_app | xcodebuild an .xcodeproj/.xcworkspace for the sim
install_app / launch_app | Install a .app, launch by bundle id
terminate_app / uninstall_app | Kill or remove an app
open_url | Trigger a deep link
reset_simulator | Erase content & settings (fresh context)
list_installed_apps | Bundle ids + display names on the sim

Perception

ui_snapshot | Indexed accessibility tree; vision_fallback option
screenshot | PNG, returned inline for vision models
vision_ocr | Apple Vision OCR: text + bboxes, free & local
vision_describe_llm | Vision-LLM screen description (BYOK Anthropic)

Action

tap / tap_index / tap_text | Tap by coords, snapshot index, or fuzzy text
long_press | Press-and-hold at coords
swipe | Coords or direction (up/down/left/right)
type_text | Type into the focused field
press_button | HOME / LOCK / SIRI / etc.

Wait & sync

wait_for_text | Block until text appears
wait_for_text_disappear | Block until text is gone
wait_for_snapshot_stable | Block until the UI settles
wait_for_index_change | Confirm a tap changed the screen
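
Conceptually, tools like these are polling loops with a timeout. The Python sketch below shows one way such a loop could work; it is not the server's implementation, and the names wait_until, wait_for_stable, and quiet_polls are hypothetical.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.25,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def wait_for_stable(take_snapshot, quiet_polls=2, **kwargs):
    """Return True once quiet_polls consecutive snapshots match their predecessor."""
    state = {"last": object(), "same": 0}  # sentinel never equals a real snapshot

    def settled():
        current = take_snapshot()
        state["same"] = state["same"] + 1 if current == state["last"] else 0
        state["last"] = current
        return state["same"] >= quiet_polls

    return wait_until(settled, **kwargs)
```

Waiting for text is the same loop with a predicate such as `lambda: "Enter a valid email" in take_snapshot()`, where take_snapshot stands for whatever returns the current UI tree as text.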

App Atlas

atlas_build | Autonomously map the whole app into a graph
atlas_path_to | Shortest action path between two screens
atlas_which_screen | Identify the current screen by fingerprint
atlas_record_screen / atlas_record_transition | Manual graph building
atlas_get_screen / atlas_list_screens | Read recorded screens

State

save_state / restore_state | Clone & restore sim state (logged-in, etc.)
list_states / delete_state | Manage saved states

Backend observability

start_log_capture / stop_log_capture | Capture + classify app logs
get_log_entries / wait_for_log_entry | Query / block on classified log events
start_network_capture / stop_network_capture | mitmproxy HTTPS capture
get_network_flows / inspect_network_flow | List + inspect full request/response
wait_for_network_request | Block until a backend call fires

Recording & trace

start_recording / stop_recording | Record an .mp4 of the run
log_thought | Narrate reasoning into the trace (for tracecast)
report_issue | Send in-tool feedback to the maintainers

Cloud cost-savers

estimate_cloud_run | Predict cloud minutes + cost before you run
export_cloud_context | Bundle local atlas so the cloud skips discovery

Android

android_list_devices / android_boot_emulator | Devices/AVDs; boot an emulator
android_install_app / android_launch_app | Install APK, launch a package
android_ui_snapshot | uiautomator dump → indexed tree
android_tap / android_tap_index / android_tap_text | Tap by coords / index / text
android_swipe / android_type_text / android_press_key | Gestures, text, keys
android_screenshot | PNG of the device

Cloud runs & saving money

Local runs are free forever. When you need parallel runs in CI, add --cloud and your tests execute on our hosted simulators (and real devices on Business). You pay only for the minutes you run, and Proberun is built to make that bill small.

Do the expensive work locally, for free

Exploration — mapping your app — is the slow, expensive part. Run it once locally (free), then upload the result so the cloud skips it:

lower your cloud bill
$ atlas_build com.you.MyApp # free, on your Mac
$ estimate_cloud_run com.you.MyApp --flows 5
✓ Atlas present (24 screens) — cloud skips exploration
estimated: 6 min → ~$0.90 (saved ~9 min / ~$1.35)
$ export_cloud_context com.you.MyApp
↘ ~/.proberun/cloud/com.you.MyApp/ — upload with your app

estimate_cloud_run shows the cost before you spend a metered minute. export_cloud_context bundles your atlas so the runner loads the map instead of re-discovering it. Most vendors bill you for that discovery on every run; we don't.

Tier | Flat | Included | Overage
Pro | $29/mo | 200 sim-min | $0.15/min
Team | $99/mo · 5 seats | 1,000 sim-min | $0.12/min
Business | $499/mo | 300 device-min | $0.40/min
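
To see what a month might cost, you can model the table as a flat fee plus overage on minutes beyond the included allowance. That billing formula is an assumption read off the column names, not a statement of the actual invoicing rules, and monthly_bill is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    flat: float      # monthly fee in USD
    included: int    # minutes included in the flat fee
    overage: float   # USD per minute beyond the allowance


# Figures copied from the pricing table above.
TIERS = {
    "Pro": Tier(29, 200, 0.15),
    "Team": Tier(99, 1000, 0.12),
    "Business": Tier(499, 300, 0.40),
}


def monthly_bill(tier, minutes_used):
    """Flat fee plus overage for minutes beyond the included allowance (assumed model)."""
    t = TIERS[tier]
    extra = max(0, minutes_used - t.included)
    return round(t.flat + extra * t.overage, 2)


print(monthly_bill("Pro", 150))   # within the allowance
print(monthly_bill("Pro", 300))   # 100 minutes over
```

For a sense of scale, the 6-minute estimate in the example above is consistent with the Pro overage rate: 6 × $0.15 = $0.90.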

CI integration

Run flows on every PR. The same flow specs you author locally run in CI; results come back as a trace + screenshot diff. (GitHub Action ships with the hosted tier.)

.github/workflows/proberun.yml
- name: Proberun
  run: proberun run --cloud --flow flows/*.json
  env:
    PROBERUN_TOKEN: ${{ secrets.PROBERUN_TOKEN }}

Locally, the included eval harness (eval/runner.ts) runs flow specs against the MCP server and writes a pass/fail leaderboard — wire it into any CI that has macOS runners.

Troubleshooting

Symptom | Fix
“idb not found” | Re-run pipx install fb-idb --python python3.12 (must be ≤3.12)
“No simulator booted” | Open Simulator.app and boot one, or call boot_simulator
ui_snapshot returns few elements | App is RN/Flutter/canvas; pass vision_fallback=true or use vision_ocr
long_press no-ops on 2nd call | Fixed in v0.1.1 (settle delay); update if older
Recording file is 0 bytes | Keep the sim booted for the whole recording; don't reset mid-record
Android tools missing | Set ANDROID_HOME or install SDK to ~/Library/Android/sdk

FAQ

Real devices?

Simulator and emulator today; real-device runs are on the Business tier (hosted). Local runs are simulator/emulator only.

React Native / Flutter / Unity?

Native views work directly; sparse trees fall back to Apple Vision OCR (free, local) or a vision LLM.

Which assistants?

Anything that speaks MCP — Claude Code, Cursor, Codex. Tested primarily on Claude Code.

Is my code uploaded?

No. The local CLI is local-only. Telemetry is opt-in and anonymous — tool names + error counts, never code or arguments.

License?

The local tool is Apache 2.0. The hosted orchestration is proprietary.
