Founding Crew open · open-sourcing at public launch · iOS + Android

Playwright for mobile apps.
Claude tests them while you build.

A local MCP server that gives Claude Code, Cursor, and Codex real hands on your iOS Simulator and Android emulator — drive the UI, map the whole app, snapshot state, and watch backend errors live. 66 tools. Free and open source.

Join the Founding Crew · $49 →
Request early access

First 25 only · early access now + 50% off hosted for life · refund anytime

proberun start_recording → real Claude session testing a shipping iOS app

Recorded by proberun's own start_recording tool · dual-screen edit by proberun-tracecast · both ship in the repo

How it works

One prompt. Claude drives the rest.

You describe the test in plain language. Claude reads an indexed accessibility tree — not pixels — and calls the right tools in sequence. It waits between screens, logs its reasoning, and stops the moment something regresses.

What you type

Claude Code

you ▸ Use proberun. Boot iPhone 17, build my app, walk the signup flow, and make sure a bad email actually surfaces an error.

Why it's reliable

→ Indexed a11y tree, so taps target elements, not coordinates
→ Auto re-snapshots and fuzzy-matches when the UI shifts
→ Vision OCR fallback for canvas, React Native, Flutter, games
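
To make the fuzzy-matching idea concrete, here is a minimal sketch using Python's standard difflib. It is an illustration only, not Proberun's actual matcher: the Element type, the best_match helper, and the 0.6 similarity cutoff are all assumptions made for this example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Element:
    index: int   # position in the indexed accessibility snapshot
    role: str    # e.g. "Button", "TextField"
    label: str   # accessibility label


def best_match(elements: list[Element], wanted: str, cutoff: float = 0.6) -> Optional[Element]:
    """Return the element whose label is most similar to `wanted`, or None."""
    best, best_score = None, cutoff
    for el in elements:
        # Case-insensitive similarity ratio between 0.0 and 1.0.
        score = SequenceMatcher(None, wanted.lower(), el.label.lower()).ratio()
        if score >= best_score:
            best, best_score = el, score
    return best


# Hypothetical case: the button label changed from "Sign Up" to "Sign up now".
snapshot = [Element(3, "Button", "Sign up now"), Element(7, "TextField", "Email")]
match = best_match(snapshot, "Sign Up")
missing = best_match(snapshot, "Checkout")
```

Here the renamed button is still found by its index in the fresh snapshot, while a label with no close counterpart returns None instead of tapping the wrong element.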

What runs on your Mac

proberun

$ boot_simulator "iPhone 17"

✓ Booted iPhone 17

$ build_app ./MyApp.xcodeproj MyApp

✓ Built MyApp.app (42s)

$ launch_app com.you.MyApp

$ ui_snapshot

[3] Button "Sign Up" frame=(32,400,320,44)

[7] TextField "Email" value=""

$ tap_index 3

✓ Tapped Button "Sign Up"

$ wait_for_text "Enter a valid email"

✓ Found in 0.9s — validation works
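
The snapshot lines in the session above are regular enough to parse. The sketch below turns one into a structured record; the line format is inferred only from the two sample lines shown here, so the real ui_snapshot output may differ, and parse_snapshot_line is a hypothetical helper.

```python
import re

# Pattern inferred from the sample lines: [index] Role "Label" optional extras
LINE = re.compile(r'^\[(?P<index>\d+)\]\s+(?P<role>\w+)\s+"(?P<label>[^"]*)"(?P<rest>.*)$')
FRAME = re.compile(r'frame=\((\d+),(\d+),(\d+),(\d+)\)')


def parse_snapshot_line(line: str) -> dict:
    """Parse one snapshot line into index, role, label, and optional frame."""
    m = LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized snapshot line: {line!r}")
    element = {
        "index": int(m["index"]),
        "role": m["role"],
        "label": m["label"],
        "frame": None,  # stays None when the line carries no frame
    }
    frame = FRAME.search(m["rest"])
    if frame:
        element["frame"] = tuple(int(v) for v in frame.groups())
    return element


button = parse_snapshot_line('[3] Button "Sign Up" frame=(32,400,320,44)')
field = parse_snapshot_line('[7] TextField "Email" value=""')
```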

Why it's different

Snapshots beat cold boots. Observability beats guessing.

Most mobile test tools re-discover the UI every run and treat the backend as a black box. Proberun is built for speed, debuggability, and AI agents.

App Atlas

no competitor has this

Proberun autonomously walks your whole app on every build, mapping each screen and transition into a graph. Tests navigate by name instead of re-discovering the UI, cutting token use by ~80%.
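
Navigating by name over a screen graph can be pictured as a shortest-path search. The sketch below is a plain breadth-first search over a made-up atlas; the screen names, the dictionary layout, and the path_to function are assumptions for illustration and say nothing about how atlas_path_to is implemented.

```python
from collections import deque

# Hypothetical atlas: screen name -> {action label: destination screen}
ATLAS = {
    "Welcome": {"tap Sign Up": "SignUp", "tap Log In": "Login"},
    "SignUp": {"tap Back": "Welcome", "submit form": "Home"},
    "Login": {"tap Back": "Welcome", "submit form": "Home"},
    "Home": {"tap Settings": "Settings"},
    "Settings": {},
}


def path_to(atlas, start, goal):
    """Breadth-first search: shortest list of (action, screen) steps, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        screen, steps = queue.popleft()
        if screen == goal:
            return steps
        for action, nxt in atlas.get(screen, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + [(action, nxt)]))
    return None  # goal is not reachable from start


route = path_to(ATLAS, "Welcome", "Settings")
unreachable = path_to(ATLAS, "Settings", "Welcome")
```

With the graph already built, a test only has to ask for "Settings" and replay the returned steps rather than exploring the UI again.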

State snapshots

10–20× faster CI

save_state / restore_state via simctl clone. Restore a logged-in state in 2–5s instead of a 30–60s cold boot + re-login. Playwright contexts, for iOS.
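
As a quick worked example of the numbers above: dividing the quoted cold-boot range by the quoted restore range gives a per-test setup speedup between 6× and 30×. The 40-test suite and the midpoint timings below are hypothetical, and the figures cover setup time only, not the time the tests themselves take.

```python
COLD_BOOT_S = (30, 60)   # cold boot + re-login, seconds (range quoted above)
RESTORE_S = (2, 5)       # restore_state, seconds (range quoted above)


def setup_speedup_range(cold, restore):
    """Lowest and highest ratio of cold-boot setup time to restore time."""
    lowest = cold[0] / restore[1]    # fastest cold boot vs slowest restore
    highest = cold[1] / restore[0]   # slowest cold boot vs fastest restore
    return lowest, highest


def setup_seconds_saved(tests, cold_s, restore_s):
    """Total setup time saved when every test restores instead of cold booting."""
    return tests * (cold_s - restore_s)


lowest, highest = setup_speedup_range(COLD_BOOT_S, RESTORE_S)
# Hypothetical suite: 40 tests, midpoint timings of 45 s and 3.5 s.
saved = setup_seconds_saved(tests=40, cold_s=45, restore_s=3.5)
```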

Backend observability

others are black boxes

Capture the app's logs and full HTTPS traffic during a test. Auto-classify Firebase, Supabase, Stripe, and Sentry errors. See the whole causality: tap → request → 401 → broken screen.

Vision fallback

local + BYOK

When the accessibility tree is sparse, Apple Vision OCR (free, local) or a vision LLM reads the screen — so RN, Flutter, Unity, and canvas UIs still work.

Trace replay

built in

Every run records reasoning, tool calls, screenshots, and a video. proberun-tracecast renders a dual-screen replay you can scrub — the demo on this page was generated by it.

MCP-native

AI-first

No HTTP adapter, no CLI wrapper. Claude Code, Cursor, and Codex call the tools directly with full context. The AI is the test runner.
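
For a sense of what "call the tools directly" means on the wire: MCP clients invoke tools with a JSON-RPC 2.0 request whose method is tools/call. The sketch below builds such a request for tap_index. The tool_call helper and the index argument name are assumptions; the actual argument schema is whatever the server advertises.

```python
import json
from itertools import count

_ids = count(1)  # simple incrementing JSON-RPC request ids


def tool_call(name: str, **arguments) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "index" is a hypothetical argument name for this illustration.
request = tool_call("tap_index", index=3)
wire = json.dumps(request)
```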

The unfair advantage

See the backend, not just the screen.

Proberun captures the app's logs and network traffic during a test and classifies the errors. When a flow breaks, Claude knows whether it was a Firebase 401, a Stripe decline, or a UI bug — without you digging through Xcode.

firebase · supabase · stripe · sentry · http_4xx · http_5xx · auth_error · timeout

catching an auth failure mid-test

$ start_log_capture process_name="Quill"

$ tap_text "Sign In"

$ wait_for_log_entry tags=[firebase_error,auth_error]

✓ 12:55:41 Quill [firebase,firebase_error,auth_error]

FIRAuth: signIn failed — 401 INVALID_LOGIN_CREDENTIALS

$ get_network_flows backend="firebase" status_min=400

[0] 12:55:41 POST identitytoolkit … → 401 230ms [firebase]

$ inspect_network_flow 0

{ "error": { "message": "INVALID_LOGIN_CREDENTIALS" } }
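
A rough sketch of how a captured flow could be mapped to tags like the ones listed above. This is not Proberun's classifier: the host substrings, the rule that 401 and 403 count as auth_error, and treating a missing status as a timeout are all assumptions chosen for the example.

```python
from typing import Optional

# Hypothetical host substrings per backend.
BACKEND_HINTS = {
    "firebase": ("firebase", "identitytoolkit"),
    "supabase": ("supabase",),
    "stripe": ("stripe",),
    "sentry": ("sentry",),
}


def classify_flow(host: str, status: Optional[int]) -> list[str]:
    """Return backend and error tags for one network flow."""
    tags = [
        backend
        for backend, hints in BACKEND_HINTS.items()
        if any(hint in host.lower() for hint in hints)
    ]
    if status is None:
        tags.append("timeout")          # assumed: no response was recorded
    elif 400 <= status < 500:
        tags.append("http_4xx")
        if status in (401, 403):
            tags.append("auth_error")   # assumed mapping
    elif 500 <= status < 600:
        tags.append("http_5xx")
    return tags


tags = classify_flow("identitytoolkit.googleapis.com", 401)
```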

The surface

66 tools. One MCP server.

Identical tool shapes across iOS and Android. Your AI editor calls them directly.

Lifecycle

· boot_simulator
· build_app
· install_app
· launch_app
· open_url
· reset_simulator
· +5 more

Perception

· ui_snapshot
· screenshot
· vision_ocr
· vision_describe_llm

Action

· tap
· tap_index
· tap_text
· swipe
· type_text
· long_press
· press_button

Wait & sync

· wait_for_text
· wait_for_snapshot_stable
· wait_for_index_change
· wait_for_text_disappear

App Atlas

· atlas_build
· atlas_path_to
· atlas_which_screen
· atlas_record_screen
· atlas_get_screen
· +1 more

State

· save_state
· restore_state
· list_states
· delete_state

Backend observability

· start_log_capture
· get_log_entries
· wait_for_log_entry
· start_network_capture
· get_network_flows
· inspect_network_flow
· +4 more

Recording & trace

· start_recording
· stop_recording
· log_thought
· report_issue
· recording_status

Android

· android_boot_emulator
· android_ui_snapshot
· android_tap_text
· android_install_app
· +10 more

Works with your stack

MCP-native. Plugs into the tools you already use.

Claude Code · Cursor · Codex · GitHub Actions · Xcode · Android Studio · idb · mitmproxy

One server. iOS Simulator + Android emulator, same tool names. Read the docs →

Run on our cloud

Don't have a Mac farm? Run on ours.

Local is free forever. When you need parallel runs in CI, point Proberun at our cloud — your tests execute on hosted simulators and real devices, results stream back with full traces. You only pay for the minutes you run.

→ Per-minute pricing — no idle cost, no seats you don't use
→ Autoscales: a runner spins up per run, tears down when done
→ Real iOS & Android devices on the Business tier
→ Every run returns a scrubbable trace + video

proberun run --cloud

$ proberun run --cloud --flow signup.json

↗ uploading MyApp.app …

✓ queued · runner spinning up (macOS)

▸ boot_simulator → install → launch → 14 steps

✓ passed in 1m48s · 2 min billed

↘ trace: proberun.com/runs/8f2a…/replay
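
The sample run above lasted 1m48s and was billed as 2 minutes, which is consistent with rounding up to the next whole minute. The sketch below works through that arithmetic; the rounding rule is inferred from this single example and is an assumption, not a statement of the actual billing policy.

```python
import math


def billed_minutes(duration_s: int) -> int:
    """Round a run's duration up to whole minutes (assumed rounding rule)."""
    return math.ceil(duration_s / 60)


run_seconds = 1 * 60 + 48          # the 1m48s run shown above
minutes = billed_minutes(run_seconds)
```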

Honest comparison

Where we lead, and where we don't — yet.

We won't pretend to beat mature tools on real-device fleets today. We're open about what ships now and what's coming.

Capability	Proberun	mobai.run	Maestro	Appium
Free local OSS	✓	—	✓	✓
MCP-native (Claude/Cursor)	✓	✓	—	—
App Atlas auto-map	✓	—	—	—
State snapshot / restore	✓	—	—	—
Backend log + network capture	✓	—	—	—
Vision OCR fallback	✓	✓	—	—
Trace replay video	✓	—	—	—
iOS support	✓	✓	✓	✓
Android support	✓	✓	✓	✓
Real devices	soon	✓	✓	✓
Entry price	$0	$5–10/mo	$0 / $250 cloud	$0 + infra

Spot something unfair? Open an issue — we'll fix the table.

Roadmap

Built in the open, shipping every week.

What's done, what's next, and what we're exploring. The changelog lives in the repo; the direction lives here.

Shipped

· 66+ tools, iOS + Android
· App Atlas auto-mapping
· State snapshot / restore
· Backend log + network capture
· Vision OCR fallback
· Trace replay video
· Local cloud-cost estimator

Next

· Hosted cloud runs (Pro)
· Trace viewer in the browser
· GitHub Action + PR diffs
· Flow dry-run / lint
· Real-device farm (Business)

Exploring

· Self-healing flow repair
· Test generation from user stories
· Cross-platform flow reuse
· Tauri / Electron via Playwright

Pricing

Local is free forever. Hosted is when you scale.

The local tier is never gated. Every moat — Atlas, snapshots, observability, vision — runs free on your machine. Paid is for parallel cloud runs, history, and the team layer.

Founding Crew

first 25

$49 once. Early access to the local tool now, 50% off hosted for life, a founder badge, and a direct line to shape the roadmap. Refund anytime before hosted ships.

Back it · $49 →

Local

$0 forever

✓ All 66 tools
✓ iOS + Android
✓ App Atlas + snapshots
✓ Backend observability
✓ Vision OCR + tracecast
✓ Apache 2.0

Install free local →

Reserve

Pro Beta

$29/mo

✓ Hosted parallel runs
✓ Trace viewer + history
✓ Screenshot diffs in PRs
✓ Slack alerts
✓ Delivery ≤ 30 days
✓ Refund anytime

Reserve Pro Beta →

Team

$99/mo · 5 seats

✓ Everything in Pro
✓ GitHub PR comments
✓ Shared traces
✓ Org RBAC
✓ Priority issues
✓ 5 seats included

Reserve Team →

Business ($499/mo · real-device farm) and Enterprise — hello@proberun.com

5 minutes to your first AI test

terminal

# macOS · Xcode CLT · Node 20+

$ brew install facebook/fb/idb-companion

$ pipx install fb-idb --python python3.12

$ npm i -g proberun-cli

$ claude mcp add proberun -- proberun

✓ Added MCP server proberun (66 tools)

FAQ

Direct answers.

Is this another stealth ad for a paid side project?

No. Built in the open by @DaltonTheDeveloper. Apache 2.0 from day one; the roadmap, decisions, and rough edges are all public. The local tier is genuinely free forever.

Does it work with real devices today?

Sim + emulator today. Real-device support (signed WebDriverAgent) is on the roadmap for the Business tier. We won't pretend otherwise.

React Native / Flutter / Unity / canvas UIs?

Native SwiftUI/UIKit and Android views work great. Sparse trees fall back to Apple Vision OCR (free, local) or a vision LLM — so canvas-rendered apps still work.

What assistants does it support?

Anything that speaks MCP: Claude Code, Cursor, Codex. Tested primarily on Claude Code.

Is my code sent anywhere?

No. The local CLI runs entirely on your machine. Telemetry is opt-in and anonymous — tool names and error counts only, never code or arguments.

How is this different from mobai.run?

Same tool surface, but open source and free local, plus App Atlas auto-mapping, state snapshots, and full backend observability — none of which they have.

Give Claude its first mobile shift.

Free forever locally. Five minutes to install. Reserve the hosted tier if you want CI, parallel runs, and replays.

Install free local →
Join the Discord
