I Built a WebDriver for WKWebView Tauri Apps on macOS

TL;DR: Building a Tauri app on macOS? Tauri-WebDriver is an open-source WebDriver + MCP integration for E2E testing, so tools (and agents) can click, type, and take screenshots like they would in a browser. It was built with Claude Code. Follow the quick start to try it out.

I've been exploring Tauri lately, the framework for building desktop (and mobile) apps using web technologies with a Rust backend. If you haven't come across it, the idea is simple: you write your UI in React/Vue/Svelte/whatever, let Rust handle the backend, and ship a native binary that's a fraction of the size of an Electron app.

Why Tauri?

Honestly, I was just tired of Xcode build times and wanted the speed of web development. If you've ever maintained separate native codebases for desktop and mobile, you know the pain. Tauri lets you use the web stack you already know while still getting native-feeling performance and access to system APIs. Rust gives you a memory-safe native backend and the kind of mature tooling you want when dealing with lots of data or running AI workloads locally. And you're not shipping a bundled Chromium: Tauri uses the platform's native webview (WKWebView on macOS, WebKitGTK on Linux, WebView2 on Windows), so binaries stay small and resource usage stays low.

It seems to be a popular choice right now. Apps like Codex Monitor (a desktop command center for orchestrating AI coding agents) and Conductor (parallel Claude Code and Codex agents on your Mac) are shipping with Tauri. The awesome-tauri list has many more across a variety of categories.

For my first project, I'm using Tauri v2 with Vite as the frontend build tool. Vite is a fast dev server and bundler – it uses native ES modules during development so you get near-instant hot module replacement instead of waiting for a full rebuild. It pairs well with Tauri since both prioritize speed.

The Testing Problem

I’ve gotten spoiled by automated testing. I’ve grown used to asking agents to spin up MCP servers so they can drive browser tooling (Chrome DevTools MCP–style) and native build/test stacks (XcodeBuildMCP-style) and validate flows for me. In web dev, you point Selenium or WebDriverIO at your app, click a few buttons, assert some text, take screenshots, and you’re done. I wanted that exact workflow for my Tauri app.

But on macOS there isn’t an Apple-provided WebDriver for embedded WKWebView apps—Apple’s WebDriver story is safaridriver for automating Safari itself, which doesn’t help when your UI is a WKWebView inside a desktop app. Linux has WebKitWebDriver, Windows has Edge WebDriver, and the official Tauri WebDriver stack supports those—but macOS requires a third-party driver or a custom bridge.

I wasn’t the only one frustrated by this. Around the same time I started building a solution, a couple of other projects popped up tackling the same problem (more on those below). But I’d already gone deep enough that it made sense to finish.

What I Built

Tauri-WebDriver is an open-source W3C WebDriver v1 implementation for Tauri apps on macOS, built in collaboration with Claude Code. It’s two Rust crates:

  1. A Tauri plugin that runs inside your app in debug builds. It starts a little HTTP server, injects a JavaScript bridge into the webview, and exposes endpoints for finding elements, clicking things, reading text, managing windows, executing scripts, etc.
  2. A CLI binary (tauri-wd) that speaks the standard W3C WebDriver protocol on port 4444. WebDriverIO, Selenium, or any W3C-compatible test runner connects to it like they would any browser driver. The CLI launches your app, discovers the plugin's port, and translates every WebDriver command into a plugin API call.

WebDriverIO/Selenium ──HTTP:4444──> tauri-wd CLI ──HTTP──> plugin inside your app
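
To make the left-hand arrow in that diagram concrete, here is a minimal sketch of the kind of requests a test runner sends to tauri-wd. The endpoint paths, the "css selector" strategy, and the element reference key are taken from the W3C WebDriver spec, and port 4444 is the one mentioned above. The host (127.0.0.1), the empty capabilities object, and every function name are assumptions for illustration. The sketch only builds request descriptors; it does not send anything and says nothing about how tauri-wd or the plugin handle them internally.

```typescript
// Sketch: W3C WebDriver requests a runner would send to tauri-wd.
// Paths and payload shapes follow the W3C WebDriver spec.
// Host, capabilities, and helper names are assumptions for illustration.

type WireRequest = {
  method: "GET" | "POST" | "DELETE";
  url: string;
  body?: unknown;
};

// Port 4444 comes from the article; the loopback host is assumed.
const BASE_URL = "http://127.0.0.1:4444";

// Key the spec uses to mark an element reference in JSON responses.
const ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf";

// New Session. Real runners usually pass driver-specific capabilities;
// none are assumed here.
function newSession(): WireRequest {
  return {
    method: "POST",
    url: `${BASE_URL}/session`,
    body: { capabilities: { alwaysMatch: {} } },
  };
}

// Find Element using the spec's "css selector" location strategy.
function findElement(sessionId: string, css: string): WireRequest {
  return {
    method: "POST",
    url: `${BASE_URL}/session/${sessionId}/element`,
    body: { using: "css selector", value: css },
  };
}

// Element Click takes an empty JSON object as its body.
function clickElement(sessionId: string, elementId: string): WireRequest {
  return {
    method: "POST",
    url: `${BASE_URL}/session/${sessionId}/element/${elementId}/click`,
    body: {},
  };
}

// Get Element Text.
function getElementText(sessionId: string, elementId: string): WireRequest {
  return {
    method: "GET",
    url: `${BASE_URL}/session/${sessionId}/element/${elementId}/text`,
  };
}

// Take Screenshot; the response value is a base64-encoded PNG.
function takeScreenshot(sessionId: string): WireRequest {
  return { method: "GET", url: `${BASE_URL}/session/${sessionId}/screenshot` };
}

// Pull the element id out of a Find Element response body.
function elementIdFrom(response: { value: Record<string, string> }): string {
  const id = response.value[ELEMENT_KEY];
  if (id === undefined) {
    throw new Error("response does not contain an element reference");
  }
  return id;
}
```

In practice WebDriverIO or Selenium builds these requests for you. Seeing them spelled out mostly helps when debugging with curl or reading driver logs.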

It’s not “100% of WebDriver,” but it covers a surprisingly large subset: element finding, input, screenshots, actions, windows, cookies, iframes/shadow DOM, alerts, print-to-PDF, and computed ARIA roles.

AI-Powered Testing with MCP

One thing I'm particularly excited about is the MCP (Model Context Protocol) integration. I connected tauri-webdriver to mcp-tauri-automation, an MCP server that lets AI agents like Claude Code directly launch, inspect, and interact with your Tauri app. You can say "click the submit button, fill in the form, and take a screenshot" and it just works.

I forked the original project to add some features I needed (execute_script, get_page_title, get_page_url, multi-strategy element selectors, configurable screenshot timeouts, and wait_for_navigation) and submitted those upstream.
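
For a rough idea of what an agent sends when it uses one of these tools, here is a minimal sketch of an MCP tool call. The JSON-RPC 2.0 envelope, the "tools/call" method, and the { name, arguments } shape come from the MCP spec, and the tool names are the ones listed above. The argument key ("script") and the helper names are hypothetical: check the tool schemas that mcp-tauri-automation actually publishes before relying on them.

```typescript
// Sketch: the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
// "tools/call" and the params shape follow the MCP spec.
// Tool names come from the article; argument keys are hypothetical.

type ToolCallRequest = {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
};

// JSON-RPC ids only need to be unique per connection; a counter is enough.
let nextId = 1;

function toolCall(
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical argument key "script"; the real schema may differ.
const scriptCall = toolCall("execute_script", {
  script: "return document.title",
});

// Assumed to take no arguments.
const titleCall = toolCall("get_page_title");

// What would go over the wire for the first call.
const wireMessage = JSON.stringify(scriptCall);
```

An agent like Claude Code produces these messages itself, so you normally never write them by hand. The sketch is only meant to show that each tool invocation is a small, inspectable JSON message.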

Alternatives

I'd be remiss not to mention the other solutions that exist:

  • CrabNebula Webdriver for Tauri – A commercial hosted testing service with macOS WebDriver support.
  • tauri-plugin-webdriver – An open-source Tauri plugin that embeds a WebDriver server directly in the plugin (single-crate vs. my two-crate approach). It supports (or plans to support) macOS, Linux, and Windows, so if you need cross-platform WebDriver support, this is probably the more pragmatic choice.

After starting this, I learned that others were building similar things, so this was admittedly not the greatest use of my time. Still, I learned more about Rust, Tauri’s plugin system, the W3C WebDriver spec, WKWebView quirks, and what’s happening under the hood when you capture screenshots in E2E tests.

Try It

If you're building a Tauri app on macOS and want automated E2E tests, follow the quick start in the GitHub repo. The README also has additional information and a complete endpoint reference.

Disclosure: The code for this project was written in collaboration with Claude Code.