Vibium Architecture: AI-Native Browser Automation

This article is based on the official GitHub repository and validated ecosystem insights.

Vibium is not just another browser automation framework. It’s an AI-native automation infrastructure designed for a future where human intent and autonomous agents work together—going far beyond what Selenium or Playwright were built for.

What is Vibium?

Vibium is an open-source project (Apache-2.0 licensed) whose stated purpose is:

“Browser automation infrastructure built for AI agents and humans.”

Unlike traditional frameworks, where automation means hand-writing scripts tied to selectors and step-by-step instructions, Vibium abstracts browser automation into a layered infrastructure optimized for natural language, agent integrations, and future AI-driven workflows.

What’s Inside the GitHub Repo?

The https://github.com/vibiumdev/vibium repository contains several key components, each serving a distinct purpose in the overall ecosystem.

1. clicker — The Core Binary

This is a small (~10 MB) Go-based executable that represents the heart of Vibium’s automation engine. Its responsibilities include:

Browser lifecycle management — starting, connecting to, and controlling a browser instance.
WebDriver BiDi proxy — exposing a WebSocket-based control layer that speaks with modern browsers using the WebDriver BiDi protocol.
MCP server — provides a standardized interface (stdio/WebSocket) so AI agents can interact with the browser directly.
Auto-waiting & screenshot capture — elements are automatically waited for, and screenshots can be captured in PNG format.
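
The auto-waiting behavior in the last bullet can be pictured as a polling loop. The sketch below is an illustration only, not Clicker's actual implementation (which is written in Go); `waitUntilFound`, its parameters, and the simulated lookup are hypothetical names chosen for this example.

```typescript
// Hypothetical helper: polls `probe` until it returns a non-null value
// or the timeout elapses, then rejects with an error.
async function waitUntilFound<T>(
  probe: () => T | null,
  timeoutMs = 5000,
  intervalMs = 50,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const found = probe();
    if (found !== null) return found;
    if (Date.now() >= deadline) {
      throw new Error(`element not found within ${timeoutMs} ms`);
    }
    // Sleep briefly before the next attempt.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Simulated page: the "button" only appears on the third lookup.
let lookups = 0;
const fakeLookup = (): string | null => (++lookups >= 3 ? "button.submit" : null);

waitUntilFound(fakeLookup, 1000, 10).then((el) => {
  console.log(`found ${el} after ${lookups} lookups`);
});
```

The point of the pattern is that the caller never writes an explicit sleep: a lookup either succeeds as soon as the element appears or fails with a clear timeout error.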

This binary is designed to be invisible to developers: npm and agent integrations install and run it automatically, with no manual setup.

2. clients/javascript — JavaScript/TypeScript Client

This folder contains the developer-friendly API that JavaScript and TypeScript developers use in real code. It exposes:

Async API (Promise-based) — for modern async/await workflows.
Sync API — for synchronous usage where needed.

Developers can write code like:

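import { browser } from "vibium";

// Launch a browser session (top-level await requires an ES module).
const vibe = await browser.launch();
await vibe.go("https://example.com");
// Await find() first so click() is called on the resolved element.
const submit = await vibe.find("button.submit");
await submit.click();
await vibe.quit();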

This layer wraps the clicker binary and exposes functions for navigating, finding elements, clicking, typing, taking screenshots, and quitting the browser.

3. docs and Supporting Files

README.md — high-level overview and basic usage instructions.
V1-ROADMAP.md / V2-ROADMAP.md — planned features like Python and Java clients, an AI-powered navigation layer (“Cortex”), extensions (“Retina”), and video recording.
WebDriver-Bidi-Spec.md — reference for the underlying protocol standard.

Architecture & Core Design

At its core, Vibium isn’t just another test library — it’s a multi-layer infrastructure, enabling automation from both human and AI perspectives.

[Figure: Vibium architecture showing AI agents, the MCP protocol, Clicker, and WebDriver BiDi]

1. Agent Layer (LLM / AI Tools)

This is the topmost layer. AI agents such as Claude Code, GPT agents, or local models connect via MCP and issue high-level browser commands, e.g., “navigate to page, fill form, click button.”

2. MCP (Model Context Protocol)

MCP acts as the communication bridge between the agent and the Clicker binary, typically via stdio or WebSocket, enabling consistent automation semantics no matter the client.
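
To make the bridge concrete: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The sketch below builds such a request and frames it for the stdio transport, where each message is one line of JSON. The tool name `browser_navigate` and its `url` argument are assumptions for illustration; the names a given server actually exposes are discovered via `tools/list`. `buildToolCall` and `frameForStdio` are hypothetical helpers.

```typescript
// JSON-RPC 2.0 request shape used by MCP to invoke a tool.
interface McpToolCall {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextMcpId = 1;

// Builds a tools/call request with a fresh id.
// Tool names are server-defined; the one used below is assumed.
function buildToolCall(tool: string, args: Record<string, unknown>): McpToolCall {
  return {
    jsonrpc: "2.0",
    id: nextMcpId++,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
}

// Over stdio, each message is a single line of JSON ending in a newline.
function frameForStdio(message: McpToolCall): string {
  return JSON.stringify(message) + "\n";
}

const navigateCall = buildToolCall("browser_navigate", { url: "https://example.com" });
console.log(frameForStdio(navigateCall));
```

Because the agent only ever produces messages of this shape, the same server can sit behind any MCP-capable client.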

3. Clicker Binary

As discussed, this is the execution engine. It:

Spawns Chrome with WebDriver BiDi enabled.
Acts as a proxy between higher-level calls and the browser.
Manages element detection and auto-waits for dynamic content.
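
The proxy role can be sketched as a translation step from a high-level call to a WebDriver BiDi command. `browsingContext.navigate` and `browsingContext.locateNodes` are commands defined in the WebDriver BiDi specification; whether Clicker issues exactly these commands is an assumption, and `HighLevelCall`, `toBidiCommand`, and the context id are hypothetical names for this example.

```typescript
// Wire shape of a WebDriver BiDi command.
interface BidiCommandMsg {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Hypothetical high-level calls a client library might issue.
type HighLevelCall =
  | { kind: "go"; url: string }
  | { kind: "find"; selector: string };

let bidiCommandId = 0;

// Translates one high-level call into one BiDi command for a given tab.
function toBidiCommand(call: HighLevelCall, contextId: string): BidiCommandMsg {
  switch (call.kind) {
    case "go":
      return {
        id: ++bidiCommandId,
        method: "browsingContext.navigate",
        // wait: "complete" asks the browser to reply after the load finishes.
        params: { context: contextId, url: call.url, wait: "complete" },
      };
    case "find":
      return {
        id: ++bidiCommandId,
        method: "browsingContext.locateNodes",
        params: { context: contextId, locator: { type: "css", value: call.selector } },
      };
  }
}

const navigateCmd = toBidiCommand({ kind: "go", url: "https://example.com" }, "ctx-1");
console.log(JSON.stringify(navigateCmd));
```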

4. Browser via WebDriver BiDi

Vibium uses the modern WebDriver BiDi protocol, a bidirectional, WebSocket-based communication method designed to address many of the performance and flakiness issues of classic HTTP-based WebDriver automation.

This means:

Faster interactions than HTTP REST-based automation.
Real-time events flowing in both directions.
Better support for modern browser features.
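
Bidirectional here means a single WebSocket carries both replies to commands and events the browser sends on its own. In WebDriver BiDi, incoming messages carry a `type` of `success`, `error`, or `event`, and replies are matched to commands by `id`. The router below is a minimal sketch of that dispatch logic; `BidiMessageRouter` and its method names are hypothetical, and no real socket is opened.

```typescript
// Incoming message shapes defined by WebDriver BiDi.
type BidiIncoming =
  | { type: "success"; id: number; result: unknown }
  | { type: "error"; id: number; error: string; message: string }
  | { type: "event"; method: string; params: unknown };

class BidiMessageRouter {
  private pending = new Map<number, { resolve: (r: unknown) => void; reject: (e: Error) => void }>();
  private handlers = new Map<string, Array<(params: unknown) => void>>();

  // Register interest in the reply to a command that was just sent.
  awaitReply(id: number): Promise<unknown> {
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
  }

  // Subscribe to a browser-initiated event such as "browsingContext.load".
  onEvent(method: string, handler: (params: unknown) => void): void {
    const list = this.handlers.get(method) ?? [];
    list.push(handler);
    this.handlers.set(method, list);
  }

  // Feed one raw WebSocket text frame into the router.
  dispatch(raw: string): void {
    const msg = JSON.parse(raw) as BidiIncoming;
    if (msg.type === "event") {
      (this.handlers.get(msg.method) ?? []).forEach((h) => h(msg.params));
      return;
    }
    const waiter = this.pending.get(msg.id);
    if (!waiter) return; // reply to an unknown or already-settled command
    this.pending.delete(msg.id);
    if (msg.type === "success") waiter.resolve(msg.result);
    else waiter.reject(new Error(`${msg.error}: ${msg.message}`));
  }
}

const router = new BidiMessageRouter();
router.onEvent("browsingContext.load", (p) => console.log("page loaded:", p));
router.awaitReply(1).then((r) => console.log("command 1 result:", r));
router.dispatch(JSON.stringify({ type: "event", method: "browsingContext.load", params: { url: "https://example.com/" } }));
router.dispatch(JSON.stringify({ type: "success", id: 1, result: { url: "https://example.com/" } }));
```

With request/response over HTTP, the client would have to poll to learn about page loads or console output; here the browser pushes those events as they happen.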

Key Features & UX

Vibium is still early in its evolution, but the GitHub repo and project documentation point to these core capabilities:

1. Simple Developer API

Whether a human developer or an AI agent is driving the browser, Vibium offers a simple API with no complex setup, cross-platform support (Linux, macOS, Windows), and installation via npm.

2. Agent-First with MCP Support

Unlike conventional automation frameworks, Vibium is built from the ground up to support AI agent orchestration, which makes it a good fit for agent-built workflows and for “voice-driven” or natural-language-driven automation.

3. Unified Architecture for Humans & Machines

The same binary and protocol stack serves both traditional devs and autonomous agents, reducing fragmentation in test automation infrastructure.

Current Status (as of Dec 2025)

Here’s where Vibium stands in real terms:

✅ Repository and code are live on GitHub, with an active commit history and roadmaps.
✅ Developer APIs (JS/TS) and the core Clicker binary exist.
❗ No official release packages on GitHub yet: as of the latest state, no GitHub releases have been published.
❗ The Python client and broader ecosystem (e.g., UI test authoring tools) are still planned.

Industry Context & Vision

Vibium isn’t just code — it’s a vision for where automation is headed:

AI-Native Automation

The project tries to shift perspective:

From “how do I click that button?”
to
“did the user accomplish the task?”.

This aligns with broader industry demand for intent-based testing and drastically easier automation authoring.

“Vibe Coding” Concept

The idea of describing automation by plain language intent (e.g., “Submit the login form”) rather than detailed selectors aims to reduce brittle tests and promote collaboration between technical and non-technical team members.

Future Possibilities

Long-term discussions in the ecosystem include:

Decentralized testing networks — executing tests across a global grid of real devices.
Self-healing capabilities that adapt to UI changes.

These goals are aspirational and future-looking rather than fully shipped right now.

How to Get Started Today with Vibium?

If you want to explore or experiment with Vibium as an early adopter:

For JavaScript/TypeScript Developers

npm install vibium

Use the APIs demonstrated in the repo:

import { browser } from "vibium";
const vibe = await browser.launch();
await vibe.go("https://example.com");
await vibe.quit();

Vibium simplifies browser management — it automatically downloads Chrome for Testing and relevant drivers under the hood.
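
Since every launched browser should eventually be quit, one common pattern is to wrap the session in try/finally so cleanup runs even when a step throws. The sketch below is not part of Vibium; `withSession` and `SessionLike` are hypothetical names, and the interface covers only the `go` and `quit` calls shown above. A fake session stands in for the real client so the example runs on its own.

```typescript
// Minimal structural type covering only the calls used here; the object
// returned by browser.launch() is assumed to satisfy it.
interface SessionLike {
  go(url: string): Promise<unknown>;
  quit(): Promise<unknown>;
}

// Launches a session, runs `body`, and always quits afterwards.
async function withSession<S extends SessionLike, R>(
  launch: () => Promise<S>,
  body: (session: S) => Promise<R>,
): Promise<R> {
  const session = await launch();
  try {
    return await body(session);
  } finally {
    await session.quit(); // runs even if body throws
  }
}

// Fake session that records calls instead of driving a browser.
const callLog: string[] = [];
const fakeLaunch = async (): Promise<SessionLike> => ({
  go: async (url) => { callLog.push(`go ${url}`); },
  quit: async () => { callLog.push("quit"); },
});

withSession(fakeLaunch, (s) => s.go("https://example.com")).then(() => {
  console.log(callLog.join(", ")); // prints "go https://example.com, quit"
});
```

With the real client, the assumed usage would be to pass `() => browser.launch()` as the first argument in place of `fakeLaunch`.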

For AI Agents

Agents like Claude Code can be equipped with:

claude mcp add vibium -- npx -y vibium

This instructs the agent to use Vibium for browser automation without manual install steps.

Final Thoughts

Vibium’s story is compelling — a modern automation framework built for the AI era, guided by a philosophy that prioritizes agent integration, developer ergonomics, and bidirectional communication. It brings together years of automation expertise with a fresh architecture centered around WebDriver BiDi.

However, it’s still early — the project’s roadmap shows exciting plans, but many features remain in future milestones. As with any evolving open-source ecosystem, adoption will depend on how quickly those plans materialize and how the developer and testing communities embrace Vibium’s vision.

Still, for engineers and testers curious about where automation meets AI, Vibium is one of the most intriguing projects to watch in 2025–2026.
