Stories by Boni García on Medium

Implementing an MCP Server in Java

Boni García — Sat, 18 Apr 2026 10:01:23 GMT

This story summarizes a talk by Boni García at Java Day Istanbul, Turkey, on April 18, 2026. You can find the original slides of this talk here.

What is MCP?

Model Context Protocol (MCP) is an open standard for connecting AI agents to external services. It was created by Anthropic in 2024 and is now part of the Linux Foundation’s Agentic AI Foundation. In practice, MCP defines a common contract between an MCP client and one or more MCP servers, enabling models to interact with tools and external data sources consistently rather than relying on one-off integrations.

A useful way to think about MCP is as a boundary layer between the agent and the outside world. The agent handles prompts, memory, reasoning, and responses. The MCP client sits on the agent side. MCP servers sit on the system side and expose capabilities such as tools, resources, and prompts over a transport. That separation is valuable because it lets us evolve the implementation behind a server without changing the contract the agent sees.

MCP servers

An MCP server is simply an adapter that exposes selected capabilities from an external system. That system might be a filesystem, a source control platform, a container platform, a documentation engine, or an internal enterprise API. The server translates between MCP concepts and the underlying implementation details.

There is already a healthy ecosystem of MCP servers. Some examples are the Filesystem MCP Server, the GitHub MCP Server, the Docker MCP Server, or Context7. Those examples matter because they show the breadth of the model: MCP is not limited to one category of automation. It can encompass local tools, SaaS APIs, developer platforms, and knowledge systems under a single protocol umbrella.

Implementing an MCP server in Java

For Java developers, that makes MCP especially interesting. Java already has mature ecosystems for HTTP services, security, observability, validation, browser automation, enterprise integration, and cloud deployment. So the real question is not whether Java can implement MCP servers. The interesting question is which Java style gives us the best trade-offs for a given team and deployment model.

There are four main alternatives for building an MCP server in Java: the official MCP Java SDK, Spring AI, Quarkus, and Micronaut.

They all target the same protocol, but they optimize for very different priorities.

The official SDK provides the most direct access to the protocol model and transport layer. Spring AI provides the most familiar enterprise developer experience for teams that already live in Spring Boot. Quarkus offers the strongest documented story for performance, native deployment, observability, and build-time validation. Micronaut sits in an interesting middle ground, building on top of the official SDK while adding Micronaut-style dependency injection and compile-time schema generation.

One important disclaimer before we start: this is an architecture and documentation comparison, not a benchmark.

1. The official MCP Java SDK

The official MCP Java SDK is the reference option. It exposes the protocol model directly, including tools, resources, prompts, completions, sampling, elicitation, progress, and logging. It supports synchronous and asynchronous programming models, and the core module ships with built-in STDIO, SSE, and Streamable HTTP transports without requiring an external web framework.

From an API design perspective, this option is explicit. You build McpSyncServer or McpAsyncServer, declare capabilities, define tools with Tool.builder(), provide JSON Schema, and wire handlers manually. Resources, prompts, subscriptions, and notifications follow the same pattern.

Strengths

The SDK is ideal when you want protocol-level control. It is also the cleanest choice if you want to embed MCP into an existing Java application without adopting a heavier framework to expose a few tools. Its concrete strengths are:

Direct access to sync and async server APIs
Built-in support for stateful and stateless server patterns
Lightweight STDIO transport with non-blocking message processing
Pluggable JSON serialization
Pluggable authorization hooks
DNS rebinding protection via Host and Origin validation
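To make the last item more concrete: DNS rebinding protection generally comes down to rejecting HTTP requests whose Host or Origin header is not on an allow-list. The following framework-free sketch illustrates that idea only. The class name HostOriginValidator and the allow-listed values are hypothetical, and this is not how the SDK itself implements the check.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the MCP Java SDK: illustrates allow-listing
// the Host and Origin headers to mitigate DNS rebinding.
final class HostOriginValidator {

    private final Set<String> allowedHosts;   // e.g. "localhost:8080"
    private final Set<String> allowedOrigins; // e.g. "http://localhost:8080"

    HostOriginValidator(Set<String> allowedHosts, Set<String> allowedOrigins) {
        this.allowedHosts = allowedHosts;
        this.allowedOrigins = allowedOrigins;
    }

    // Returns true only if the Host header is allow-listed and the Origin
    // header, when present, is allow-listed too.
    boolean isAllowed(String hostHeader, String originHeader) {
        if (hostHeader == null
                || !allowedHosts.contains(hostHeader.toLowerCase(Locale.ROOT))) {
            return false;
        }
        if (originHeader == null) {
            return true; // non-browser clients usually send no Origin header
        }
        return allowedOrigins.contains(normalize(originHeader));
    }

    // Reduces an Origin value to lower-case scheme://host[:port].
    private static String normalize(String origin) {
        try {
            URI uri = URI.create(origin.trim());
            if (uri.getScheme() == null || uri.getHost() == null) {
                return "";
            }
            String port = uri.getPort() == -1 ? "" : ":" + uri.getPort();
            return (uri.getScheme() + "://" + uri.getHost() + port)
                    .toLowerCase(Locale.ROOT);
        } catch (IllegalArgumentException e) {
            return ""; // a malformed Origin never matches the allow-list
        }
    }
}
```

A server bound to localhost would typically run a check like this before handing the request to the MCP transport, and answer with an HTTP error when it fails.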

The SDK docs give you the primitives, but they do not offer the same batteries-included experience you get with a framework integration. You are the one deciding how to package the server, how to integrate authentication, how to structure dependency injection, how to wire validation, how to expose metrics, and how to test beyond the protocol layer.

Security

It has low-level security hooks. The documentation explicitly mentions pluggable authorization hooks and DNS rebinding protection. Streamable HTTP transports also support security validation. If you need OAuth2 resource server integration, role-based policies, token validation, or method-level authorization, you will either integrate these concerns yourself or use the surrounding platform to handle them.

Performance

The SDK docs describe the STDIO transport as lightweight and non-blocking. That is a useful signal, but the official documentation does not position the SDK as a performance-tuned application platform in the same way Quarkus does.

Best fit

Choose the official MCP Java SDK when:

You want maximum control over MCP internals.
You want the leanest abstraction layer.
You do not want a large framework dependency.

2. Spring AI

Spring AI wraps the official MCP Java SDK in the Spring Boot way: starters, auto-configuration, annotation scanning, and a declarative programming model.

Spring AI offers dedicated MCP Boot starters for STDIO, WebMVC, and WebFlux, plus support for SSE, Streamable HTTP, and stateless Streamable HTTP depending on the selected starter and configuration. On top of that, the MCP annotations module adds @McpTool, @McpResource, @McpPrompt, and @McpComplete, with automatic JSON schema generation and request-context injection.

Strengths

Spring AI’s biggest strength is developer productivity inside the Spring ecosystem:

Boot starters for transport selection and auto-configuration
Annotation-based registration of tools, resources, prompts, and completions
Automatic bean scanning and registration
Support for sync and async modes
Support for stateful and stateless server modes
Access to Spring’s existing ecosystem for configuration, packaging, deployment, security, and operations

Security

There is a dedicated MCP Security module that adds OAuth2 resource server support, API key authentication, an MCP-oriented authorization server, and fine-grained access control for tools and resources. The docs also show the direct use of Spring Security’s @PreAuthorize method, and tool methods can access the current authentication through SecurityContextHolder. However, the current documentation also contains important caveats:

The security module is marked work in progress.
It is community-driven rather than officially endorsed by Spring AI or the MCP project.
It is compatible only with Spring WebMVC-based servers.
This security module does not support deprecated SSE.
This security module does not support WebFlux-based servers.
Opaque tokens are not supported, and the docs recommend JWT.

Performance

The Spring MCP docs emphasize flexibility, starter-based integration, multiple transports, and declarative annotations. They do not position MCP support around low-memory or instant-startup claims.

That means performance in Spring AI MCP is less a property of the MCP layer and more a property of how you run Spring Boot. In practice, that usually means a heavier runtime footprint than the raw SDK and a less aggressively optimized story than Quarkus native.

Best fit

Choose Spring AI to implement an MCP server when:

Your team already builds Java backends with Spring Boot.
You want the shortest path from annotated Java methods to an MCP server.
You want strong alignment with Spring Security.

3. Quarkus

Quarkus MCP Server is the most opinionated and operationally ambitious option of the four. The documentation is strong. It does not just explain how to expose tools. It also covers performance, build-time validation, multiple isolated servers, security, observability, testing, and guardrails.

Strengths

Quarkus is designed around build-time analysis, and the MCP extension follows that philosophy. The docs state that tools, resources, and prompts are discovered and validated at build time, with zero reflection, and that native executables can start in milliseconds and run on around 30 MB of RAM. Concrete strengths pointed out by its documentation are:

Multiple transports including STDIO, HTTP, SSE, and WebSocket
Full CDI integration
Reactive execution on Vert.x
Support for virtual threads
Multiple isolated MCP servers in a single application
Different endpoints, feature sets, and security policies per server
Micrometer-based metrics integration
A dedicated testing library called McpAssured
Validation integration through Hibernate Validator
Tool guardrails that run before invocation
OIDC and OAuth2 protected resource metadata support for MCP authorization workflows

Observability is documented through Micrometer integration with metrics for active connections, request durations, and success and failure rates. Testing is documented through McpAssured, which supports SSE, Streamable HTTP, WebSocket, and STDIO. Input validation can be pushed into Jakarta Bean Validation and reflected into the tool schema generation. Guardrails can reject or transform tool inputs before they are invoked.
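The guardrail idea can be pictured without any framework: a chain of checks runs before the tool, and each check either passes on a (possibly transformed) input or rejects the call. The sketch below is a plain-Java illustration under that assumption. GuardedTool and its two example guardrails are hypothetical names, not the Quarkus API.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Framework-free illustration; these types are hypothetical, not the Quarkus API.
final class GuardedTool {

    private final List<UnaryOperator<String>> guardrails;
    private final Function<String, String> tool;

    GuardedTool(List<UnaryOperator<String>> guardrails, Function<String, String> tool) {
        this.guardrails = List.copyOf(guardrails);
        this.tool = tool;
    }

    // Runs every guardrail in order. A guardrail may return a transformed input
    // or throw IllegalArgumentException to reject the call before the tool runs.
    String invoke(String input) {
        String current = input;
        for (UnaryOperator<String> guardrail : guardrails) {
            current = guardrail.apply(current);
        }
        return tool.apply(current);
    }

    // Example guardrail that transforms: strips surrounding whitespace.
    static String trim(String url) {
        return url.strip();
    }

    // Example guardrail that rejects: only https URLs are let through.
    static String requireHttps(String url) {
        if (!url.startsWith("https://")) {
            throw new IllegalArgumentException("Only https URLs are allowed: " + url);
        }
        return url;
    }
}
```

With the guardrails trim and requireHttps in front of a navigation tool, an input such as "  https://example.com " reaches the tool trimmed, while "http://example.com" is rejected before the tool is invoked.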

Security

Instead of bolting on a separate MCP security library, Quarkus provides a platform-level approach by integrating MCP security concerns into the same security infrastructure you already use for other Quarkus services.

HTTP transports can be secured through the Quarkus web security layer. Annotated methods can also use @Authenticated and @RolesAllowed. The docs strongly recommend restricting CORS to trusted origins for Streamable HTTP endpoints. For OIDC integration, the docs show support for audience validation and protected resource metadata discovery to align with the MCP authorization specification.

The one caveat documented explicitly is that, when authentication fails, method-level annotation security returns an MCP protocol error code rather than an HTTP status code.

Performance

Quarkus is the only one of the four whose MCP documentation makes strong, explicit runtime claims:

Native executables start in milliseconds
Production servers run on about 30 MB of RAM
The build-time model eliminates reflection
Vert.x provides non-blocking I/O
Virtual threads are also supported

Best fit

Choose Quarkus MCP Server when:

Performance, startup time, container density, and cold-start behavior are first-class requirements.
You want native-image-friendly deployment.
You need strong observability and testing out of the box.

4. Micronaut

Micronaut MCP is rooted in the official MCP Java SDK, as the guide explicitly states. It adds Micronaut-style configuration, dependency injection, transport-aware server creation, annotation-driven tool definitions, and compile-time JSON Schema generation using Micronaut JSON Schema.

At the time of this writing, the current doc is labeled version 0.0.20, suggesting this integration is earlier in its lifecycle than the other three.

Strengths

Micronaut MCP provides a higher-level programming model than the raw SDK, without moving as far into framework abstraction as Spring AI:

Configuration-driven transport selection for STDIO and HTTP
Sync and reactive modes
SDK server instance selection based on transport and reactive mode
@Tool, @Prompt, @Resource, and template annotations
Micronaut-specific transport context with access to authenticated user, locale, and host
Compile-time generation of tool input and output JSON Schema
Factory-based definitions as an alternative to annotations
Helper support for SearchTool and FetchTool implementations

Security

Micronaut MCP is the least explicit of the four on security in its current guide. The guide exposes MicronautMcpTransportContext, which provides access to the authenticated user and related request information. But unlike Spring and Quarkus, the MCP guide currently lacks a dedicated security section with concrete authentication and authorization patterns for MCP endpoints.

Performance

Micronaut, as a framework, is generally associated with low startup overhead and strong compile-time processing. Still, the Micronaut MCP guide itself does not make the same explicit runtime claims as Quarkus does.

What the guide does document is compile-time schema generation and a design closely tied to the MCP Java SDK. That suggests a reasonably efficient stack, but not enough to claim superiority without measurement.

Best fit

Choose Micronaut MCP when:

Your team already prefers Micronaut.
You want a lighter framework style than Spring.
You like compile-time schema generation and explicit SDK alignment.
You are comfortable accepting a younger MCP integration with some current feature gaps.

Head-to-head comparison

Reducing the comparison to one sentence per option, the picture looks like this: the official Java SDK is the protocol-first choice, Spring AI is the enterprise-default choice, Quarkus is the production-optimization choice, and Micronaut is the lightweight and promising choice.

From an abstraction perspective, the official SDK sits closest to the protocol. You define capabilities explicitly, wire handlers yourself, and stay very close to the transport and schema models. Spring AI and Quarkus move higher up the stack with annotation-driven programming models and framework integration. Micronaut also raises the abstraction level, but in a slightly more restrained way, staying closely aligned with the official SDK while adding dependency injection and compile-time schema generation.

From an operational perspective, the differences become even clearer. Spring AI benefits from the broader Spring Boot ecosystem and is probably the safest recommendation for an average enterprise Java team that already runs Spring services in production. Quarkus stands out for its documented focus on startup time, native-image friendliness, observability, validation, and testing. Micronaut looks attractive to teams that want a lighter framework and strong compile-time processing, but its MCP integration still seems newer than the other options. The raw SDK remains the most direct and portable implementation, but it also asks more from you in terms of packaging, security integration, and platform concerns.

Case study: basic Selenium MCP server

For the case study, I wanted something concrete, easy to understand, and representative of real tool integration. Selenium was a natural fit, as it is a browser automation library that provides a clear mapping between ordinary Java logic and MCP-exposed tools.

This example was implemented as a part of my book Context engineering: the art and science of shaping context-aware AI systems, to be published by Manning in 2026. Find the complete source code in the open-source companion GitHub repository for this book.

The Selenium server in this talk exposes four simple tools: open_browser, navigate_url, get_browser_text, and close_browser. That is intentionally small, but it is enough to demonstrate the whole pattern. An AI agent can request a browser session, navigate to a page, extract the visible text, and close the session. In other words, we take a familiar browser automation workflow and wrap it in an MCP contract that any compatible client can call.
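On the wire, MCP messages are JSON-RPC 2.0, and a tool invocation is a tools/call request carrying the tool name and its arguments. As a rough sketch, the request for navigate_url could be rendered as shown below. The helper class ToolCallRequests and the argument name url are assumptions made for illustration; the actual input schema of the tool is defined in the companion repository.

```java
// Hypothetical helper that renders an MCP "tools/call" request (JSON-RPC 2.0).
// The argument name "url" is an assumption made for illustration.
final class ToolCallRequests {

    private ToolCallRequests() {
    }

    static String navigateUrl(int id, String url) {
        return String.format(
                "{\"jsonrpc\":\"2.0\",\"id\":%d,\"method\":\"tools/call\","
                        + "\"params\":{\"name\":\"navigate_url\","
                        + "\"arguments\":{\"url\":\"%s\"}}}",
                id, escape(url));
    }

    // Minimal JSON string escaping: handles backslashes and double quotes only.
    private static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

In practice the MCP client library builds and sends this message; the point of the sketch is only that every compatible client talks to the server through the same small contract.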

The design lesson is more important than the four tools themselves: keep the business logic separate from the MCP adapter. That separation appears as a simple layered model. At the top, there is the AI agent. Then comes the tool contract. Then the MCP adapter layer, implemented with the Java SDK, Spring, Quarkus, or Micronaut. Finally, the browser manager contains the Selenium code. Once you structure the application that way, the core automation logic stays stable while the framework-facing adapter becomes replaceable.
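The layering described above can be sketched in plain Java. All names here (BrowserManager, FakeBrowserManager, ToolAdapter) are hypothetical, and the in-memory fake stands in for the Selenium code so the sketch runs without a browser. It is not the code from the companion repository.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Business capability: knows nothing about MCP. In the real project this role
// is played by the Selenium-backed browser manager.
interface BrowserManager {
    String open();
    String navigate(String url);
    String text();
    String close();
}

// In-memory stand-in so the sketch runs without a real browser.
final class FakeBrowserManager implements BrowserManager {
    private String currentUrl = "about:blank";

    public String open() { return "browser opened"; }
    public String navigate(String url) { currentUrl = url; return "navigated to " + url; }
    public String text() { return "text of " + currentUrl; }
    public String close() { return "browser closed"; }
}

// Adapter layer: maps tool names onto the business capability. Moving between
// the SDK, Spring, Quarkus, or Micronaut would replace only this class.
final class ToolAdapter {
    private final Map<String, Function<Map<String, String>, String>> tools =
            new LinkedHashMap<>();

    ToolAdapter(BrowserManager browser) {
        tools.put("open_browser", args -> browser.open());
        tools.put("navigate_url", args -> browser.navigate(args.get("url")));
        tools.put("get_browser_text", args -> browser.text());
        tools.put("close_browser", args -> browser.close());
    }

    // Dispatches a tool call by name; unknown tools are rejected.
    String call(String toolName, Map<String, String> arguments) {
        Function<Map<String, String>, String> tool = tools.get(toolName);
        if (tool == null) {
            throw new IllegalArgumentException("Unknown tool: " + toolName);
        }
        return tool.apply(arguments);
    }
}
```

Because the adapter depends only on the BrowserManager interface, the same business logic can be unit-tested with the fake and exposed through whichever MCP framework the team prefers.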

Across the four implementations, the user-facing tool surface and the Selenium logic are essentially the same. What changes is the amount of boilerplate, the annotation model, the lifecycle integration, and the surrounding platform capabilities. The Java SDK version is explicit and low-level. The Spring version is concise and familiar for Spring Boot teams. The Quarkus version feels strongly integrated into the platform model. The Micronaut version is similarly compact, with a reactive flavor and compile-time style.

For inspection and manual testing, I used the MCP Inspector to launch the packaged server and interactively exercise the tools.

Conclusions

There is no single best way to implement an MCP server in Java. There are four good options, and each one is best when its design assumptions match your context.

Choose the official MCP Java SDK for the most direct, portable, and protocol-centric implementation. Choose Spring AI when your team already lives in Spring Boot and wants the safest path from annotated Java methods to an MCP server. Choose Quarkus when production features such as performance, security integration, observability, validation, testing, and native deployment are first-class concerns. Choose Micronaut when you prefer a lighter framework style, value compile-time processing, and are comfortable adopting a younger but promising MCP integration.

The Selenium case study reinforces the main architectural takeaway: the smartest long-term design is to keep the business capability independent from the MCP framework layer. Once you do that, the protocol adapter becomes a replaceable detail, and the decision between SDK, Spring, Quarkus, and Micronaut becomes a question of ergonomics and operations rather than a rewrite of your core logic.

WebDriver BiDi: The Future of Browser Automation is Now

Boni García — Tue, 21 Oct 2025 06:20:01 GMT

This story summarizes a talk given by Boni García at the Quality Beacon conference in Copenhagen, Denmark, on October 21, 2025. You can find the original slides of this talk here.

What Is Browser Automation?

Browser automation means controlling a web browser using code — for testing, scraping, or performing repetitive tasks. Traditionally, tools like Selenium WebDriver, Puppeteer, Cypress, and Playwright have powered this automation revolution. But each came with different architectures, protocols, and quirks. WebDriver BiDi is changing that.

Enter WebDriver BiDi

WebDriver BiDi (short for bidirectional) is a W3C standard-in-progress that enables two-way, real-time communication between your automation scripts and the browser. Unlike the classic WebDriver (which used HTTP in a request-response model), BiDi uses WebSockets, allowing the browser to push events like network requests, console logs, or JavaScript exceptions directly to your code. In short:

🦾 More reliable automation
🌐 Better browser control
📃 One unified standard
⚒ Simpler tooling and fewer hacks

Specification: https://www.w3.org/TR/webdriver-bidi/

Why WebDriver BiDi Matters

Traditional W3C WebDriver works like a walkie-talkie: your code sends a command, waits, and gets a response. WebDriver BiDi, on the other hand, is like a live conversation — both sides talk freely. That means we can now:

Capture console logs in real time.
Intercept and modify network traffic.
React instantly to browser events.
Simulate complex user input with more precision.

BiDi merges two worlds:

The reliability of the W3C WebDriver protocol.
The real-time capabilities of the Chrome DevTools Protocol (CDP).
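The difference between the two interaction styles can be sketched with a toy model. FakeBrowser below is a hypothetical stand-in, not a Selenium or WebDriver BiDi API: executeCommand mimics the classic request-response exchange, while onConsoleEntry and emitConsoleEntry mimic a subscription on which the browser pushes events.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a browser connection; not a Selenium or BiDi API.
final class FakeBrowser {

    private final List<Consumer<String>> consoleListeners = new ArrayList<>();

    // Classic style: the client asks and the browser answers exactly once.
    String executeCommand(String command) {
        return "result of " + command;
    }

    // Bidirectional style, step 1: the client subscribes once.
    void onConsoleEntry(Consumer<String> listener) {
        consoleListeners.add(listener);
    }

    // Bidirectional style, step 2: the browser pushes each event as it happens,
    // without the client having to poll for it.
    void emitConsoleEntry(String message) {
        consoleListeners.forEach(listener -> listener.accept(message));
    }
}
```

In the classic model, a test that needs console output has to keep asking for it. In the push model, it registers a listener up front and simply collects whatever the browser reports.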

Selenium

Selenium WebDriver (often known as simply Selenium) is a multilanguage browser automation library. Selenium’s architecture is based on the W3C WebDriver standard, which defines a protocol for browser communication using JSON over HTTP.

Selenium has supported BiDi since version 4, with high-level APIs coming in Selenium 5. As a result, both classic WebDriver and BiDi-based automation are currently possible with Selenium.

To enable BiDi in Selenium, set the corresponding option when creating the driver:

@BeforeEach
void setup() {
    ChromeOptions options = new ChromeOptions();
    options.enableBiDi();
    driver = new ChromeDriver(options);
}

Once enabled, you can interact with BiDi modules like BrowsingContext, Input, Network, and Log.

Example: Browsing Context

@Test
void testBrowsing() {
    BrowsingContext context = new BrowsingContext(driver, driver.getWindowHandle());
    context.navigate("https://bonigarcia.dev/selenium-webdriver-java/");
    String screenshot = context.captureScreenshot(); // Base64 image
    assertThat(screenshot).isNotBlank();
}

Example: Listening to Logs

@Test
void testLog() {
    List<ConsoleLogEntry> logs = new ArrayList<>();
    // Keep the inspector open while the page loads so console events are still delivered
    try (LogInspector inspector = new LogInspector(driver)) {
        inspector.onConsoleEntry(logs::add);
        driver.get("https://bonigarcia.dev/selenium-webdriver-java/console-logs.html");
        new WebDriverWait(driver, Duration.ofSeconds(5)).until(_ -> logs.size() > 3);
    }
    logs.forEach(log -> System.out.println(log.getText()));
}

Puppeteer

Puppeteer is a Node.js browser automation library created and maintained by the Chrome DevTools team at Google since 2017. Initially tied to CDP, Puppeteer has supported BiDi for Firefox since v23 and uses it as the default protocol from Puppeteer 24 onward.

Example: Hello World with Puppeteer and BiDi

const puppeteer = require('puppeteer');

describe('Hello World with Puppeteer and BiDi', () => {
  let browser, page;

  beforeAll(async () => {
    browser = await puppeteer.launch({
      browser: 'firefox',
      protocol: 'webDriverBiDi',
    });
    page = await browser.newPage();
  });

  it('checks page title', async () => {
    await page.goto('https://bonigarcia.dev/selenium-webdriver-java/');
    const title = await page.title();
    expect(title).toContain('Selenium WebDriver');
  });

  afterAll(async () => await browser.close());
});

Cypress

Cypress is a JavaScript end-to-end automated testing framework. The company behind it was created in 2014 to provide a seamless experience for automated web testing.

As of Cypress 14.1.0 (February 2025), Firefox automation runs over WebDriver BiDi by default. By Cypress 15 (August 2025), CDP support in Firefox was completely dropped.

Users don’t need to change a thing — BiDi runs transparently beneath the familiar Cypress API:

describe('Hello World Cypress', () => {
  it('checks title', () => {
    cy.visit('https://bonigarcia.dev/selenium-webdriver-java/');
    cy.title().should('include', 'Selenium WebDriver');
  });
});

Playwright

Playwright is a multilanguage end-to-end automated testing framework maintained by Microsoft since 2020. Playwright maintains patched versions of Chromium, Firefox, and WebKit (to enable automation and cross-browser consistency). Playwright uses an extended version of CDP to control these browsers uniformly.

Playwright is working toward BiDi integration (see issue). Although support is still experimental, the transition to WebDriver BiDi will likely happen automatically in a future version of Playwright once the protocol is mature enough to support all of Playwright’s features.

Conclusions

WebDriver BiDi is an in-progress W3C standard for the next generation of browser automation
It combines the stability of WebDriver with the power of CDP, offering a single, standard way to automate browsers
Major tools like Selenium, Puppeteer, Cypress, and Playwright are actively integrating WebDriver BiDi
Selenium: BiDi low-level features available in Selenium 4, high-level API is in development (planned for Selenium 5)
Puppeteer: BiDi support for Firefox since v23
Cypress: BiDi support for Firefox since v14.1.0
Playwright: BiDi support is still experimental

How to track the evolution of WebDriver BiDi?

JUnit vs. TestNG: Which Framework Fits Your Testing Strategy?

Boni García — Tue, 14 Oct 2025 11:07:11 GMT

This story summarizes a talk given by Boni García at the JavaCro’25 conference in Rovinj, Croatia, on October 14, 2025. You can find the original slides of this talk here.

Introduction

A unit testing framework is a tool that provides structure and reusable components to write, organize, and run automated tests for given pieces of code, ensuring they behave as expected. This story reviews two of Java’s most popular unit testing frameworks: JUnit and TestNG.

JUnit

Created by Kent Beck and Erich Gamma in 1999, JUnit quickly became the de facto testing library for Java. Its evolution is as follows:

JUnit 4: widespread adoption; annotation-based model (@Test, @Before, etc.)
JUnit 5 (2017): major redesign including the JUnit Platform, the foundation of the JUnit 5 testing framework, providing the launching infrastructure for running tests and defining the API for test engines (like JUnit Jupiter or Vintage) to discover and execute tests.
JUnit 6 (2025): the latest release (September 30, 2025), requiring Java 17 or later and cleaning up deprecated APIs

TestNG

Created by Cédric Beust in 2004, TestNG was designed to fix some of JUnit’s early limitations. Its key features include:

Grouping and filtering via annotations
Native parallel execution
Built-in data providers for parameterized tests

Comparison

In this story, we’ll look at seven areas side-by-side:

Test lifecycle (basics)
Parameterized tests
Categorizing & filtering tests
Conditional test execution
Ordering tests
Parallel execution
Advanced test lifecycle

As a real use case, this comparison will be done using Selenium to illustrate the key differences between JUnit and TestNG. Selenium is a browser automation library, not a testing framework. However, the main use case of Selenium is end-to-end automated testing. For this reason, it is typically used in conjunction with a unit testing framework like JUnit or TestNG.

This story results from the work on developing examples for two books: Mastering Software Testing with JUnit 5 (Packt Publishing, 2017) and Hands-On Selenium WebDriver with Java (O’Reilly Media, 2022). The test examples presented here are available in the following open-source repositories: mastering-junit5 and selenium-webdriver-java.

1. Test lifecycle (basics)

The test lifecycle is the sequence of steps a testing framework follows to set up the test fixture (initial state), execute the test(s), and clean up afterward. The following picture shows that the basic test lifecycle is similar in JUnit and TestNG, differing mainly in the names of the annotations.

Example #1.1: basic test with Selenium — JUnit

import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

class HelloWorldSeleniumJupiterTest {

    WebDriver driver;

    @BeforeEach
    void setup() {
        driver = new ChromeDriver();
    }

    @Test
    void test() {
        // Test logic
    }

    @AfterEach
    void teardown() {
        driver.quit();
    }
}

Example #1.2: basic test with Selenium — TestNG

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.annotations.AfterMethod;
import org.testng.annotations.BeforeMethod;
import org.testng.annotations.Test;

public class HelloWorldSeleniumNGTest {

    WebDriver driver;

    @BeforeMethod
    public void setup() {
        driver = new ChromeDriver();
    }

    @Test
    public void test() {
        // Test logic
    }

    @AfterMethod
    public void teardown() {
        driver.quit();
    }
}

2. Parameterized tests

A parameterized test is a test that runs multiple times with different input values, allowing us to reuse the same logic across varied datasets.

To implement a parameterized test in JUnit, we need to:

Use @ParameterizedTest (instead of @Test) or @ParameterizedClass (in addition to @Test)
Use an argument provider (to define the dataset)

There are two ways of implementing parameterized tests in TestNG:

Using @DataProvider (most common and scalable)
Using @Parameters + testng.xml (dataset in <parameter>)

Example #2.1: data-driven test case with Selenium — JUnit

class LoginJupiterTest extends BrowserParent {

    // Each Arguments entry is one invocation: username, password, expected text
    static Stream<Arguments> loginData() {
        return Stream.of(Arguments.of("user", "user", "Login successful"),
                Arguments.of("bad-user", "bad-passwd", "Invalid credentials"));
    }

    @ParameterizedTest
    @MethodSource("loginData")
    void testLogin(String username, String password, String expectedText) {
        // Test logic
    }
}

Example #2.2: data-driven test case with Selenium — TestNG

public class LoginNGTest extends BrowserParent {

    @DataProvider(name = "loginData")
    public static Object[][] data() {
        return new Object[][] { { "user", "user", "Login successful" },
                { "bad-user", "bad-passwd", "Invalid credentials" } };
    }

    @Test(dataProvider = "loginData")
    public void testLogin(String username, String password, String expectedText) {
        // Test logic
    }
}

Example #3.1: cross-browser testing with Selenium — JUnit

@ParameterizedClass
@ArgumentsSource(CrossBrowserProvider.class)
class CrossBrowserParent {

    @Parameter
    WebDriver driver;

    @AfterEach
    void teardown() {
        driver.quit();
    }
}

class CrossBrowserProvider implements ArgumentsProvider {

    @Override
    public Stream<? extends Arguments> provideArguments(
            ExtensionContext context) {
        ChromeDriver chrome = new ChromeDriver();
        FirefoxDriver firefox = new FirefoxDriver();

        return Stream.of(Arguments.of(chrome), Arguments.of(firefox));
    }
}

class CrossBrowserJUnitTest extends CrossBrowserParent {

    @Test
    void test() {
        // Test logic
    }
}

Example #3.2: cross-browser testing with Selenium — TestNG

public class CrossBrowserNGTest extends CrossBrowserParent {

    @Test(dataProvider = "browserProvider")
    public void test(WebDriver driver) {
        this.driver = driver;

        // Test logic
    }
}

public class CrossBrowserParent {

    WebDriver driver;

    @DataProvider(name = "browserProvider")
    public static Object[][] data() {
        ChromeDriver chrome = new ChromeDriver();
        FirefoxDriver firefox = new FirefoxDriver();

        return new Object[][] { { chrome }, { firefox } };
    }

    @AfterMethod
    void teardown() {
        driver.quit();
    }
}

3. Categorizing & filtering tests

Categorizing and filtering allows us to group tests into categories and run only the ones that match specific criteria.

In JUnit, test classes and methods can be tagged using @Tag. In TestNG, test methods can be grouped using the groups attribute of @Test. Those categories (tags or groups) can later be used to filter test discovery and execution.

Example #5.1: grouping Selenium tests — JUnit

class CategoriesJUnitTest extends BrowserParent {

    @Test
    @Tag("WebForm")
    void testCategoriesWebForm() {
        // Test logic
    }

    @Test
    @Tag("HomePage")
    void testCategoriesHomePage() {
        // Test logic
    }
}

mvn test -Dgroups=HomePage

gradle test -Pgroups=HomePage

Example #5.2: grouping Selenium tests — TestNG

public class CategoriesNGTest extends BrowserGroupsParent {

    @Test(groups = { "WebForm" })
    public void testCategoriesWebForm() {
        // Test logic
    }

    @Test(groups = { "HomePage" })
    public void testCategoriesHomePage() {
        // Test logic
    }
}

mvn test -Dtest=CategoriesNGTest -DexcludedGroups=HomePage

gradle test --tests CategoriesNGTest -PexcludedGroups=HomePage

4. Conditional test execution

Conditional test execution allows us to enable or skip tests based on predefined conditions. JUnit provides a rich set of built-in annotations for skipping tests (@Disabled and others). We can also use Assumptions to skip tests at runtime. TestNG provides the @Ignore annotation and attributes in @Test (e.g., enabled=false) to disable tests. We can also throw SkipException to skip tests at runtime.

The JUnit annotations for disabling tests are the following:

Example #6.1: skipping tests (I) — JUnit

class DisabledJupiterTest {
    @Disabled("Optional reason for disabling")
    @Test
    public void testDisabled1() {
        // Test logic
    }
    @DisabledOnJre(JAVA_17)
    @Test
    public void testDisabled2() {
        // Test logic
    }
    @EnabledOnOs(MAC)
    @Test
    public void testDisabled3() {
        // Test logic
    }

}

Example #6.2: skipping tests (I) — TestNG

public class DisabledNGTest {
     @Ignore("Optional reason for disabling")
    @Test
    public void testDisabled1() {
        // Test logic
    }
     @Test(enabled = false)
    public void testDisabled2() {
        // Test logic
    }
 }

Example #7.1: skipping tests (II) — JUnit

class ConditionalJupiterTest {
     @Test
    public void testConditional() {
        boolean condition = false; // runtime condition
        Assumptions.assumeTrue(condition);
         // Test logic
    }
}

Example #7.2: skipping tests (II) — TestNG

public class ConditionalNGTest {
     @Test
    public void testConditional() {
        boolean condition = false; // runtime condition
        if (!condition) {
            throw new SkipException("Skipping test");
        }
         // Test logic
    }
 }

5. Ordering tests

Ordering tests is used to control the sequence in which tests are executed. The default order of test execution is:

In JUnit, tests are run in a deterministic but intentionally nonobvious order. The order is not guaranteed, since tests are meant to be independent of each other.
In TestNG, tests are run in alphabetical order by method name.

To change this behavior:

In JUnit, we use @TestMethodOrder with @Order.
In TestNG, we use priority and dependsOnMethods in @Test, or the class order in testng.xml.

Example #8.1: reuse the same browser to run tests in a given order — JUnit

@TestInstance(Lifecycle.PER_CLASS)
@TestMethodOrder(OrderAnnotation.class)
class OrderJunitTest {
    WebDriver driver;

    @BeforeAll
    void setup() {
        driver = new ChromeDriver();
    }

    @Test
    @Order(1)
    void testA() {
        // Test logic
    }

    @Test
    @Order(2)
    void testB() {
        // Test logic
    }

    @AfterAll
    void teardown() {
        driver.quit();
    }
}

Example #8.2: reuse the same browser to run tests in a given order — TestNG

public class OrderNGTest {
    WebDriver driver;

    @BeforeClass
    public void setup() {
        driver = new ChromeDriver();
    }

    @Test(priority = 1)
    public void testA() {
        // Test logic
    }

    @Test(priority = 2)
    public void testB() {
        // Test logic
    }

    @AfterClass
    public void teardown() {
        driver.quit();
    }
 }

6. Parallel execution

Parallel test execution allows us to run multiple tests simultaneously to speed up execution. JUnit provides different configuration parameters to run tests in parallel:

junit.jupiter.execution.parallel.enabled (to enable test parallelism)
junit.jupiter.execution.parallel.mode.classes.default (to run test classes in parallel)
junit.jupiter.execution.parallel.mode.default (to run test methods in parallel)

These parameters can be specified using a configuration file or at runtime through annotations (@Execution).

On the other hand, TestNG enables parallelism using the testng.xml config file.

Example #9.1: run Selenium tests in parallel — JUnit

junit.jupiter.execution.parallel.enabled = true
junit.jupiter.execution.parallel.mode.default = concurrent
junit.jupiter.execution.parallel.mode.classes.default = same_thread

@Execution(ExecutionMode.CONCURRENT)
class Parallel1JupiterTest extends BrowserParent {
    @Test
    void testParallel1() {
        // Test logic
    }
 }

@Execution(ExecutionMode.CONCURRENT)
class Parallel2JupiterTest extends BrowserParent {
    @Test
    void testParallel2() {
        // Test logic
    }
 }

Example #9.2: run Selenium tests in parallel — TestNG

7. Advanced test lifecycle

In JUnit 5+, the extension model provides comprehensive capabilities to customize and hook into the test lifecycle at various points.

The following diagram shows the JUnit 5+ test execution lifecycle and the order in which user-defined annotations and extension callbacks run. Callbacks are extension hooks, and annotations are user code methods executed around each test lifecycle.

On the other hand, TestNG provides a rich set of listeners to intercept lifecycle events for tests and suites:

Example #10.1: retrying Selenium tests (to detect flakiness) — JUnit

@ExtendWith(RetryExtension.class)
class RandomCalculatorJupiterTest extends BrowserParent {
    @Test
    void testRandomCalculator() {
        // Test logic
    }
 }

class RandomCalculatorJupiterTest extends BrowserParent {

    @RegisterExtension
    Extension failureWatcher = new RetryExtension(5);

    @Test
    void testRandomCalculator() {
        // Test logic
    }
 }

public class RetryExtension implements TestExecutionExceptionHandler {
    static final int DEFAULT_MAX_RETRIES = 3;
    final AtomicInteger retryCount = new AtomicInteger(1);
    final AtomicInteger maxRetries = new AtomicInteger(DEFAULT_MAX_RETRIES);

    public RetryExtension() {
        // Default constructor
    }

    public RetryExtension(int maxRetries) {
        this.maxRetries.set(maxRetries);
    }

    @Override
    public void handleTestExecutionException(ExtensionContext extensionContext,
            Throwable throwable) throws Throwable {
        // Manage throwable depending on the retry count
    }
 }
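
The handler body is elided above. As a framework-independent illustration of the bookkeeping it hints at, the following minimal sketch runs a test body and repeats it while a retry budget remains. RetrySketch and runWithRetries are hypothetical names, not part of JUnit or of the original RetryExtension, and the sketch assumes that a failing test throws an unchecked exception or an AssertionError.

```java
/** Framework-independent sketch of retry bookkeeping (hypothetical helper). */
final class RetrySketch {

    /**
     * Runs the test once and, on failure, up to maxRetries additional times.
     * Returns the number of attempts used, or rethrows the last failure
     * when the retry budget is exhausted.
     */
    static int runWithRetries(Runnable test, int maxRetries) {
        int attempt = 1;
        while (true) {
            try {
                test.run();
                return attempt; // the test passed on this attempt
            } catch (RuntimeException | AssertionError failure) {
                if (attempt > maxRetries) {
                    throw failure; // no retries left: report the failure
                }
                System.err.println("Attempt #" + attempt + " failed: " + failure);
                attempt++;
            }
        }
    }
}
```

With maxRetries = 3, a test that always fails is attempted four times (one initial run plus three retries) before the last failure is reported, which mirrors the retryCount <= maxRetries check used in the TestNG analyzer of Example #10.2.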

Example #10.2: retrying Selenium tests (to detect flakiness) — TestNG

public class RandomCalculatorNGTest extends BrowserParent {
    @Test(retryAnalyzer = RetryAnalyzer.class)
    @Retry(5)
    public void testRandomCalculator() {
        // Test logic
    }
 }

@Retention(RetentionPolicy.RUNTIME)
public @interface Retry {
    int value();
}

public class RetryAnalyzer implements IRetryAnalyzer {
    static final int DEFAULT_MAX_RETRIES = 3;
    final AtomicInteger retryCount = new AtomicInteger(1);

    @Override
    public boolean retry(ITestResult result) {
        Method method = result.getMethod().getConstructorOrMethod().getMethod();
        int maxRetries = DEFAULT_MAX_RETRIES;
        if (method.isAnnotationPresent(Retry.class)) {
            Retry retry = method.getAnnotation(Retry.class);
            maxRetries = retry.value();
        }
        if (retryCount.get() <= maxRetries) {
            logError(result.getThrowable());
            retryCount.incrementAndGet();
            return true;
        }
        return false;
    }

    private void logError(Throwable e) {
        System.err.println("Attempt test execution #" + retryCount.get()
                + " failed (" + e.getClass().getName() + " thrown): "
                + e.getMessage());
    }
 }

Example #11.1: gather data (e.g., browser screenshot) if test fails — JUnit

@ExtendWith(FailureWatcher.class)
class FailureJupiterTest extends BrowserParent {
    @Test
    void testFailure() {
        // Test logic
        fail("Forced error");
    }
 }

public class FailureWatcher implements TestExecutionExceptionHandler {
    @Override
    public void handleTestExecutionException(ExtensionContext context,
            Throwable throwable) throws Throwable {
         context.getTestInstance().ifPresent(testInstance -> {
            WebDriver driver = (WebDriver) SeleniumUtils
                    .getFieldFromTestInstance(testInstance, "driver");
            SeleniumUtils.getScreenshotAsFile(driver, context.getDisplayName());
        });
         throw throwable;
    }
}

Example #11.2: gather data (e.g., browser screenshot) if test fails — TestNG

public class FailureNGTest {
    WebDriver driver;

    @BeforeMethod
    public void setup() {
        driver = new ChromeDriver();
    }

    @AfterMethod
    public void teardown(ITestResult result) {
        if (result.getStatus() == ITestResult.FAILURE) {
            SeleniumUtils.getScreenshotAsFile(driver, result.getName());
        }
         driver.quit();
    }

    @Test
    public void testFailure() {
        // Test logic
        fail("Forced error");
    }
 }

Example #12.1: reporting test suite — JUnit

@ExtendWith(Reporter.class)
class Report1JupiterTest extends BrowserParent {
    @Test
    void testReport1() {
        // Test logic
    }
 }

@ExtendWith(Reporter.class)
class Report2JupiterTest extends BrowserParent {
    @Test
    void testReport2() {
        // Test logic
    }
 }

public class Reporter implements BeforeAllCallback, BeforeEachCallback,
        AfterTestExecutionCallback {
    static final String REPORT_NAME = "report-junit.html";
    ExtentReports report;
    ExtentTest test;

    @Override
    public void beforeAll(ExtensionContext context) throws Exception {
        Store store = context.getRoot()
                .getStore(ExtensionContext.Namespace.create(STORE_NAMESPACE));
        report = store.get(STORE_NAME, ExtentReports.class);
        if (report == null) {
            report = new ExtentReports();
            store.put(STORE_NAME, report);
             Runtime.getRuntime().addShutdownHook(new Thread(report::flush));
        }
        ExtentSparkReporter htmlReporter = new ExtentSparkReporter(REPORT_NAME);
        report.attachReporter(htmlReporter);
    }

    @Override
    public void beforeEach(ExtensionContext context) throws Exception {
        test = report.createTest(context.getDisplayName());
    }

    @Override
    public void afterTestExecution(ExtensionContext context) throws Exception {
        context.getTestInstance().ifPresent(testInstance -> {
            // Take screenshot 
            test.addScreenCaptureFromBase64String(screenshot);
        });
    }
 }

Example #12.2: reporting test suite — TestNG

@Listeners(Reporter.class)
public class Report1NGTest extends BrowserParent {
    @Test
    public void testReport1() {
        // Test logic
    }
 }

@Listeners(Reporter.class)
public class Report2NGTest extends BrowserParent {
    @Test
    public void testReport2() {
        // Test logic
    }
 }

public class Reporter implements ITestListener {
     static final String REPORT_NAME = "report-testng.html";
    ExtentReports report;
    ExtentTest test;

    @Override
    public void onStart(ITestContext context) {
        ITestListener.super.onStart(context);
        report = new ExtentReports();
        ExtentSparkReporter htmlReporter = new ExtentSparkReporter(REPORT_NAME);
        report.attachReporter(htmlReporter);
    }

    @Override
    public void onTestStart(ITestResult result) {
        ITestListener.super.onTestStart(result);
        test = report.createTest(result.getName());
    }

    @Override
    public void onTestSuccess(ITestResult result) {
        ITestListener.super.onTestSuccess(result);
        // Take screenshot 
        test.addScreenCaptureFromBase64String(screenshot);
    }

    @Override
    public void onFinish(ITestContext context) {
        ITestListener.super.onFinish(context);
        report.flush();
    }
 }

Conclusions

Both JUnit and TestNG provide a comprehensive programming model for developing advanced tests in Java. The similar aspects in JUnit and TestNG are the following:

Basic test lifecycle
Categorizing and filtering tests
Ordering tests
Parallel test execution

The strong points of JUnit are:

Parameterized tests
Conditional test execution
Extension model

The strong point of TestNG is:

Test listeners

Browser Automation with Java

Boni García — Mon, 13 Oct 2025 18:48:41 GMT

Browser automation is a key ingredient for end-to-end testing. At the JavaCro’25 conference, Boni García delivered a presentation offering a deep dive into two of the most popular browser automation tools available today: Selenium and Playwright. This article captures the essence of that talk, providing an overview for anyone looking to get started on browser automation in the Java ecosystem. You can get the original slides of this talk here.

What is Browser Automation?

Browser automation is the process of using software or scripts to control a web browser and perform tasks automatically, without manual human intervention. The primary use cases for browser automation include:

Test automation: This is the most common application, encompassing end-to-end testing to verify web applications.
Web scraping: Automating the extraction of large amounts of data from websites.
Automating repetitive tasks for web pages: Automating mundane tasks like filling out forms or generating reports from web interfaces.

The world of browser automation is rich with tools, each with its own philosophy and strengths. Some of the key players include Selenium, Playwright, Cypress, Puppeteer, TestCafe, and WebdriverIO. This story focuses on the most prominent choices for Java developers: Selenium and Playwright.

Selenium

Selenium is a browser automation library, and it has been considered the de facto standard for browser automation for many years. Selenium is:

Multi-language: Officially supported in Java, JavaScript, Python, .NET, and Ruby.
Cross-browser: Compatible with all major browsers like Chrome, Firefox, Safari, and Edge.
Open-source and community-driven since 2004.

Selenium’s architecture is based on the W3C WebDriver standard, which defines a protocol for browser communication using JSON over HTTP. This standards-based approach is a key strength, ensuring broad compatibility.
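
To make the idea of JSON over HTTP concrete, the following sketch builds (without sending) two requests defined by the W3C WebDriver specification: New Session (POST /session) and Navigate To (POST /session/{session id}/url). It uses only the JDK HTTP client types. The class name, the driver address, and the helper methods are assumptions for illustration; in practice, Selenium creates and sends these requests for us.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Sketch of two W3C WebDriver commands as plain HTTP requests (hypothetical helper). */
final class WebDriverProtocolSketch {

    // Assumption: a driver (e.g., geckodriver) is listening on this local address
    static final String DRIVER_URL = "http://localhost:4444";

    /** Builds the "New Session" command: POST /session with the requested capabilities. */
    static HttpRequest newSession(String browserName) {
        // Note: values are concatenated without JSON escaping, which is enough for a sketch
        String body = "{\"capabilities\":{\"alwaysMatch\":{\"browserName\":\""
                + browserName + "\"}}}";
        return post("/session", body);
    }

    /** Builds the "Navigate To" command: POST /session/{session id}/url. */
    static HttpRequest navigateTo(String sessionId, String url) {
        return post("/session/" + sessionId + "/url", "{\"url\":\"" + url + "\"}");
    }

    private static HttpRequest post(String path, String json) {
        return HttpRequest.newBuilder(URI.create(DRIVER_URL + path))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(json, StandardCharsets.UTF_8))
                .build();
    }
}
```

Sending the first request to a running driver would return a JSON response containing the session id, which is then used in the path of the following commands.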

It’s important to understand that Selenium is not a testing framework because it doesn’t include a test runner, assertion library, or reporting features. For this, Selenium users rely on a rich ecosystem of tools like JUnit or TestNG (unit testing frameworks), AssertJ (fluent assertions), or Allure and ExtentReports (reporting), among others.

Selenium Manager

A significant recent improvement is Selenium Manager. Shipped with Selenium out of the box, it automatically discovers, downloads, and caches the browser drivers required for automation (e.g., chromedriver for Chrome, geckodriver for Firefox, etc.), greatly simplifying project setup. Consider the following test example. Internally, Selenium is using Selenium Manager to discover, download, and cache the needed driver, geckodriver, in this example:

import static org.assertj.core.api.Assertions.assertThat;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

class HelloWorldFirefoxJupiterTest {

    WebDriver driver;

    @BeforeEach
    void setup() {
        driver = new FirefoxDriver();
    }

    @AfterEach
    void teardown() {
        driver.quit();
    }

    @Test
    void test() {
        driver.get("https://bonigarcia.dev/selenium-webdriver-java/");
        assertThat(driver.getTitle()).contains("Selenium WebDriver");
    }

}

Moreover, Selenium Manager also automatically discovers, downloads, and caches the browsers driven with Selenium. For example, if Firefox is unavailable in the previous test, Selenium Manager will manage (discover, download, and cache) the latest version of Firefox. In addition, a specific browser version (including “beta”, “dev”, or “nightly”) can be requested, for example, as follows:

import static org.assertj.core.api.Assertions.assertThat;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

class ChromeVersionTest {

    WebDriver driver;

    @BeforeEach
    void setup() {
        ChromeOptions options = new ChromeOptions();
        options.setBrowserVersion("beta");
        driver = new ChromeDriver(options);
    }

    @Test
    void test() {
        driver.get("https://bonigarcia.dev/selenium-webdriver-java/");
        String title = driver.getTitle();
        assertThat(title).contains("Selenium WebDriver");
    }

    @AfterEach
    void teardown() {
        driver.quit();
    }

}

Selenium-Java Ecosystem

A key strength for developing end-to-end tests with Selenium and Java is the richness of their respective ecosystems. The following examples demonstrate how. For instance, to implement cross-browser testing (i.e., reuse the same test logic for different browsers), we can use the parameterized test support by JUnit, as follows (see complete example here):

import static org.assertj.core.api.Assertions.assertThat;
import org.junit.jupiter.api.Test;

class CrossBrowserTest extends CrossBrowserParent {

    @Test
    void test() {
        driver.get("https://bonigarcia.dev/selenium-webdriver-java/");
        assertThat(driver.getTitle()).contains("Selenium WebDriver");
    }

}

import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.params.Parameter;
import org.junit.jupiter.params.ParameterizedClass;
import org.junit.jupiter.params.provider.ArgumentsSource;
import org.openqa.selenium.WebDriver;

@ParameterizedClass
@ArgumentsSource(CrossBrowserProvider.class)
class CrossBrowserParent {

    @Parameter
    WebDriver driver;

    @AfterEach
    void teardown() {
        driver.quit();
    }

}

import java.util.stream.Stream;
import org.junit.jupiter.api.extension.ExtensionContext;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.ArgumentsProvider;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class CrossBrowserProvider implements ArgumentsProvider {

    @Override
    public Stream<? extends Arguments> provideArguments(
            ExtensionContext context) {
        ChromeDriver chrome = new ChromeDriver();
        FirefoxDriver firefox = new FirefoxDriver();

        return Stream.of(Arguments.of(chrome), Arguments.of(firefox));
    }

}

For video recording, you can use WebDriverManager, either with browsers in Docker containers (example) or with a web extension called BrowserWatcher that records only the viewport (example).

import static org.assertj.core.api.Assertions.assertThat;
import java.time.Duration;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.WebDriver;
import io.github.bonigarcia.wdm.WebDriverManager;

class DockerChromeRecordingTest {

    WebDriver driver;
    WebDriverManager wdm;

    @BeforeEach
    void setupTest() {
        wdm = WebDriverManager.chromedriver().browserInDocker()
                .enableRecording();
        driver = wdm.create();
    }

    @Test
    void test() {
        driver.get("https://bonigarcia.dev/selenium-webdriver-java/");
        assertThat(driver.getTitle()).contains("Selenium WebDriver");
    }

    @AfterEach
    void teardown() throws InterruptedException {
        // FIXME: pause for manual browser inspection
        Thread.sleep(Duration.ofSeconds(3).toMillis());
        wdm.quit();
    }

}

import static org.assertj.core.api.Assertions.assertThat;
import java.nio.file.Path;
import java.time.Duration;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;
import io.github.bonigarcia.wdm.WebDriverManager;

class RecordEdgeTest {

    WebDriver driver;
    WebDriverManager wdm;

    @BeforeEach
    void setup() {
        wdm = WebDriverManager.edgedriver().watch();
        driver = wdm.create();
    }

    @AfterEach
    void teardown() {
        driver.quit();
    }

    @Test
    void test() {
        driver.get(
                "https://bonigarcia.dev/selenium-webdriver-java/slow-calculator.html");

        wdm.startRecording();

        // 1 + 3
        driver.findElement(By.xpath("//span[text()='1']")).click();
        driver.findElement(By.xpath("//span[text()='+']")).click();
        driver.findElement(By.xpath("//span[text()='3']")).click();
        driver.findElement(By.xpath("//span[text()='=']")).click();

        // ... should be 4, wait for it
        WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
        wait.until(ExpectedConditions.textToBe(By.className("screen"), "4"));

        wdm.stopRecording();

        Path recordingPath = wdm.getRecordingPath();
        assertThat(recordingPath).exists();
    }

}

Playwright

Playwright is a newer, modern alternative from Microsoft that has quickly gained popularity. Like Selenium, it is open-source, multi-language and cross-browser. We can define Playwright as follows:

For Node.js, Playwright is an end-to-end testing framework.
For Java, Python, and .NET, Playwright is a browser automation library.

The difference is caused by the Playwright Test Runner (@playwright/test), a full-featured testing framework bundled with Playwright in Node.js, so it is only available for JavaScript/TypeScript developers. The Playwright Test runner provides the following features:

Test runner and assertions (similar to JUnit/TestNG)
Built-in fixtures (browser/page/context lifecycle)
Parallel test execution across multiple browsers/devices
Retries mechanism
HTML reporting
Video capture on failure
API testing (built-in request fixture)
Visual comparisons (expect(page).toHaveScreenshot())
Component testing (for React/Vue/Svelte/Angular)

Playwright takes a different architectural approach than Selenium. It maintains its own patched versions of Chromium, Firefox, and WebKit, and internally uses an extended version of the Chrome DevTools Protocol (CDP) together with a custom WebSocket-based protocol to control these browsers uniformly.

This way, for creating complete end-to-end tests with Playwright and Java, we typically use the Playwright API plus other tools, such as a unit testing framework (e.g., JUnit, TestNG), reporting tools, etc. For example:

import static org.assertj.core.api.Assertions.assertThat;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import com.microsoft.playwright.Browser;
import com.microsoft.playwright.Page;
import com.microsoft.playwright.Playwright;

class HelloWorldPlaywrightTest {

    Browser browser;
    Page page;

    @BeforeEach
    void setup() {
        browser = Playwright.create().chromium().launch();
        page = browser.newContext().newPage();
    }

    @Test
    void test() {
        page.navigate("https://bonigarcia.dev/selenium-webdriver-java/");
        String title = page.title();
        assertThat(title).contains("Selenium WebDriver");
    }

    @AfterEach
    void teardown() {
        browser.close();
    }

}

Advanced features

One of the most attractive features of Playwright compared to Selenium is its built-in automatic waiting mechanism. Playwright intelligently waits for elements to be ready before performing actions, such as clicking or typing, removing the need for explicit waits. The following test illustrates this feature:

import static org.assertj.core.api.Assertions.assertThat;
import java.nio.file.Paths;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import com.microsoft.playwright.Browser;
import com.microsoft.playwright.BrowserType;
import com.microsoft.playwright.Page;
import com.microsoft.playwright.Playwright;

class SlowLoginPlaywrightTest {

    Browser browser;
    Page page;

    @BeforeEach
    void setup() {
        browser = Playwright.create().chromium()
                .launch(new BrowserType.LaunchOptions().setHeadless(false));
        page = browser.newContext().newPage();
    }

    @Test
    void test() {
        // Open system under test (SUT)
        page.navigate(
                "https://bonigarcia.dev/selenium-webdriver-java/login-slow.html");

        // Log in
        page.fill("#username", "user");
        page.fill("#password", "user");
        page.click("button[type='submit']");

        // Assert expected text
        String successText = page.textContent("#success");
        assertThat(successText).contains("Login successful");

        // Take screenshot
        page.screenshot(new Page.ScreenshotOptions()
                .setPath(Paths.get("slow-login-playwright.png")));
    }

    @AfterEach
    void teardown() {
        browser.close();
    }

}
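
For contrast, the explicit waits that auto-waiting makes unnecessary usually boil down to polling a condition until it holds or a timeout expires. The following conceptual sketch shows that loop in plain Java. It is not Playwright's or Selenium's actual implementation, and PollingWaitSketch and waitUntil are hypothetical names.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Conceptual sketch of an explicit wait: poll until ready or timed out. */
final class PollingWaitSketch {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true in time, false otherwise.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout,
            Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timeout reached without the condition holding
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

With auto-waiting, a loop of this kind runs inside the library before each action, so the test code only states the action itself.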

Another appealing feature of Playwright is the trace viewer, a powerful debugging tool that lets you replay your test execution step by step. It records snapshots, network activity, console logs, and DOM states during a test run, allowing you to visually inspect what happened at each point and easily identify the cause of failures. For example:

import static org.assertj.core.api.Assertions.assertThat;
import java.nio.file.Paths;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import com.microsoft.playwright.Browser;
import com.microsoft.playwright.BrowserContext;
import com.microsoft.playwright.BrowserType;
import com.microsoft.playwright.Page;
import com.microsoft.playwright.Playwright;
import com.microsoft.playwright.Tracing;

class SlowLoginPlaywrightTest {

    Browser browser;
    BrowserContext context;
    Page page;

    @BeforeEach
    void setup() {
        browser = Playwright.create().chromium()
                .launch(new BrowserType.LaunchOptions().setHeadless(false));
        context = browser.newContext();

        // Start recording a trace (screenshots and DOM snapshots)
        context.tracing().start(new Tracing.StartOptions()
                .setScreenshots(true).setSnapshots(true));
        page = context.newPage();
    }

    @Test
    void test() {
        // Open system under test (SUT)
        page.navigate(
                "https://bonigarcia.dev/selenium-webdriver-java/login-slow.html");

        // Log in
        page.fill("#username", "user");
        page.fill("#password", "user");
        page.click("button[type='submit']");

        // Assert expected text
        String successText = page.textContent("#success");
        assertThat(successText).contains("Login successful");

        // Take screenshot
        page.screenshot(new Page.ScreenshotOptions()
                .setPath(Paths.get("slow-login-playwright.png")));
    }

    @AfterEach
    void teardown() {
        // Save the trace for the trace viewer (the file name is illustrative)
        context.tracing().stop(new Tracing.StopOptions()
                .setPath(Paths.get("slow-login-trace.zip")));
        browser.close();
    }

}

Video recording is another built-in feature in Playwright for Java, for example, as follows:

import static org.assertj.core.api.Assertions.assertThat;
import java.nio.file.Paths;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import com.microsoft.playwright.Browser;
import com.microsoft.playwright.BrowserType;
import com.microsoft.playwright.Page;
import com.microsoft.playwright.Playwright;

class RecordingSlowLoginPlaywrightTest {

    Browser browser;
    Page page;

    @BeforeEach
    void setup() {
        browser = Playwright.create().chromium()
                .launch(new BrowserType.LaunchOptions().setHeadless(false));
        Browser.NewContextOptions options = new Browser.NewContextOptions()
                .setRecordVideoDir(Paths.get("."));
        page = browser.newContext(options).newPage();
    }

    @Test
    void test() {
        // Open system under test (SUT)
        page.navigate(
                "https://bonigarcia.dev/selenium-webdriver-java/login-slow.html");

        // Log in
        page.fill("#username", "user");
        page.fill("#password", "user");
        page.click("button[type='submit']");

        // Assert expected text
        String successText = page.textContent("#success");
        assertThat(successText).contains("Login successful");
    }

    @AfterEach
    void teardown() {
        browser.close();
    }

}

Conclusions

Selenium and Playwright cannot be directly compared since they are naturally different. The following table summarizes their key differences:

Finally, we can find both pros and cons in Selenium and Playwright in Java, namely:

WebDriverManager 6: Automated driver management and other helper features for Selenium WebDriver in…

Boni García — Wed, 19 Mar 2025 15:50:52 GMT

WebDriverManager 6: Automated driver management and other helper features for Selenium WebDriver in Java

WebDriverManager was first released on 19 March 2015. At that time, Selenium was in version 2 (i.e., the first release of “Selenium WebDriver”), and it was considered a “non-batteries included” library. That means that, to control a browser (such as Chrome or Firefox) programmatically with Selenium, users first had to manually manage (i.e., download, set up, and maintain) the drivers required by Selenium (e.g., chromedriver for Chrome or geckodriver for Firefox).

I found that manual process suboptimal. I used to think that, in an ideal world for browser automation, the driver management process should be automated. I looked for a solution, but I did not find any tool doing this. At the same time, I was working as a first-year lecturer at the university. I was chosen to give a course on “web programming.” Since Selenium was an important part of my career (I had used Selenium RC in my PhD dissertation and Selenium WebDriver in my research activities), I prepared a lecture about automated web testing using JUnit 4 and Selenium 2. But I didn’t like the driver thing. I wanted my students to get straight to the point and start playing with Selenium without wasting their energy on the driver setup. So, I created WebDriverManager 1.0.0 and recommended that my students use it.

In the following years, and thanks to the power of open source, WebDriverManager started to be used by others. I felt that WebDriverManager was addressing a real need of Selenium developers since the project began to grow, and people contributed to the WebDriverManager repo with pull requests, bug fixes, comments, etc. Moreover, the WebDriverManager concept was ported from Java to other languages, such as webdriver-manager for JavaScript, webdriver-manager for Python, or WebDriverManager.Net for C#. Since then, I have continued maintaining WebDriverManager and incorporating more and more features into it in my leisure time. Nowadays, WebDriverManager is a well-known Java library used by hundreds of thousands of projects, with around 3 million downloads monthly in Maven Central.

WebDriverManager 6

To celebrate the 10th birthday of the project, WebDriverManager 6 was released on 19 March 2025. This new major version ships some important novelties.

Support for docker-selenium

One of the most relevant features shipped in WebDriverManager 5 was the ability to create browsers in Docker containers out of the box. To support this feature, WebDriverManager used the Docker images created and maintained by Aerokube. Unfortunately, these images have not been available since December 2024. Luckily, we have docker-selenium, a project maintained by Selenium that provides Docker images for running Selenium tests in containers. I had always wanted to support docker-selenium in WebDriverManager, but lack of time stopped me. However, the sudden unavailability of Aerokube’s images forced me to migrate to docker-selenium quickly.

Luckily, the WebDriverManager API is the same after this significant change. Appealing features, such as browser recording, noVNC access, or unstable versions (i.e., beta and dev releases), are still supported. Even better, WebDriverManager supports ARM64 Docker images with docker-selenium thanks to the seleniarm images.

Improved browser version discovery

Browser version discovery was a key feature first introduced in WebDriverManager 3. Knowing the local browser version is key to selecting the proper driver version. To that aim, WebDriverManager has an internal knowledge database to discover browser versions using shell commands (e.g., google-chrome --version for Chrome in Linux).
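
As an illustration of the last step of that discovery, the following sketch extracts the major version from the text printed by a command such as google-chrome --version. It is a simplified assumption of how such output could be parsed, not WebDriverManager's actual implementation. The class and method names are hypothetical, and the sample outputs in the usage note are assumed typical formats.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: parse the major browser version from "--version" output (hypothetical helper). */
final class BrowserVersionSketch {

    // First dotted number in the output, e.g. "123.0.6312.86" (group 1 is the major version)
    private static final Pattern VERSION = Pattern.compile("(\\d+)(?:\\.\\d+)+");

    /** Returns the major version found in the command output, if any. */
    static Optional<Integer> majorVersion(String commandOutput) {
        Matcher matcher = VERSION.matcher(commandOutput);
        return matcher.find()
                ? Optional.of(Integer.parseInt(matcher.group(1)))
                : Optional.empty();
    }
}
```

For example, majorVersion("Google Chrome 123.0.6312.86") yields 123, and an output without a dotted version number yields an empty Optional. The major version is the part that typically matters when matching a browser to a compatible driver.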

WebDriverManager used WMIC (Windows Management Instrumentation Command-line) in Windows for browser version discovery. However, Microsoft announced that WMIC is deprecated as of Windows 10, version 21H1, and will be removed in a future version. Therefore, WebDriverManager 6 uses PowerShell commands to discover browser versions. This change is transparent for end users, and it should keep automated driver management working properly in the upcoming years.

Finally, WebDriverManager 6 included a new API method called .browserBinary() to provide a better configuration capability regarding browser version discovery. This method allows the specification of the browser binary path used for driver version discovery.

https://medium.com/media/b0fe0cd821e26087a86c125481458015/href

As usual, this feature can also be configured using Java properties per the different browsers: wdm.chromeBinary, wdm.operaBinary, wdm.edgeBinary, wdm.firefoxBinary, and wdm.chromiumBinary.
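To make the naming pattern of these properties concrete, here is a small stdlib-only sketch that builds the property key for a browser and reads it back. The helper class is hypothetical and only mirrors the wdm.<browser>Binary names listed above; how WebDriverManager consumes the value internally is not shown.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper illustrating the wdm.<browser>Binary property naming.
final class BrowserBinaryProperty {

    /** Builds the property key for a browser name, e.g. "chrome" gives "wdm.chromeBinary". */
    static String keyFor(String browser) {
        return "wdm." + browser + "Binary";
    }

    /** Reads the configured browser binary path, or empty when the property is unset or blank. */
    static Optional<Path> configuredBinary(String browser) {
        String value = System.getProperty(keyFor(browser));
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(Paths.get(value));
    }

    public static void main(String[] args) {
        // Equivalent to passing -Dwdm.chromeBinary=/opt/chrome-beta/chrome on the command line.
        // The path is illustrative only.
        System.setProperty(keyFor("chrome"), "/opt/chrome-beta/chrome");
        System.out.println(configuredBinary("chrome"));
    }
}
```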

Support for snap browsers/drivers

Some browsers, like Firefox and Chromium, are distributed through a packaging and deployment system called snap. Snap packages are self-contained and sandboxed, including all the dependencies required to run the application. For browsers, that means that both the browser and the driver are included in the snap package. As of release 6, WebDriverManager can detect whether the local browser has been installed using snap and, if so, use the proper driver (which should already be installed locally). Again, this feature is transparent for WebDriverManager users. If the browser and driver are available, they are used automatically in a Selenium session managed by WebDriverManager.
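A minimal sketch of how such a detection could work is shown below. It assumes that snap commands are exposed under /snap/bin and that the bundled drivers are published as chromium.chromedriver and firefox.geckodriver; both the paths and the class are assumptions made for illustration, not details taken from the WebDriverManager source code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of snap detection, not the actual WebDriverManager logic.
final class SnapBrowserSketch {

    // Assumption: browsers installed through snap are reachable under /snap.
    private static final Path SNAP_ROOT = Paths.get("/snap");

    // Assumption: driver commands bundled in the same snap package as each browser.
    private static final Map<String, String> BUNDLED_DRIVERS = new HashMap<>();
    static {
        BUNDLED_DRIVERS.put("chromium", "/snap/bin/chromium.chromedriver");
        BUNDLED_DRIVERS.put("firefox", "/snap/bin/firefox.geckodriver");
    }

    /** Returns true when the browser path is located under the snap root. */
    static boolean isSnapInstalled(Path browserPath) {
        return browserPath.normalize().startsWith(SNAP_ROOT);
    }

    /** Returns the driver bundled with a snap browser, or empty for other installations. */
    static Optional<Path> bundledDriver(Path browserPath) {
        if (!isSnapInstalled(browserPath)) {
            return Optional.empty();
        }
        String command = browserPath.getFileName().toString();
        return Optional.ofNullable(BUNDLED_DRIVERS.get(command)).map(path -> Paths.get(path));
    }

    public static void main(String[] args) {
        System.out.println(bundledDriver(Paths.get("/snap/bin/chromium")));
        System.out.println(bundledDriver(Paths.get("/usr/bin/firefox")));
    }
}
```

When the bundled driver is found, there is nothing to download: the resolved path can be used directly for the Selenium session.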

Selenium Manager

But wait. What about Selenium Manager? Is it not the same as WebDriverManager? Let me explain a little of the history of Selenium Manager.

In 2021, the Selenium project published the results of the first official Selenium survey. One of the key findings in this study was that Selenium users wanted “batteries included.” In other words, they want Selenium to manage browsers and drivers, similarly to the features provided by WebDriverManager. For that reason, I joined Sauce Labs as a Staff Software Engineer in the Open Source Program Office from 2022 to 2023 and contributed to the Selenium project, particularly developing Selenium Manager. Selenium Manager has been developed in Rust following the lessons learned from WebDriverManager, implementing the same driver resolution algorithm first designed in WebDriverManager. Selenium Manager is fully integrated into Selenium and is used by all the official language bindings: Java, JavaScript, Python, Ruby, and .NET. As shown in its usage statistics (publicly available through Plausible), Selenium Manager is used daily by hundreds of thousands of unique users worldwide. Below, you can find some differences between WebDriverManager and Selenium Manager.

Is Selenium Manager a replacement for WebDriverManager? For the use case of automated driver management, yes. In other words, if you use WebDriverManager only for driver management, you can safely switch to Selenium Manager.

What are the key differences between WebDriverManager and Selenium Manager? Both projects provide automated driver management (chromedriver, geckodriver, etc.). However, WebDriverManager ships several features that are unavailable in Selenium Manager (e.g., self-managed browsers in Docker containers or custom monitoring features). On the other hand, Selenium Manager provides automated browser management using the browser binary distributions for Windows, macOS, and Linux (e.g., based on Chrome for Testing).

What are the reasons to continue using WebDriverManager? There can be different reasons, such as:

Advanced features. WebDriverManager provides self-managed browsers in Docker containers (now through docker-selenium). This feature allows you to delegate all the browser infrastructure management (including dev and beta browser releases) to WebDriverManager with Docker. Also, it enables the screencasting of Selenium sessions, which can be a game-changing feature for failure analysis (troubleshooting). Besides, WebDriverManager provides custom monitoring features (through BrowserWatcher), such as seamless console log gathering, Content Security Policy (CSP) disabling, or viewport screencasting (i.e., recording only the browser viewport rather than the whole desktop).
Rich configuration. One of the key aspects of WebDriverManager is that all these features can be configured through its API, using Java system properties, and even using environment variables. These capabilities are very convenient for fine-tuning every feature provided by WebDriverManager, such as automated driver management, Docker management, or browser monitoring.
Legacy support. The minimum version for the latest releases of Selenium 4 (i.e., those that have Selenium Manager) is Java 11. If you cannot bump to Java 11 yet (you should, but sometimes it is impossible), you can continue using WebDriverManager since even release 6 is compiled using Java 8.

What’s coming next

I’m devoted to maintaining Selenium Manager and WebDriverManager in the upcoming years. I am now part of the Selenium Technical Leadership Committee (TLC), so my commitment is to continue improving the browser automation experience for Selenium users. At the time of writing, Selenium Manager is still in beta but has already proven stable, so it will be released as stable in the upcoming Selenium 5. Also, in light of its usage numbers, WebDriverManager is still a popular and helpful tool, so I will continue maintaining the project for the Java community.

WebDriverManager 5: Automated driver management and Docker builder for Selenium WebDriver

Boni García — Mon, 13 Sep 2021 12:37:48 GMT

WebDriverManager is an open-source Java library that carries out the management (i.e., download, setup, and maintenance) of the drivers required by Selenium WebDriver (e.g., chromedriver, geckodriver, msedgedriver, etc.) in a fully automated manner.

WebDriverManager enables the development of portable WebDriver tests while reducing the development and maintenance efforts. For instance, the skeleton of a JUnit 5 test using Selenium WebDriver and WebDriverManager is as follows:

https://medium.com/media/fbfaef4c2161a6e116954973ca66a4f9/href

WebDriverManager implements a resolution algorithm for automated driver management. The foundations of this algorithm are:

Browser version discovery.
Driver version match.
Driver and resolution cache.
Export of the driver path as a Java system property.

You can find all the internal details of this resolution algorithm in the paper Automated driver management for Selenium WebDriver, published in the Springer Journal of Empirical Software Engineering in 2021.
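The four foundations listed above can be sketched in a few lines of plain Java. This is a simplified illustration under stated assumptions, not the algorithm as implemented in WebDriverManager: the class name, the injected discovery and lookup functions, and the cache folder layout are all hypothetical, and the download of the driver binary is omitted. The only real name is webdriver.chrome.driver, the system property Selenium reads to locate chromedriver.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical, simplified sketch of the resolution steps (no real download happens).
final class DriverResolutionSketch {

    // Resolution cache: browser major version -> driver version.
    private final Map<String, String> resolutionCache = new HashMap<>();
    private final Supplier<String> browserVersionDiscovery;
    private final Function<String, String> driverVersionLookup;
    private final Path driverCacheFolder;

    DriverResolutionSketch(Supplier<String> browserVersionDiscovery,
                           Function<String, String> driverVersionLookup,
                           Path driverCacheFolder) {
        this.browserVersionDiscovery = browserVersionDiscovery;
        this.driverVersionLookup = driverVersionLookup;
        this.driverCacheFolder = driverCacheFolder;
    }

    Path resolve(String driverName, String exportProperty) {
        // 1. Browser version discovery (e.g., by running a shell command).
        String browserMajor = browserVersionDiscovery.get();
        // 2 and 3. Driver version match, remembered in the resolution cache
        // so the lookup runs only once per browser version.
        String driverVersion = resolutionCache.computeIfAbsent(browserMajor, driverVersionLookup);
        // Driver cache: assumed layout <cache>/<driver>/<version>/<driver>.
        // A real implementation would download the binary here when it is missing.
        Path driverPath = driverCacheFolder.resolve(driverName).resolve(driverVersion).resolve(driverName);
        // 4. Export the driver path as a Java system property.
        System.setProperty(exportProperty, driverPath.toString());
        return driverPath;
    }

    public static void main(String[] args) {
        DriverResolutionSketch sketch = new DriverResolutionSketch(
                () -> "126",                      // pretend the local browser is version 126
                major -> major + ".0.6478.126",   // pretend lookup of the matching driver version
                Paths.get("cache"));
        System.out.println(sketch.resolve("chromedriver", "webdriver.chrome.driver"));
    }
}
```

Injecting the discovery and lookup steps as functions keeps the sketch testable without a browser or network access; the real library performs both steps itself.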

WebDriverManager 5: the next generation

WebDriverManager was first released in 2015. Nowadays, it is a well-known helper library for Selenium WebDriver, used in thousands of projects. WebDriverManager 5 was released in 2021. As usual, this release provides automated driver management for Selenium WebDriver. Moreover, it adds other features aimed at easing the development of Selenium WebDriver tests.

New documentation

The documentation has been completely rewritten in WebDriverManager 5:

https://bonigarcia.dev/webdrivermanager/

Internally, this site is written in AsciiDoc and rendered to HTML, PDF, and EPUB.

Browser finder

As of version 5, WebDriverManager can detect whether a given browser is installed in the local system. To this aim, each manager provides the method getBrowserPath(). This method returns an Optional, which is empty if the browser is not installed in the system and contains the browser path when it is detected. You can use this feature to skip tests using assumptions, for instance, as follows:

https://medium.com/media/a22bbc69928204c80e09fbd1c44640ea/href
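The Optional contract described above can also be illustrated without the library. The following stdlib-only sketch scans a list of candidate locations and returns the first one that exists; the class, the method, and the candidate paths are hypothetical and only mimic the behavior of getBrowserPath().

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch mimicking the Optional-based contract of a browser finder.
final class BrowserFinderSketch {

    /** Returns the first candidate that exists as a regular file, or empty if none does. */
    static Optional<Path> findBrowser(List<Path> candidates) {
        return candidates.stream().filter(Files::isRegularFile).findFirst();
    }

    public static void main(String[] args) {
        // Illustrative candidate locations for Chrome on Linux.
        List<Path> candidates = Arrays.asList(
                Paths.get("/usr/bin/google-chrome"),
                Paths.get("/opt/google/chrome/chrome"));
        Optional<Path> browserPath = findBrowser(candidates);
        if (browserPath.isPresent()) {
            System.out.println("Browser found at " + browserPath.get());
        } else {
            System.out.println("Browser not installed, the test would be skipped");
        }
    }
}
```

In a JUnit 5 test, the same check typically feeds an assumption, so the test is reported as skipped rather than failed when the browser is missing.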

WebDriver builder

WebDriverManager 5 also allows instantiating WebDriver objects (e.g., ChromeDriver, FirefoxDriver, etc.) using the WebDriverManager API. This feature is available using the method create() of each manager. For instance:

https://medium.com/media/beaa816114e2edb33a16ff1cd97ab787/href

Browsers in Docker

Another relevant new feature available in WebDriverManager 5 is the ability to create browsers in Docker containers out of the box. To use it, we need to invoke the method browserInDocker() in conjunction with create() of a given manager. For instance:

https://medium.com/media/7693a671b904b2126b5d7a647fdb8557/href

The Docker images used by WebDriverManager have been created and maintained by Aerokube. Therefore, Chrome (desktop and mobile), Firefox, Edge, Opera, and Safari (WebKit engine) are the available browsers to be executed as Docker containers in WebDriverManager. In addition, we can use the beta and development versions of Chrome and Firefox, thanks to a fork of the Aerokube images maintained by Twilio.

https://medium.com/media/bc6937a4f6ca3c91618b803b977abd83/href

WebDriverManager allows connecting to the remote desktop session by simply invoking the method enableVnc() of a dockerized browser. In addition, we can use the method enableRecording() to record the browser session.

https://medium.com/media/dd7aad46649200489218514539e0f2ef/href

Other usages

WebDriverManager can be used as a CLI tool, Java agent, or Selenium Server. Take a look at the documentation for further details:

https://bonigarcia.dev/webdrivermanager/

Selenium-Jupiter

WebDriverManager is the foundation tool of Selenium-Jupiter, an open-source JUnit 5 extension for developing Selenium WebDriver tests. Thanks to the parameter resolution provided by JUnit 5, the required boilerplate of a WebDriver test is reduced to the minimum.

https://medium.com/media/9bbcd634ee25b1aa655b4c7efd58b495/href

Selenium-Jupiter also provides seamless integration with Docker, recordings, VNC, and more. Check out the documentation for the complete reference and examples:

https://bonigarcia.dev/selenium-jupiter

https://medium.com/media/983e6a1372a2657e5c8cbbc33a33bd4b/href