Communication Protocols with the Browser: WebDriver vs Chrome DevTools Protocol

Overview

In automated UI testing, the two most widely used protocols for controlling a browser are currently WebDriver and the Chrome DevTools Protocol (CDP).

WebDriver

WebDriver is a W3C-standardized protocol with a REST-style HTTP API. Browser vendors support it through drivers such as chromedriver and geckodriver, which act as intermediaries (proxies) between the client sending requests and the browser itself. The intermediary is needed because browsers are implemented very differently, and the communication between a driver and its browser is not standardized.
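
To make the REST-style nature of the protocol concrete, here is a minimal sketch of how a client might build three WebDriver requests (New Session, Navigate To, Delete Session). It only constructs the requests and does not send them. The driver address `http://localhost:9515` and the helper names are assumptions made for illustration; a real client would also send each request and read the session id from the New Session response.

```python
import json
from typing import Optional
from urllib.request import Request

# Assumption: a driver such as chromedriver is listening at this address.
DRIVER_URL = "http://localhost:9515"


def _json_request(method: str, path: str, body: Optional[dict]) -> Request:
    """Build (but do not send) an HTTP request with an optional JSON body."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json; charset=utf-8"}
    return Request(DRIVER_URL + path, data=data, headers=headers, method=method)


def new_session(browser_name: str) -> Request:
    """New Session: POST /session."""
    body = {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}
    return _json_request("POST", "/session", body)


def navigate_to(session_id: str, url: str) -> Request:
    """Navigate To: POST /session/{session id}/url."""
    return _json_request("POST", f"/session/{session_id}/url", {"url": url})


def delete_session(session_id: str) -> Request:
    """Delete Session: DELETE /session/{session id}."""
    return _json_request("DELETE", f"/session/{session_id}", None)


if __name__ == "__main__":
    # "my-session" is a placeholder; the driver assigns the real id.
    for req in (new_session("chrome"),
                navigate_to("my-session", "https://example.com"),
                delete_session("my-session")):
        print(req.get_method(), req.full_url)
```

Every user action maps to one HTTP request against a session-scoped URL, which is why the same client code can talk to any driver that implements the standard.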

Appium also builds on the WebDriver protocol to automate mobile devices on iOS and Android.

[Figure: WebDriver Protocol]

Pros

  • Official standard supported by all popular browsers;
  • Support for automation on mobile devices;
  • Can be used both on a local machine and remotely.

Cons

  • Out of the box, it does not allow tracking and intercepting network events (mocking requests/responses);
  • Limited set of automation capabilities (e.g., no ability to control network bandwidth or CPU performance): the protocol covers only basic user interaction scenarios with the browser;
  • No ability to subscribe to browser events (e.g., get information from the browser that a new tab has opened);
  • Requires additional setup (installing selenium-standalone, necessary browser drivers, etc.).

CDP

Chrome DevTools Protocol (CDP) is essentially JSON-RPC over WebSockets.

Chrome and Node.js implement APIs for this protocol, which allow communication with DevTools: sending commands, subscribing to events, etc.
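
The message shapes involved can be sketched without a live browser. In CDP, a command carries an `id`, a `method`, and `params`; the reply echoes the same `id` with either `result` or `error`; an event has a `method` and `params` but no `id`. The helper names below are hypothetical, and the sketch leaves out the WebSocket transport entirely.

```python
import itertools
import json
from typing import Optional

# Each command needs a unique id so that the reply can be matched to it.
_next_id = itertools.count(1)


def encode_command(method: str, params: Optional[dict] = None) -> str:
    """Serialize a CDP command as the JSON text sent over the socket."""
    message = {"id": next(_next_id), "method": method, "params": params or {}}
    return json.dumps(message)


def classify(raw: str) -> str:
    """Tell apart the three kinds of incoming message."""
    message = json.loads(raw)
    if "id" in message:
        # A reply to an earlier command: either a result or an error.
        return "error" if "error" in message else "result"
    # No id: the browser is pushing an event the client subscribed to.
    return "event"


if __name__ == "__main__":
    print(encode_command("Page.navigate", {"url": "https://example.com"}))
    print(classify('{"method": "Page.loadEventFired", "params": {"timestamp": 1.0}}'))
```

Because events arrive on the same connection without being requested one by one, a client can react to browser activity as it happens, which a plain request/response API cannot offer.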

This API is used:

  • In Chrome DevTools (the developer panel inside the browser) for debugging and inspecting code;
  • In IDEs (e.g., VS Code) for similar purposes;
  • In various test automation tools: Puppeteer, Cypress, etc.;
  • For communication between chromedriver and the Chrome browser (the "browser protocol" in the image above).

The protocol's API is logically divided into domains, which contain methods and can send events.

For example, the Runtime domain allows inspecting the state of JavaScript, and the Debugger domain can be used to debug JavaScript.
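
Since every method and event name has the form `Domain.name`, a client can route incoming events by that qualified name. The following is a minimal sketch of such a router; `EventRouter` and `split_name` are hypothetical names, while `Runtime.evaluate`, `Debugger.paused`, and `Runtime.consoleAPICalled` are existing CDP identifiers used here only as sample input.

```python
from collections import defaultdict
from typing import Callable, Tuple


def split_name(qualified: str) -> Tuple[str, str]:
    """Split 'Runtime.evaluate' into its domain and member name."""
    domain, _, name = qualified.partition(".")
    return domain, name


class EventRouter:
    """Keeps handlers per event name and calls them when an event arrives."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def on(self, event: str, handler: Callable[[dict], None]) -> None:
        """Subscribe a handler to an event such as 'Debugger.paused'."""
        self._handlers[event].append(handler)

    def dispatch(self, message: dict) -> int:
        """Pass the event params to each handler; return how many ran."""
        handlers = self._handlers.get(message.get("method", ""), [])
        for handler in handlers:
            handler(message.get("params", {}))
        return len(handlers)


if __name__ == "__main__":
    router = EventRouter()
    router.on("Debugger.paused", lambda params: print("paused:", params))
    router.dispatch({"method": "Debugger.paused", "params": {"reason": "other"}})
```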

Pros

Cons

  • Supports a limited list of browsers: Chrome, Chromium-based Edge, and Firefox Nightly;
  • By default, works only locally (but it is possible to connect to an already running browser on a remote machine).
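
Connecting to an already running browser could look roughly like the sketch below. It assumes Chrome was started with `--remote-debugging-port=9222`, in which case its HTTP endpoint `/json/version` returns JSON containing a `webSocketDebuggerUrl` field. The helper names, the sample payload, and the address `10.0.0.5` are illustrative assumptions. Rewriting the host is only needed when the reported address is not reachable from the client (for instance behind a port forward).

```python
import json
from urllib.parse import urlsplit, urlunsplit


def version_endpoint(host: str, port: int = 9222) -> str:
    """URL of the discovery endpoint exposed by the debugging port."""
    return f"http://{host}:{port}/json/version"


def browser_ws_url(version_payload: str, host: str) -> str:
    """Extract the WebSocket URL and point it at the given host."""
    ws_url = json.loads(version_payload)["webSocketDebuggerUrl"]
    parts = urlsplit(ws_url)
    # Keep the reported port, swap only the host name.
    netloc = f"{host}:{parts.port}" if parts.port else host
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Illustrative payload, not captured from a real browser.
    sample = '{"webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/browser/abc"}'
    print(version_endpoint("10.0.0.5"))
    print(browser_ws_url(sample, "10.0.0.5"))
```

The resulting URL is what a CDP client would open as its WebSocket connection.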