Communication Protocols with the Browser: WebDriver vs Chrome DevTools Protocol
Overview
In automated UI testing, the two most widely used protocols for controlling a browser are WebDriver and the Chrome DevTools Protocol (CDP).
WebDriver
WebDriver is a standardized protocol (a W3C Recommendation) exposed as a REST-style HTTP API. Browser vendors implement it in their drivers, such as chromedriver and geckodriver, which act as intermediaries (proxies) between the client sending requests and the browser itself. The intermediary is needed because browsers are implemented very differently internally, and communication between a driver and its browser is not standardized.
Additionally, the WebDriver protocol is used as the base protocol for automating mobile devices on iOS/Android using Appium.
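To make the "REST API" description concrete, here is a minimal sketch of the HTTP requests a client would send to a driver for a short session. The endpoint paths follow the W3C WebDriver specification; the base URL assumes a chromedriver listening on its default port 9515, and the helper names (`newSession`, `navigateTo`, `getTitle`, `deleteSession`) are hypothetical. The functions only describe the requests and do not send them.

```typescript
// Description of one HTTP request from a WebDriver client to a driver.
interface WebDriverRequest {
  method: "GET" | "POST" | "DELETE";
  url: string;
  body?: string; // JSON payload, present only for POST requests
}

// Assumption: a local chromedriver on its default port.
const DRIVER_URL = "http://localhost:9515";

// POST /session: ask the driver to launch a browser and open a session.
function newSession(browserName: string): WebDriverRequest {
  return {
    method: "POST",
    url: `${DRIVER_URL}/session`,
    body: JSON.stringify({ capabilities: { alwaysMatch: { browserName } } }),
  };
}

// POST /session/{id}/url: navigate the current tab to a page.
function navigateTo(sessionId: string, pageUrl: string): WebDriverRequest {
  return {
    method: "POST",
    url: `${DRIVER_URL}/session/${sessionId}/url`,
    body: JSON.stringify({ url: pageUrl }),
  };
}

// GET /session/{id}/title: read the title of the current page.
function getTitle(sessionId: string): WebDriverRequest {
  return { method: "GET", url: `${DRIVER_URL}/session/${sessionId}/title` };
}

// DELETE /session/{id}: close the browser and end the session.
function deleteSession(sessionId: string): WebDriverRequest {
  return { method: "DELETE", url: `${DRIVER_URL}/session/${sessionId}` };
}
```

Every step is a separate request-response pair initiated by the client, which is why the protocol has no built-in way for the browser to push events back.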
Pros
- Official standard supported by all popular browsers;
- Support for automation on mobile devices;
- Can be used both on a local machine and remotely.
Cons
- Out of the box, it does not allow tracking and intercepting network events (mocking requests/responses);
- Limited set of automation capabilities (e.g., no ability to control network bandwidth or CPU performance): the protocol covers only basic user interaction scenarios with the browser;
- No ability to subscribe to browser events (e.g., get information from the browser that a new tab has opened);
- Requires additional setup (installing selenium-standalone, necessary browser drivers, etc.).
CDP
Chrome DevTools Protocol (CDP) is essentially JSON-RPC over WebSockets.
Chrome and Node.js implement APIs for this protocol, which allow communication with DevTools: sending commands, subscribing to events, etc.
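As a rough illustration of what "sending commands" and "subscribing to events" look like on the wire, the sketch below builds outgoing command messages and classifies incoming ones. It assumes the usual CDP framing: a command carries a client-chosen `id`, a `method`, and `params`; a response echoes the `id`; an event has a `method` but no `id`. The helper names `createCommandBuilder` and `classifyIncoming` are hypothetical, and no WebSocket connection is opened here.

```typescript
// A command sent from the client to the browser.
interface CdpCommand {
  id: number; // chosen by the client, echoed back in the matching response
  method: string; // "Domain.method", for example "Page.navigate"
  params: Record<string, unknown>;
}

type IncomingKind = "response" | "event" | "unknown";

// Returns a function that serializes commands with increasing ids.
function createCommandBuilder(): (
  method: string,
  params?: Record<string, unknown>,
) => string {
  let lastId = 0;
  return (method, params = {}) => {
    lastId += 1;
    const command: CdpCommand = { id: lastId, method, params };
    return JSON.stringify(command);
  };
}

// Decides whether a raw message is a reply to a command or a pushed event.
function classifyIncoming(raw: string): IncomingKind {
  const message = JSON.parse(raw) as { id?: unknown; method?: unknown };
  if (typeof message.id === "number") return "response";
  if (typeof message.method === "string") return "event";
  return "unknown";
}
```

A real client would write the strings returned by the builder to the WebSocket and run every received frame through a classifier of this kind to route replies and events.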
This API is used:
- In Chrome DevTools (the developer panel inside the browser) for debugging and inspecting code;
- In IDEs (e.g., VS Code) for similar purposes;
- In various test automation tools: Puppeteer, Cypress, etc.;
- For communication between chromedriver and the Chrome browser (labeled "browser protocol" in the image above).
The protocol's API is logically divided into domains, which contain methods and can send events.
For example, the Runtime domain allows inspecting the state of JavaScript, and the Debugger domain can be used to debug JavaScript.
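The domain structure shows up directly in method names, which take the form `Domain.method`. The sketch below splits such a name and builds two commands from the domains mentioned above: `Runtime.evaluate` (with its `expression` and `returnByValue` parameters) and `Debugger.enable`. The helper names are hypothetical, and the command shape is the same `id`/`method`/`params` framing assumed for CDP in general.

```typescript
interface CdpCommand {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Splits "Runtime.evaluate" into its domain and method parts.
function splitMethod(qualified: string): { domain: string; name: string } {
  const dot = qualified.indexOf(".");
  if (dot <= 0 || dot === qualified.length - 1) {
    throw new Error(`Expected "Domain.method", got "${qualified}"`);
  }
  return { domain: qualified.slice(0, dot), name: qualified.slice(dot + 1) };
}

// Runtime domain: evaluate a JavaScript expression in the page.
function evaluateCommand(id: number, expression: string): CdpCommand {
  return {
    id,
    method: "Runtime.evaluate",
    // returnByValue asks for the result as plain JSON instead of a remote object handle.
    params: { expression, returnByValue: true },
  };
}

// Debugger domain: enable it so the browser starts sending debugger events.
function enableDebuggerCommand(id: number): CdpCommand {
  return { id, method: "Debugger.enable", params: {} };
}
```

Events follow the same naming scheme, so a client can route an incoming `Debugger.paused` event by its domain prefix.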
Pros
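- Provides more automation capabilities than WebDriver, including ones WebDriver lacks out of the box, such as intercepting network traffic, throttling network bandwidth and CPU, and subscribing to browser events;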
- No need to set up selenium-standalone or browser drivers: just having a local Chrome browser is enough.
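Two of the capabilities listed as missing from WebDriver, controlling network bandwidth and CPU performance, map to single CDP commands. The sketch below builds the payloads for `Network.emulateNetworkConditions` and `Emulation.setCPUThrottlingRate`. The helper names and the kilobit-based arguments are assumptions made for readability; the protocol itself expects throughput in bytes per second and latency in milliseconds, and parameter details may vary between protocol versions.

```typescript
interface CdpCommand {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Converts kilobits per second to the bytes per second that CDP expects.
function kbpsToBytesPerSecond(kbps: number): number {
  return (kbps * 1000) / 8;
}

// Network domain: simulate a slow connection.
function throttleNetworkCommand(
  id: number,
  latencyMs: number,
  downloadKbps: number,
  uploadKbps: number,
): CdpCommand {
  return {
    id,
    method: "Network.emulateNetworkConditions",
    params: {
      offline: false,
      latency: latencyMs,
      downloadThroughput: kbpsToBytesPerSecond(downloadKbps),
      uploadThroughput: kbpsToBytesPerSecond(uploadKbps),
    },
  };
}

// Emulation domain: slow the CPU down by a factor (1 means no throttling).
function throttleCpuCommand(id: number, rate: number): CdpCommand {
  if (rate < 1) {
    throw new Error("rate must be >= 1");
  }
  return { id, method: "Emulation.setCPUThrottlingRate", params: { rate } };
}
```

For example, `throttleNetworkCommand(1, 150, 1600, 800)` describes a connection with 150 ms of added latency, and `throttleCpuCommand(2, 4)` asks for a fourfold CPU slowdown.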
Cons
- Supports a limited list of browsers: Chrome, Chromium-based Edge, and Firefox Nightly;
- By default, works only locally (but it is possible to connect to an already running browser on a remote machine).
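Connecting to an already running browser usually starts with discovering its WebSocket endpoint. The sketch below assumes the remote Chrome was started with `--remote-debugging-port`, in which case it serves a small HTTP endpoint at `/json/version` whose JSON body includes a `webSocketDebuggerUrl` field. The helper names are hypothetical, and the functions only build the URL and parse a response body; fetching it is left to the caller.

```typescript
// URL of the discovery endpoint exposed by a browser with remote debugging enabled.
function versionEndpoint(host: string, port: number): string {
  return `http://${host}:${port}/json/version`;
}

// Extracts the WebSocket URL a CDP client should connect to.
function extractWebSocketUrl(versionJson: string): string {
  const info = JSON.parse(versionJson) as { webSocketDebuggerUrl?: unknown };
  if (typeof info.webSocketDebuggerUrl !== "string") {
    throw new Error("Response has no webSocketDebuggerUrl; is remote debugging enabled?");
  }
  return info.webSocketDebuggerUrl;
}
```

A client would request `versionEndpoint(host, port)`, pass the response text to `extractWebSocketUrl`, and open a WebSocket to the returned address. Exposing the debugging port beyond a trusted network is risky, since anyone who can reach it can control the browser.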