Skip to main content

action

Overview

Use the action command to perform input actions on one of the virtualizable devices in the web browser.

Unlike high-level commands like scrollIntoView and doubleClick, the Actions API provides more granular control of input devices. The following input sources are available:

  • text input for keyboard devices;
  • control of mouse, pen, or touch device;
  • scroll control for devices with a scroll wheel.

Each chain of action commands must end with a perform command to execute the specified set of actions. This also releases all pressed keys, buttons, etc., on virtual input devices and triggers the necessary events. The release can be skipped by passing the argument true to the perform command. Example:

await browser.action(...).perform(true);
info

Support for this command and specific actions may vary depending on the execution environment. The development progress can be tracked on wpt.fyi. For mobile devices, you can use specialized gesture commands with Appium on iOS and Android.

Keyboard Control

Used when specifying key as the argument for the action command. Example:

await browser.action("key");

Returns a KeyAction object that supports the following actions:

  • down(value: string) — creates a key press action;
  • up(value: string) — creates a key release action;
  • pause(ms: number) — specifies that the input source does nothing for the specified amount of time.

Special Symbols

To use special symbols (Control, Page Up, Shift, etc.), you can use the Key object from the webdriverio package. It contains Unicode representations of all necessary special symbols. Example:

import { Key } from "webdriverio";

await browser.action("key").down(Key.Ctrl).perform();

Usage Examples

import { Key } from "webdriverio";

it("should emit key events using key action commands", async ({ browser }) => {
const elem = await browser.$("input");
await elem.click(); // make the element active

await browser.action("key").down("f").down("o").down("o").up("f").up("o").up("o").perform();

console.log(await elem.getValue()); // returns "foo"

// copy value from input element
await browser.action("key").down(Key.Ctrl).down("c").pause(10).up(Key.Ctrl).up("c").perform();
});

Instead of a series of down/up events, it is better to use the setValue command. The example is purely for demonstrating the principles of the action command.

Pointer Control

Used when specifying pointer as the argument for the action command, and you can also specify the pointer type. Example:

await browser.action("pointer", {
parameters: { pointerType: "mouse" }, // "mouse" is the default value, also available: "pen" or "touch"
});

Returns a PointerAction object that supports the following actions:

  • down(button: 'left' | 'middle' | 'right') — creates an action to press a key;
  • down(params: PointerActionParams) — creates an action to press a key with detailed parameters;
  • move(x: number, y: number) — creates an action to move the pointer by x and y pixels relative to the viewport;
  • move(params: PointerActionMoveParams) — creates an action to move the pointer by x and y pixels from the specified origin. The origin can be defined as the current pointer position, the viewport, or the center of a specific element;
  • up(button: 'left' | 'middle' | 'right') — creates an action to release a key;
  • up(params: PointerActionUpParams) — creates an action to release a key with detailed parameters;
  • cancel() — creates an action that cancels the current pointer position;
  • pause(ms: number) — specifies that the input source does nothing for the specified amount of time.

Detailed information on the parameter types PointerActionParams, PointerActionMoveParams, and PointerActionUpParams can be found in the webdriverio source code.

Usage Examples

it("drag and drop using pointer action command", async ({ browser }) => {
const origin = await browser.$("#source");
const targetOrigin = await browser.$("#target");

await browser
.action("pointer")
.move({ duration: 0, origin, x: 0, y: 0 })
.down({ button: 0 }) // left button
.pause(10)
.move({ duration: 0, origin: targetOrigin })
.up({ button: 0 })
.perform();
});

Scroll Wheel Control

Used when specifying wheel as the argument for the action command. Example:

await browser.action("wheel");

Returns a WheelAction object that supports the following actions:

  • scroll(params: ScrollParams) — scrolls the page to the specified coordinates or origin;
  • pause(ms: number) — specifies that the input source does nothing for the specified amount of time.

Detailed information on the parameter type ScrollParams can be found in the webdriverio source code.

Usage Examples

it("should scroll using wheel action commands", async ({ browser }) => {
console.log(await browser.execute(() => window.scrollY)); // returns 0

await browser
.action("wheel")
.scroll({
deltaX: 0,
deltaY: 500,
duration: 200,
})
.perform();

console.log(await browser.execute(() => window.scrollY)); // returns 500
});

References

We'd like to give credit to the original WebdriverIO docs article, from which we drew some information while writing our version.