Introduction

1.1.1 Why Read This Book?

This may sound a little odd…

While WebdriverIO is in this book’s title, this isn’t a book about WebdriverIO.

Yes, it does cover the popular test automation framework in-depth. More importantly though, this book is about teaching you how to effectively use UI Test Automation to validate Web App functionality.

The lessons here focus less on how specific WebdriverIO commands work and more on how specific approaches to testing are beneficial or harmful.

And yet, while any test framework would work for this, I did choose WebdriverIO for a specific reason: It’s a mature framework that allows us to spend less time on code and more time on important testing concepts.

Why do I start off the book saying all of this?

Well, over the past decade, website complexity has grown substantially, requiring much more effort when it comes to testing. While test automation is certainly not a recent development, it has grown in popularity lately due to the increased complexity of the sites we build.

In fact, many organizations have full teams dedicated just to testing their site. Quality Assurance (QA) is an important role for any company that wants to take its technology seriously.

However, having humans manually run through test scripts is a time-consuming task.

By automating our tests, we hope to shift the workload from manual labor to speedy CPUs. Without humans and their need to sleep and eat, we can theoretically test our sites on-demand, around the clock.

Wouldn’t it be ideal to have a test suite so effective that your QA team focused solely on keeping it up-to-date?

Unfortunately, that’s not the reality.

I’ve seen, heard of, and been part of many teams that set out after this ideal, only to realize months later that all the effort has provided them little benefit. Yes, they have test automation in place, but it’s constantly breaking and causing endless headaches. It seems they either have tests that run well but don’t validate much or have tests that check for everything but report false errors.

In 2017, I spent the year recording screencasts covering WebdriverIO in-depth. While I covered the details of the framework quite well (or so I was told), I was left with a nagging question: “Is knowing the tool valuable enough?”

Is knowing how to mix colors and hold a paintbrush enough to be able to paint beautiful art?

Does knowing what the ‘addValue’ command does in WebdriverIO teach you enough to write tests that are effective?

This time around, I’m focusing on that second part. Yes, it’s important to cover the details of commands and code. More importantly though, you need to see how you can combine all that technology to create a test suite that provides actual value.

In this book, I cover not just what WebdriverIO can do, but specifically how you’ll be using it day-to-day. I’ve built the examples around real-world scenarios that demonstrate how you would actually set things up. I’m not here to only teach you what WebdriverIO does. I’m here to teach you how to approach problems from various angles and come to the right solution.

It takes a little more work on my part, and extra effort on yours to get started, but the payoff is there, I promise.

With that, let’s get started.

What is User Interface (UI) Test Automation?

I mentioned that test automation isn’t anything new, but I didn’t explain what it is.

Truthfully, test automation is a lot of things. There are a multitude of programs and tools surrounding it, not to mention the various ideologies about automated testing in general (e.g., Test-driven Development vs. Behavior-driven Development).

Aside from that, there are also different types of tests. In this book, we’re focusing solely on UI automation, but there’s also unit, integration, performance, accessibility, usability, you-name-it testing.

All of these are important to know about, and can provide as much or more value than UI testing. So why focus on UI testing?

Well, you don’t really have an option. You can’t skip accessibility testing if you want to have a site that’s accessible. Equally, you can’t skip UI testing if you want to have a site that’s not broken.

Truthfully, we’re doing tons of UI testing. Every time someone loads up a website, they’re testing the UI. When they click that button, does it make that thing happen? When I use my mobile device, does the site fit on my small screen?

If you have a site in production, every user is testing your UI. Hopefully for them (and you), they’re not the first ones to try it out.

UI testing is constantly being done, just in a very labor-intensive way. Front-end developers working on the code will spend half of their day in the browser manually testing if their changes worked. If they’re testing in an older browser, it will be 3/4ths of their day.

UI testing is the simple (hah) task of validating that the code written for the browser actually works in that browser, along with all the other components that went into that webpage.

UI test automation is a way to convert those manual mouse clicks and keystrokes into coded scripts that we can run on a regular basis.

Let’s Talk Benefits

I’ve alluded to this before, but it’s worth repeating. Here are some of the many real-world benefits of automated UI testing:

Jamie, your ace front-end developer, just finished their latest task and is itching to release it. While they were careful to validate that the new functionality works, unfortunately they forgot to test whether that new code broke the zip code widget on the contact page. Lucky for you, the UI automation test caught the error, and a fix was in place before the end of the day.
Taylor is an awesome full-stack developer. They’ve got everything about their coding environment fine-tuned so that they can focus solely on pumping out fixes and new features… except for one thing: Taylor always forgets to check their code on slower, outdated computers. That’s okay though, as our UI tests are configured to run on a variety of setups, and it just so happens that they caught that issue when running inside a Windows 8 environment.
Casey is a great product manager, but sometimes forgets to clarify the small details. This means developers can make the wrong assumption during development, which is only discovered by Casey during product demos. By adding automated tests to the mix, developers and product managers are pushed to have regular conversations on the desired behavior of certain functionality, resulting in fewer surprises later on.
Avery is a superb QA tester. With great attention to detail, Avery always thinks of unique ways in which the website would be seen. Unfortunately, manually testing these edge cases takes a fair amount of time to set up. By adding automated scripts, Avery is able to programmatically generate the needed data for their tests and run those tests on a regular basis.

These are just a few ways test automation can help. I skipped over many other benefits for the sake of time, but there’s one I omitted on purpose. Many folks claim that UI tests allow you to have fewer developers/testers. While that could be an outcome of automation, I don’t think it’s a valuable argument.

In all the scenarios above, having an extra human around wouldn’t have necessarily helped. That’s because humans have blind spots, and many of them are the same. Teams have collective blind spots and that’s difficult to get around. We naturally tend to focus on what’s in front of us, forgetting about realities outside our own influence.

UI testing isn’t about replacing humans, but rather augmenting their abilities. We use automation to shore up our limitations, allowing us to focus on our strengths.

Developers waste their time testing on hundreds of different devices, when they could instead let a computer do that part of the work. Your QA team wastes its time running through the same test scenario week after week, when instead they could be thinking of new and undiscovered test cases.

UI automation isn’t about replacing humans with machines, but rather giving us more freedom to work better than machines.

And let’s not forget, UI testing requires a fair amount of work. It’s not magic (although it’s okay to convince management it is). This brings me to an important consideration…

There Are Always Drawbacks

How long do you think it takes to get a set of automated tests up and running? A week… maybe two?

Sure, if all you want to do is test that a page loads properly.

But websites are incredibly complex — the number of features we jam on a page grows each day.

Consider a “simple” homepage. Here are some things you’ll want to test on it:

Do all the parts look right on a laptop and desktop computer?
Do all the parts look right on a tablet?
Do all the parts look right on a phone?
Does the site navigation work?
Does the “need help” chatbox pop up after five seconds?
Does the autocomplete in the site searchbar work?
Does the hidden menu show after you click the menu icon?
Does the carousel on the homepage rotate correctly?

Okay, I’ll stop there. Hopefully you get my point that there’s a lot to test even on a single page. A basic script that loads a page doesn’t provide much reassurance.

So if you want to test all your functionality on all your pages, you’re going to have to write a lot of code.

Covering all of that ground takes time. Time that could be spent on other, possibly more important, tasks.

And as you’re writing test after test, the website you’re testing is going to keep changing. New features and fixes will be continuously introduced. The same feature will be tweaked again and again, causing your once-solid test suite to mysteriously start failing.

When the glorious day comes and you’re “done” writing tests, you still have to maintain them. Now, there are ways to write tests to make them more maintainable, and we’ll cover that throughout the examples, but there’s no such thing as a future-proof test. You’re always going to need to update it.

There’s one more important point to consider.

While you’ll breeze through some parts of test writing, expect to sink 80% of your time figuring out how to test that special 20% of your site’s functionality. There are many areas of writing tests that aren’t trivial. It’s not as simple as calling the command to fill out a textbox and click the submit button.

You’ll need to work with databases, integrate with third-party services and see if you can get commands to work in browsers with poor automation support. You’re also going to run into instances where the way the website was coded just doesn’t jibe with test automation.

Animations are a good example of that. How do you test that an animation works? You can fairly easily check the properties before and after an animation, but what about everything in between those two states?

Honestly, you’d need to do a screen recording of every frame of the animation, and compare that screen recording to a previous run to see if they match. I don’t know about you, but that doesn’t sound easy to me.

Simpler Sites for UI Testing

If you find yourself facing a complex site that would require major work to test properly, there are other options to gain some value without too much effort.

Instead of testing a fully working site, you may want to create a variation of your site that represents portions of the real thing. For example, pattern libraries are a popular option out there, especially for larger websites.

In case you’re not familiar with them, pattern libraries are essentially living demos of the components that make up a website’s interface. Mailchimp has a public pattern library if you’d like to see one in action (https://ux.mailchimp.com/patterns).

For instance, a pattern library may contain standalone versions of the following:

Site layout components like the main navigation and site footer
Components used across multiple pages, like buttons, form inputs, and tabs
Style guidelines for simple page elements like links, headings, and lists

Pattern libraries are very helpful for teams, as they document how the website should look and act. If you’re tasked with adding a sortable table to your company’s site, having a pattern library with a living example of one makes the job simple.

They’re also useful for testers, as it gives us testable examples of individual components isolated from the complexity of the full website.

Skipping Automation is Sometimes the Best Option

Hopefully I haven’t scared you away from UI test automation entirely. I only wanted to get you thinking about the whole picture.

And that picture should include saying, “Maybe we shouldn’t write test automation for that… at least not yet.”

Let’s consider a few things…

New Features Aren’t User Tested

Business is booming and your team is tasked with a brand-new feature idea that management thinks the customer will love. While it’s tempting to say, “Yes, and let’s write tests to go side-by-side with this new feature,” it might not be the right time.

This brand-new feature has never seen the light of day, and the second a customer sees it there’s going to be something they don’t like about it. If a decision is made to rework the concept based on customer feedback, all those tests you’ve written become useless.

There’s no point in writing a test if you haven’t user-tested your site. So before you spend time writing assertions, ensure the assumptions about the user interactions are initially put to the test.

Time Writing Tests Takes Away from Writing Features

You’re not paid to write tests; tests only serve the application they’re testing. If an app is useless, tests won’t help.

If you’re working on a side project for a tool that no one uses, spending time writing tests takes away from time spent on more important tasks, like getting people to use your work.

Users don’t care whether you have good unit tests. There’s no difference between an unused tool and an unused unit tested tool.

Let yourself have untested code. Worry about that problem when it actually becomes one.

Tests Are Only Valuable When You Use Them

Don’t write more tests when you’re not using the ones you already have.

If you have 500 UI tests, but never put in the time to integrate them in your build and deployment process, you have 500 useless tests. Writing 500 more won’t help.

Your tests should run on every code push. They should run before every deploy. Every developer on the team should see that the tests passed or failed.

If that’s not true, you shouldn’t be writing more tests, you should be taking advantage of the tests you already have.

Parts of the Site Might be Better Tested by People

Remember when I said tests shouldn’t replace people, but rather augment their abilities? Well, in the scenario of testing an animation, we were stuck with a really complex solution. What if we just went with manual visual validation of the effect?

It’s okay to have some parts of your site that are too complex for automation. Grab that low-hanging fruit and leave the stuff higher in the tree for a later time when you have a ladder.

Are Tests Worth It Then?

I’ve outlined the benefits and drawbacks of test automation, including reasons to entirely skip some of it.

So how do you ensure you’re getting more benefits than drawbacks? Focus on these two goals:

Ensure you’re gaining value out of every test you write
Ensure you’re selling that value to those in charge

It’s really easy to get caught up in automation, trying to cover every nook and cranny of the site. But if a feature of your site doesn’t provide much value, how much less valuable would a test for that feature be?

The site login component breaking… that’s bad. The site’s About page having a typo… not such a big deal. Sure, we’d want to fix it, but it (hopefully) won’t cost the company a lot of money.

By focusing on gaining and selling that value, you can keep yourself honest in your test writing, and focus on what’s important: having a site that’s running smoothly and providing value to the user.

1.1.2 Why Use WebdriverIO?

Back in the late 2000s, I learned of a tool called Selenium that the testers on my team were fairly interested in using.

I thought it was a neat idea, but there was one big red flag to me. It required writing the tests in the Java programming language.

I had taken a couple semesters of Java programming in college and actually enjoyed the object-oriented nature of the language. I had to wrap my head around some of the complexities of the language, but overall I found it a useful language to understand.

But thinking about writing automated tests in Java gave me pause. Java is a very verbose language requiring a fair amount of setup and some tedious coding. I just didn’t think test automation was a good fit for it. So I stopped researching the idea in favor of other pending tasks.

Years later, a new tool came out called PhantomJS. It was scripted with JavaScript, fit neatly alongside Node.js tooling, and promised the ability to automate browser usage. That definitely piqued my interest, and I’ll explain why in a minute, but first…

What’s a Node.js?

You may not be familiar with Node.js, so I’ll explain it a little here. If you are familiar, feel free to skip this part.

JavaScript is the coding language of the web. Starting in the mid-1990s, an early version of JavaScript (originally called LiveScript) was included in the Netscape Navigator browser. Microsoft, eager to match and beat the features of Netscape, saw this new language and decided to add their own version (calling it JScript) to Internet Explorer (IE).

As the browser battle continued, JavaScript and JScript continued to grow in popularity among website authors. I recall my first use of JavaScript was to make a “mouse trail” on my very first website [1]. My second use was to float an animation of Ralph Wiggum eating glue across the screen of my “from the local police blotter” page.

It was dumb, but boy was it fun to play around with.

Most JavaScript usage for the next decade revolved around either cheesy browser effects, or useful add-ons like drop-down menus and browser-based form field validation. As a front-end developer, learning JavaScript was an important part of your job, although not a critical part like it is today.

In 2009, Ryan Dahl combined Chrome’s JavaScript engine (called V8) with a few new tricks, and had it run entirely outside the browser [2]. The result was Node.js. While the initial idea was to use JavaScript and Node.js to create servers that could better handle high-traffic sites, developers across the globe saw even more power in the tool.

From 2009 to 2019, Node.js grew tremendously. While development of the tool did stagnate around 2014, a fork of Node.js called io.js kicked those in charge back into gear, and eventually the two tools were combined to create a better future.

And a better future it has become. Node.js has become one of the most popular programming environments out there, and it’s used for everything from servers to development tools, and even desktop applications.

Back to PhantomJS

PhantomJS’s popularity grew as developers realized they now knew how to write code that would automate a browser. Automated testing immediately came to mind, and a sister tool called CasperJS was created to complement PhantomJS.

There was only one problem: PhantomJS was a pared-down, headless browser built on WebKit (the rendering engine Safari and Chrome used at the time). Sure, it was fast and had a lot of features of a normal browser, but it wasn’t the browser that site visitors would be using.

You could write all the test automation you wanted, but it still wouldn’t catch bugs that only occur in Internet Explorer. Remember, at that time, many websites still needed to support the bug-ridden versions of that browser (7, 8 and 9).

So PhantomJS’s popularity as a testing tool was always limited. The benefit of automated testing in that single non-traditional browser just never seemed worth the cost.

Enter WebdriverIO

In 2015 I found out about a tool called WebdriverCSS. It was a Visual Regression Testing tool used to compare two screenshots of a page and see if they’re visually different from each other. I had tried many tools like this in the past, but this one was unique.

WebdriverCSS was actually a plugin for a library called WebdriverIO. WebdriverIO came with all the great features of PhantomJS (being able to automate a browser through a Node.js script), but had the added benefit of supporting Selenium.

Remember Selenium? That tool from the late 2000s that I glossed over because I was a little fearful of Java?

WebdriverIO made Selenium approachable to me, and that was literally life-changing (yes, literally). Since investing myself in WebdriverIO, my actual job duties have shifted from primarily front-end development (writing HTML and CSS) to a focus on front-end testing (writing WebdriverIO test scripts).

There were four selling points that convinced me to use WebdriverIO. But before I list my reasons, how about we hear from some other folks:

“Webdriverio is a comprehensive, well documented project with great coverage of the Selenium/Webdriver/Appium specs, as well as loads of very useful helper abstractions. @christian-bromann has been amazing to work with, providing fantastic support and encouraging a helpful community in general.” - George Crawford, who is now a core contributor on the project

“The framework of choice at Oxford University Press, I have been using webdriverio since 2016 and it has made life so much easier, especially now with v5, hats off to @christian-bromann and his crew who maintain and continualy support it, keep up the good work guys.” - Larry G. - Automation Architect

“Give a QA Engineer a web-automation framework and he might automate some tests. Give him WebdriverIO and he will build a full-fledged, robust automation harness in a matter of days. All jokes aside, WebdriverIO was a blessing in disguise for us @Avira. I’ll never look back to other JS-based automation frameworks!” - Dan Chivescu, QA Lead

And what about my reasons for choosing WebdriverIO?

WebdriverIO is “Front-end Friendly”

Unlike most other Selenium tools out there, WebdriverIO is written entirely in JavaScript. It’s also not restricted to just Selenium, as support for the Chrome Devtools protocol (which we’ll talk about later) was added in Version 5.13 of WebdriverIO. This means you can use WebdriverIO without installing Java or running Selenium.

Like I said, I always thought browser automation meant figuring out how to get some complex Java app running. There was also the Selenium IDE, but writing tests through page recordings reminded me too much of WYSIWYG web editors like Microsoft FrontPage (you’ll need to look that up if you weren’t doing web development in the early 2000s).

Instead, WebdriverIO lets me write in a language I’m familiar with, and integrates with the same testing tools that I use for unit tests (e.g., Mocha).

As a developer, the mental switch from writing the functionality to writing the test code requires minimal effort (since it’s all just JavaScript), and I love that.

The other great thing, and this is more to credit WebDriver than WebdriverIO (there is a difference and we’ll talk about it), is that I can use advanced CSS selectors to find elements.

XPath scares me for no good reason. Something about slashes instead of spaces just chills my bones. But I don’t have to learn XPath.

Using WebdriverIO, I simply pass in my familiar CSS selector and it knows exactly what I’m talking about.

I believe front-end developers should write tests for their own code (both unit and UI), and WebdriverIO makes it incredibly easy.

It Has the Power of Selenium

I always felt held back when writing tests in PhantomJS, knowing that it could never validate functionality in popular, but buggy, browsers like IE.

But because WebdriverIO has built-in support for Selenium, I’m able to run my tests in all sorts of browsers.

Selenium is an incredibly robust platform and an industry leader for running browser automation. WebdriverIO stands on the shoulders of giants by piggy-backing on top of Selenium. All the great things about Selenium are available, without the overhead of writing Java-based tests.

It Strives for Simplicity

The commands you use in your WebdriverIO tests are concise and common sense.

What I mean is that WebdriverIO doesn’t make you write code to connect two parts together that are obviously meant for each other.

For example, if I want to click a button via a normal Selenium script, I have to use two commands. One to get the element and another to click it.

Why? It’s obvious that if I want to click something, I’m going to need to identify it.

WebdriverIO simplifies this with its ‘$’ shorthand: you pass in the element selector, chain the ‘click’ command right onto it, and WebdriverIO converts that into the two Selenium actions needed. That means instead of writing this:

Code snippet

driver.findElement(By.id('submit')).click();

I can just write this:

Code snippet

$('#submit').click();

It’s so much less mind-numbing repetition when writing tests…

Speaking of simple, I love how WebdriverIO integrates with Selenium. Instead of creating its own Selenium implementation, it uses the common REST API that Selenium 2.0 provides.

If you haven’t worked with API endpoints before, this may not make sense. Don’t worry, it’s not necessary to understand. But if you’re interested, here’s how it goes.

WebdriverIO sees that you want to run a command (say “getUrl”). It takes that command and converts it into a request to the Selenium server (it would look like “/session/someSessionIdHere/url”). The Selenium server processes the request and returns the result to WebdriverIO, which then returns the found URL to your code.
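
To make that round trip a little more concrete, here’s a minimal sketch of just the translation step. This is not WebdriverIO’s actual source code; the `commandTable` and `buildRequest` names are made up for illustration, and I’m assuming only the two endpoints shown, which follow the WebDriver protocol.

```javascript
// Hypothetical lookup table: command name -> HTTP method and path template.
// The two paths follow the WebDriver protocol; the table itself is illustrative.
const commandTable = {
  getUrl: { method: 'GET', path: '/session/:sessionId/url' },
  getTitle: { method: 'GET', path: '/session/:sessionId/title' }
};

// Turn a command name into the request a client would send to the server.
function buildRequest(commandName, sessionId) {
  const entry = commandTable[commandName];
  if (!entry) {
    throw new Error(`Unknown command: ${commandName}`);
  }
  return {
    method: entry.method,
    path: entry.path.replace(':sessionId', sessionId)
  };
}

const request = buildRequest('getUrl', 'someSessionIdHere');
console.log(`${request.method} ${request.path}`);
// prints "GET /session/someSessionIdHere/url"
```

A real client would then send that request over HTTP and hand the server’s response back to your test code; the sketch stops short of that on purpose.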

Most of WebdriverIO is made up of these small commands, each living in its own small file. This means that updates are easier, and integration with cloud Selenium services like Sauce Labs or BrowserStack is incredibly simple.

Too many tools out there try to reinvent the wheel. I’m glad WebdriverIO keeps it simple and uses what is already out there. This, in turn, helps me easily understand what’s going on behind the scenes.

It’s Easily Extendable/Scalable

As someone who has spent a considerable portion of their career working for large organizations, it’s important to me that the tools I’m using are easily extendable.

I’ll have custom needs and will want to write my own functionality. WebdriverIO does a great job at this in two ways:

Custom Commands

There are many commands available by default via WebdriverIO, but there are times when you want to write a custom command just for your application.

WebdriverIO makes this really easy. Just call the “addCommand” function, and pass in your custom steps.

Here’s an example from their docs:

Code snippet

browser.addCommand('getUrlAndTitle', function () {
    // `this` refers to the `browser` scope
    return {
        url: this.getUrl(),
        title: this.getTitle()
    };
});

Now, any time I want both the URL and title in my test, I’ve got a single command available to get that data.

Code snippet

browser.url('http://www.github.com');
const result = browser.getUrlAndTitle();
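
If the `this` comment in that example feels a little mysterious, the sketch below shows one way a function like “addCommand” could work. It uses a stand-in object (`fakeBrowser`, a name I invented here) rather than a real WebdriverIO session, the URL and title are hard-coded placeholders, and it isn’t how the library is actually implemented.

```javascript
// Stand-in for a browser session. The real `browser` object talks to a
// browser; this one just returns fixed placeholder values.
const fakeBrowser = {
  getUrl() {
    return 'https://example.com/';
  },
  getTitle() {
    return 'Example Domain';
  },
  // Illustrative version of addCommand: store the function on the object,
  // so that calling it later runs with `this` set to the object itself.
  addCommand(name, fn) {
    this[name] = fn;
  }
};

fakeBrowser.addCommand('getUrlAndTitle', function () {
  // A regular function (not an arrow function), so `this` is fakeBrowser
  return {
    url: this.getUrl(),
    title: this.getTitle()
  };
});

const combined = fakeBrowser.getUrlAndTitle();
console.log(combined.url, combined.title);
```

The takeaway is that a custom command ends up behaving like any built-in one: it hangs off the same object and can call the other commands through `this`.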

Page Objects

With the 4.x release of WebdriverIO, they introduced a new pattern for writing Page Objects. For those unfamiliar with the term, Page Objects are a way of representing interactions with a page or component.

Rather than repeating the same selector across your entire test suite for a common page element, you can write a Page Object to reference that component.

Then, in your tests, request what you need from the Page Object and it handles it for you. This helps your tests be more maintainable and easier to read.

They’re more maintainable because updates to selectors and actions happen in a single file.

When a simple HTML change to the login page breaks half your tests, you don’t have to find every reference to input[id="username"] in your code. You only have to update the Login Page Object and you’re ready to go again.

They’re easier to read because tests become less about the specific implementation of a page and more about what the page does.

For example, say we need to log in to our website for most of our tests. Without Page Objects, all the tests would begin with:

Code snippet

browser.url('login-page');
$('#username').setValue('testuser');
$('#password').setValue('hunter2');
$('#login-btn').click();

With Page Objects, that can become as simple as:

Code snippet

LoginPage.open();
LoginPage.login('testuser', 'hunter2');

No reference to specific selectors. No knowledge of URLs. Just self-documenting steps that read out more like instructions than code.

Now, Page Objects aren’t a new idea that WebdriverIO introduced. But the way they’ve set it up to use plain JavaScript objects is brilliant. There is no external library or custom domain language to understand. It’s just JavaScript and a little bit of prototypal inheritance. (We’ll definitely cover Page Objects in more detail later in this book.)
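
As a preview, here’s a rough sketch of what a Login Page Object could look like. In a real WebdriverIO test run, `browser` and `$` are provided for you; the small stand-ins below exist only so the sketch runs on its own, and the `actions` log is something I invented for illustration. The selectors and URL are the ones from the login example above.

```javascript
// --- Stand-ins so this sketch runs without WebdriverIO (not the real API) ---
const actions = []; // records what the page object asked the "browser" to do
const browser = {
  url: (path) => actions.push(`open ${path}`)
};
const $ = (selector) => ({
  setValue: (value) => actions.push(`type "${value}" into ${selector}`),
  click: () => actions.push(`click ${selector}`)
});

// --- The Page Object: the only place that knows selectors and URLs ---
class LoginPage {
  get username() { return $('#username'); }
  get password() { return $('#password'); }
  get submit() { return $('#login-btn'); }

  open() {
    browser.url('login-page');
  }

  login(username, password) {
    this.username.setValue(username);
    this.password.setValue(password);
    this.submit.click();
  }
}

const loginPage = new LoginPage();

// --- The test reads like instructions, with no selectors in sight ---
loginPage.open();
loginPage.login('testuser', 'hunter2');
console.log(actions.join('\n'));
```

If the login button’s ID ever changes, the `submit` getter is the only line that needs to be touched; the two test steps at the bottom stay exactly as they are.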

Summing It Up

I wouldn’t call myself a real software tester. I’m far too clumsy to be put in charge of ensuring a bug-free launch.

Yet, I can’t help but love what WebdriverIO provides me, and I’m a fan of what’s going on with the project and its future. Hopefully this book helps you feel the same way.

1.1.3 Technical Details

Versions

Because technology changes fast, it’s good to cover what versions of software I used when creating these exercises.

Node.js: v12.16.1
WebdriverIO: 6.3.5
Java 8 (optional)

Git Repository

To download code samples for the main part of the book, visit the official git repo.

Where to check for updates/corrections

I’ve written the book using the most recent versions of these technologies that I could. However, with an ever-changing landscape, updates will need to be made.

You can see a list of changes by visiting the book’s changelog.

Where to find help

I’ve done my best to make the material clear and understandable, but I’m sure to have fallen short in some areas. To get extra help, try one of these three options:

This Book’s GitHub Issues Page
Gitter - WebdriverIO runs an official chat room for folks seeking help with the tool itself.
My personal email - Although I’m usually slow to respond, you can reach out to me at kevin at learnwebdriverio dot com. I’ll do my best to get back to you in a timely manner.

Errata

I’ve worked to ensure that the content of this book is accurate and that the examples actually run. However, I definitely can make mistakes (that’s why I’m a fan of testing after all). Also, technology changes, and what worked when I wrote this doesn’t necessarily work anymore.

If you find errata, please submit an issue on the GitHub repository and I’ll work to get the error resolved.

Technical Knowledge Requirements

What sort of skill level do you need in order to understand the material covered in this book?

While I’ve worked to explain the many concepts introduced through UI testing, I do make the assumption that readers are familiar with the following technologies:

HTML (basic understanding of how HTML structure is composed)
CSS (basic understanding of how CSS selectors work)
JavaScript
Node.js
Terminal/Shell commands

With regard to JavaScript and Node.js, here are some important concepts to understand:

Data types: strings, objects, arrays…
Functions: how to call and create them
Conditionals: if/else, ternary
ES6 updates:
- const & let
- Array functions: map, forEach
- Classes (we will cover this in more detail in the page objects section)

That said, you don’t need to know how to create a Node.js server, build a website or use the latest JavaScript framework.
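
As a quick self-check, here’s a short sketch that touches each of the concepts listed above. None of it is WebdriverIO-specific, and the names (`browsers`, `Greeter`, and so on) are just placeholders I picked for the example. If everything here reads comfortably, you’re in good shape for the rest of the book.

```javascript
// const & let: `const` bindings can't be reassigned, `let` bindings can.
const browsers = ['chrome', 'firefox', 'safari'];
let checked = 0;

// Array functions: `map` builds a new array, `forEach` just loops.
const labels = browsers.map((name) => name.toUpperCase());
browsers.forEach(() => {
  checked += 1;
});

// Conditionals: if/else has a compact cousin, the ternary operator.
const summary = checked === browsers.length ? 'all checked' : 'some skipped';

// Classes: a blueprint with a constructor and methods.
class Greeter {
  constructor(name) {
    this.name = name;
  }

  greet() {
    return `Hello, ${this.name}!`;
  }
}

const greeting = new Greeter('tester').greet();
console.log(labels, checked, summary, greeting);
```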

Where can I freshen up?

While I won’t be covering it in this book, here are a few free resources you might find helpful if you’d like to refresh your knowledge:

1.2 Installation and Configuration

1.2.1 Software Requirements

While WebdriverIO is a Node.js based system, there are a few other tools needed to run the tests. You’ll want:

A recent version of Node.js (8+)
A text-editor (I use Sublime Text 3, but Atom, Webstorm and VSCode are other great options)
A terminal/command line tool (I use iTerm with Oh My Zsh thrown on top)
A Webdriver-compliant browser for testing (Chrome is what we’ll be using)

Optionally, you may also want to install Java 8. This will allow you to run selenium-standalone, which gives you the ability to test on different browsers in the same test run (e.g., both Firefox AND Chrome).

Installing Node.js

There are many great tutorials for how to install Node.js on a variety of systems. A quick search should bring up many results should you need additional help with this installation.

Overall though, there are two common ways to install Node.js.

Install via official site:

Go to nodejs.org and download the release labelled “Recommended For Most Users”. This will start with an even number (e.g., 10.19.0). Be aware that releases starting with odd numbers (e.g., 11.10.0) are not supported long term, so while they may have the latest features, they will stop receiving support and updates after 6 months. For more information on this, have a read through the Node.js release plan.
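
If you’re ever unsure which kind of release you have installed, the even/odd rule is easy to check in code. The sketch below is only an illustration: isEvenReleaseLine is a hypothetical helper, and an even major number only tells you the release line is the long-term kind, not that it is still supported today.

```javascript
// Hypothetical helper: true when the major version number is even
function isEvenReleaseLine(version) {
	// Accepts '10.19.0' as well as 'v10.19.0' (the format printed by `node -v`)
	const major = Number.parseInt(version.replace(/^v/, '').split('.')[0], 10);
	return major % 2 === 0;
}

console.log(isEvenReleaseLine('10.19.0')); // true
console.log(isEvenReleaseLine('11.10.0')); // false

// process.versions.node holds the version of the Node.js binary running this file
console.log(isEvenReleaseLine(process.versions.node));
```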

Install via a ‘version manager’

The main reason for using a version manager is “the future”. In “the future”, you’re probably going to want to update your version of Node.js to a more recent release. While it’s possible to manually uninstall the old version, then install the new one using the official site, it can be a little tedious to do so on a regular basis.

A version manager takes care of this for you. You simply ask for the Node.js version you want, and it does all the grunt work.

Two popular version managers are:

NVM (this is what I use)
N

Installation instructions are on both of those sites, so I won’t copy them over here (plus any copied instructions are likely to be out-of-date by the time you read this.)

Getting Your Terminal Ready

As I mentioned, you’ll need to know the basics of how to use a terminal/command prompt in order to take advantage of all that WebdriverIO has to offer.

All major Operating Systems provide a pre-installed terminal for you to use. These are:

Windows 10: cmd.exe or Powershell
Mac OSX: Terminal
Linux: konsole, gnome-terminal, terminal or xterm

In the terminal of your choice, ensure you have Node.js installed correctly by running node -v in it. This should output the version number of Node.js that you have installed. If you see a message like command not found: node, then something went wrong with your installation and you’ll need to debug it.

A Note for Windows Users

The commands you use in the default Windows terminal (cmd.exe) are different from what I’ll be showing in my code samples.

Some examples:

Instead of using ls to print the contents of a directory, you need to use dir
Windows uses a back slash \ instead of a forward slash / for path commands (e.g., node_modules\.bin\ versus node_modules/.bin/)
Windows users are also required to enter .\ before every function call that involves a path (e.g., dir .\node_modules\.bin\ instead of ls node_modules/.bin/)
The way you define environment variables is different (we’ll go into detail on this later)
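
Inside Node.js scripts, you rarely need to hard-code either slash. As a small illustration, the built-in path module joins path segments with the separator of whichever system the script runs on, and its posix and win32 variants let you preview both styles from any machine. The variable names here are just for the example.

```javascript
const path = require('path');

// Uses the separator of the current operating system
const binFolder = path.join('node_modules', '.bin');

// Force a specific style, regardless of where the script runs
const posixStyle = path.posix.join('node_modules', '.bin');
const windowsStyle = path.win32.join('node_modules', '.bin');

console.log(binFolder);
console.log(posixStyle); // node_modules/.bin
console.log(windowsStyle); // node_modules\.bin
```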

For a more comprehensive list of differences, RedHat has put together a comparison chart.

I’ll try to provide the Windows equivalent the first time I introduce a command. However, if you’d like to stick with the commands that I use throughout the book, consider installing an alternative console. Here are some suggestions:

These terminals use bash-style commands, which is what I use in my examples.

1.2.2 Browsers and “Driving” Them

We normally use browsers by clicking with our mouse and typing with our keyboard. That works well for humans, but doesn’t make sense when trying to write automated tests.

Instead of building some sort of physical robot that can control a mouse and type on a keyboard, we invented software that mimics these actions. Selenium RC was one of the original tools to do this. WebDriver, which was also developed around the same time as Selenium RC, became a popular alternative. In 2009, the two teams combined forces to create Selenium WebDriver.

Over the years, standardization on the Selenium WebDriver commands occurred, and now there is an official W3C spec for WebDriver. The teams behind the browsers we use have also started to implement that spec (e.g., ChromeDriver), allowing the use of WebDriver commands outside of Selenium.

Recently, Chrome has released support for their own protocol called “Chrome DevTools”. WebdriverIO has added support for this protocol through the devtools package. The industry has evolved its tooling over the years and WebdriverIO has kept up, giving you the flexibility to pick what works best.

This is why WebdriverIO has the tagline “Next-gen browser and mobile automation test framework for Node.js,” excluding any specific protocol. While you can use Selenium in your WebdriverIO tests, it’s really just about running commands through any protocol with support. WebdriverIO doesn’t want to box you in to a specific solution, and we appreciate that :)

Now, it’s important not to confuse terms, so to be clear, the following list contains many different things:

WebDriver: A technical specification defining how tools should work.
The Selenium Project: An organization providing tools used for automated testing.
Selenium/Selenium WebDriver: Language-specific bindings for the WebDriver spec that are officially supported by the Selenium project, like the NPM package selenium-webdriver.
Browser Driver: Browser specific implementations of the WebDriver spec (e.g., ChromeDriver, GeckoDriver, etc).
Selenium Server: A proxy server used to assist a variety of browser drivers.
Chrome Devtools Protocol: A protocol that allows for tools to instrument, inspect, debug and profile Chromium, Chrome and other Blink-based browsers. (Project Homepage)
Puppeteer: A Node.js library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol.
WebdriverIO: A test framework written in Node.js that provides bindings for tools like Selenium Server, Chrome DevTools (via Puppeteer) and WebDriver-based browser drivers (e.g., ChromeDriver).

That’s a fair number of terms to keep in mind. I don’t have a great suggestion for how to memorize everything, but maybe just reference this section when you need a good reminder.

What Do We Use?

Right now there are essentially two different approaches to how you can automate a browser. One uses the official W3C web standard (i.e., WebDriver) and the other uses native browser interfaces that some of the browsers expose (e.g., Chrome DevTools).

The WebDriver protocol is the de-facto standard automation technique. It allows you to not only automate all desktop browsers, but also run automation on mobile devices, desktop applications or even Smart TVs. This gives us a tremendous amount of power in being able to run our tests across a variety of systems.

On the other side of things, there are many native browser interfaces to run automation on. In the past, every browser had its own (often not documented) protocol. But these days a lot of browsers, including Chrome, Edge and soon Firefox, come with a somewhat unified interface revolving around the Chrome DevTools Protocol.

While WebDriver provides true cross browser support and allows you to run tests on a large scale in the cloud using vendors like Sauce Labs, native browser interfaces often allow many more automation capabilities like listening and interacting with network or DOM events while often being limited to a single browser only. These native interfaces also run much faster than their WebDriver counterparts, as they’re a bit “closer to the metal”.

We’re going to take a minute to look at how to get set up with a few of these solutions. Throughout the book though, our examples will use the WebDriver protocol, since it’s the most popular standard in use as of this writing. Thankfully though, it’s very easy to switch between protocols in WebdriverIO, so we’re not boxing ourselves in by picking one or the other.

Using the Chrome DevTools Protocol

Starting with Version 6, WebdriverIO now provides support for the Chrome DevTools protocol by default. This means that to run a local test script, you don’t need to download a driver or Selenium. When running your test, WebdriverIO will first check if a browser driver is running and available. If not, it falls back to using Puppeteer (assuming you have a Chromium, Chrome or other Blink-based browser installed). Seeing that Chrome is the most popular browser in use as of the writing of this book, chances are you already have it installed.

To use the Chrome DevTools protocol for your tests, simply ensure you have Chrome (or an equivalent) installed. Everything else is handled by default.

How To Use a ‘Driver’?

If you are interested in running your tests for a browser that isn’t based on the ‘Blink’ engine (or just prefer to stick with the WebDriver standard), you’ll want to use some sort of WebDriver-based browser driver. There are several WebDriver clients available, Selenium Server being the most popular. Let’s walk through setting up one of these clients so that you can start writing tests.

All major browsers have ‘drivers’ that mostly follow the WebDriver spec (unfortunately there are still differences between them).

Here are the drivers for each major browser:

GeckoDriver for Firefox (v48 and above)
ChromeDriver for Chromium
EdgeDriver for Microsoft Edge
SafariDriver for Safari (implemented as a Safari browser extension)
IEDriver for Internet Explorer

To see how your favorite browser driver stacks up in regards to WebDriver support, check out the Web Platform Tests page. This site runs regular tests against clients implementing the WebDriver spec, and provides the results showing how well they support it.

There are drivers available for mobile testing (e.g., Appium), but they won’t be covered in this book.

Installing and Running ChromeDriver

Installation instructions for these clients can be found on their respective websites, but in many cases, you can search on npmjs.org for Node.js-based installation tools.

For example, you can download and install ChromeDriver using the NPM ChromeDriver package.

To install the NPM package, run

Code snippet

npm install -g chromedriver

Terminal Output from installing ChromeDriver Globally

You can then start a ChromeDriver instance by running:

Code snippet

chromedriver

Terminal Output from manually running ChromeDriver

This instance will continue to run until you stop it. To do that, issue an ‘exit’ command by pressing the ctrl+c key combo.

Installing and Running the Selenium Standalone Server

First off, if you’re going to be using this method, you need to ensure you have a recent version of Java installed on your computer. Be sure to take care of that before trying the following. None of the content of this book requires a Selenium instance, so feel free to skip this section.

If you’re looking to run tests on a variety of browsers, you’ll probably want to check out what the Selenium Server project does. It offers a ‘hub’ that allows you to start multiple browser instances and control them all through one single location.

While it is possible to manually download and start a selenium server, there is an NPM tool called “ selenium-standalone ” that makes this much easier.

To install and use it, run the following command in your terminal:

Code snippet

npm i -g selenium-standalone

This will make a global command available called selenium-standalone. With this command, we can do the following:

Install the four supported WebDriver clients (ChromeDriver, FirefoxDriver, IEDriver, Microsoft Edge Driver)
Start a Selenium Server that acts as a proxy to these clients

To run the install, issue this command:

Code snippet

selenium-standalone install

Terminal Output from running selenium-standalone ‘install’ command

You should only need to do this once (although you may need to run it again after driver updates occur).

Then, to start your server, run:

Code snippet

selenium-standalone start

Terminal Output from running selenium-standalone ‘start’ command

This server will run until it receives an exit command (similar to how ChromeDriver works). You can issue that command with the ctrl+c key combo.

We’ll talk more about using Chrome DevTools, the Selenium Standalone Server and ChromeDriver (including services to integrate them with WebdriverIO) in a little bit.

1.2.3 Installing WebdriverIO and Basic Usage

The time has finally come! We’ve laid all the groundwork to understand the nuts and bolts behind UI testing. Now it’s time to do some!

To start off, we’re going to create a new folder for our first example. In a directory of your choice, make a new folder called wdio-standalone:

Code snippet

mkdir wdio-standalone

Why wdio-standalone?

Well, WebdriverIO allows you to use it through two modes. The first, which we’re going through here, is called “standalone” mode. It’s meant as a simple way to use WebdriverIO, and allows you to build wrappers around the tool.

“Testrunner” mode, which we’ll cover in the next section, is a bit more complicated. It provides an entire set of tools and hooks for full-fledged integration testing. I mentioned that standalone mode allows you to build wrappers around it. Well, the testrunner is essentially that.

Right now, just to introduce you to WebdriverIO, we’re going to use the standalone runner. This is only for this exercise though, and we’ll be upgrading to the testrunner soon.

With all that said, let’s ‘move’ our terminal into this wdio-standalone folder:

Code snippet

cd wdio-standalone

(For Windows, it’s the same command for both actions)

Inside our new folder, we’re going to initialize it as an NPM project. This will allow us to save the project dependencies that we’ll be installing through NPM.

To do that, run:

Code snippet

npm init -y

The -y will answer ‘yes’ to all the prompts, giving us a standard NPM project. Feel free to omit the -y if you’d like to specify your project details.

With that out of the way, let’s install WebdriverIO:

Code snippet

npm install webdriverio

Now is a good time to mention that WebdriverIO is split into multiple NPM packages. We’ll be looking at those packages in detail later on, but note that installing webdriverio via the command above does not give you everything.

What it does give us is a Node.js module that we can use inside of a Node.js file. Let’s use that.

First, we’ll create a new file called test.js:

Code snippet

touch test.js

On Windows, that command is:

Code snippet

type nul > test.js

Now we have a file to add our first test to. Go ahead and open that file up in the text editor of your choice.

Next, we’ll copy the example given on the official WebdriverIO website. Throw the following code into your test.js file and save it:

test.js

Code snippet

const { remote } = require('webdriverio');

(async () => {
	const browser = await remote({
		capabilities: {
			browserName: 'chrome'
		}
	});

	await browser.url('https://webdriver.io');

	const title = await browser.getTitle();
	console.log('Title was: ' + title);

	await browser.deleteSession();
})().catch((e) => console.error(e));

Here’s a quick overview of the file:

We load the remote object from the WebdriverIO package.
We wrap our code in an async function so we can use await statements.
We create a new session using remote, saving the reference to a browser object which we use to send commands.
We send a url command, requesting the browser go to the WebdriverIO website.
We then get the title of the page, storing it as a local variable.
The title of the page is logged to the terminal.
The session is ended, since we’re done with our test.
A simple catch statement is added in case anything goes wrong.
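
If the async wrapper is the unfamiliar part, here is the same shape with WebdriverIO taken out. This is a sketch only: fakeRemote is a made-up stand-in that returns a stub object with a few promise-returning methods, so the file runs without a browser or driver.

```javascript
// Made-up stand-in for remote(); it resolves to a stub "browser" object
async function fakeRemote() {
	let currentUrl = null;
	return {
		url: async (address) => {
			currentUrl = address;
		},
		getTitle: async () => 'Stub title for ' + currentUrl,
		deleteSession: async () => {
			currentUrl = null;
		}
	};
}

// Same flow as test.js: create a session, navigate, read the title, clean up
async function main() {
	const browser = await fakeRemote();
	await browser.url('https://webdriver.io');
	const title = await browser.getTitle();
	await browser.deleteSession();
	return title;
}

// Calling an async function returns a promise, so errors are handled with .catch
const result = main();
result
	.then((title) => console.log('Title was: ' + title))
	.catch((e) => console.error(e));
```

Each await pauses main until the promise it is given settles, which is why the commands run in the order they are written.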

Okay, that’s what it does; let’s run it to see it in action.

To do that, we need to have a browser available to run it in. This can be through the built-in support from a Chrome install, through a specific browser driver, or through Selenium server.

In “Browsers and ‘Driving’ Them”, I detailed Chrome DevTools, along with how to install and run ChromeDriver and the selenium-standalone NPM package. Now let’s put that knowledge to use.

Running Through Chrome DevTools

So long as you have Chrome (or another Blink-based browser) installed, there’s really nothing you need to do here for installation/start-up. All you need to do is run your test file through the node CLI. We do that by telling Node.js to execute our test file. That command looks like:

Code snippet

node test.js

After a second, you should see a Chrome browser pop up for a moment, along with output similar to this in your terminal:

Code snippet

2020-07-24T21:03:30.968Z INFO webdriverio: Initiate new session using the devtools protocol
2020-07-24T21:03:30.974Z INFO devtools: Launch Google Chrome with flags: --disable-extensions --disable-background-networking --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-sync --metrics-recording-only --disable-default-apps --mute-audio --no-first-run --disable-hang-monitor --disable-prompt-on-repost --disable-client-side-phishing-detection --password-store=basic --use-mock-keychain --disable-component-extensions-with-background-pages --disable-breakpad --disable-dev-shm-usage --disable-ipc-flooding-protection --disable-renderer-backgrounding --enable-features=NetworkService,NetworkServiceInProcess --disable-features=site-per-process,TranslateUI,BlinkGenPropertyTrees --window-position=0,0 --window-size=1200,900
2020-07-24T21:03:31.535Z INFO devtools: Connect Puppeteer with browser on port 50210
2020-07-24T21:03:32.772Z INFO devtools: COMMAND navigateTo("https://webdriver.io/")
2020-07-24T21:03:35.366Z INFO devtools: RESULT null
2020-07-24T21:03:35.373Z INFO devtools: COMMAND getTitle()
2020-07-24T21:03:35.377Z INFO devtools: RESULT WebdriverIO · Next-gen browser and mobile automation test framework for Node.js
Title was: WebdriverIO · Next-gen browser and mobile automation test framework for Node.js
2020-07-24T21:03:35.380Z INFO devtools: COMMAND deleteSession()
2020-07-24T21:03:35.382Z INFO devtools: RESULT null

Congrats, you’ve just run your first WebdriverIO test!

Notice that the first line says “Initiate new session using the devtools protocol”. That will change depending on which protocol you use.

If you choose to go with the DevTools protocol, support for the various commands does differ from WebDriver. While in general everything is supported the same, there are still differences which can cause hiccups along the way. The book is written with support for the WebDriver protocol, so if you choose to stick with the DevTools protocol, expect some differences.
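
If you’d rather not depend on the automatic fallback, the WebdriverIO options documentation for version 6 describes an automationProtocol setting that makes the choice explicit. The snippet below is a sketch under that assumption; optionsFor is a hypothetical helper, and you should confirm the option name against the documentation for the version you have installed.

```javascript
// Hypothetical helper that builds the options object passed to remote()
function optionsFor(protocol) {
	if (protocol !== 'webdriver' && protocol !== 'devtools') {
		throw new Error('Unknown protocol: ' + protocol);
	}
	return {
		// Assumed option name, based on the WebdriverIO v6 options documentation
		automationProtocol: protocol,
		capabilities: {
			browserName: 'chrome'
		}
	};
}

const devtoolsOptions = optionsFor('devtools');
console.log(devtoolsOptions);

// In test.js this would be used as: const browser = await remote(devtoolsOptions);
```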

Now let’s look at running via ChromeDriver.

Running in ChromeDriver

The basic idea is the same, although we do need to tweak our settings just a little bit.

This is a little technical, but by default, a ChromeDriver server uses port 9515 to listen for commands (e.g., http://localhost:9515).

But by default, WebdriverIO expects the WebDriver server to be running on port 4444.

So, we can either override the WebdriverIO defaults, or tell our ChromeDriver server to use port 4444.

It’s most useful to see how to override the WebdriverIO defaults, so let’s do that next. If you are interested, you can do the latter by running chromedriver --port=4444 when starting the ChromeDriver server.

Back in your test.js file, take a look at lines 4-8:

Code snippet

const browser = await remote({
	capabilities: {
		browserName: 'chrome'
	}
});

What we’re doing here is creating a new remote WebDriver session and telling it that we want to open up the ‘chrome’ browser. We’ll get into capabilities at a later point, so don’t worry too much about it right now.

What we will worry about is how to customize that ‘remote’ session to use port 9515 instead of the default 4444.

Along with customizing the capabilities, there are a number of other options available to us. The official documentation gives the entire list, but there are three options that we’re going to change:

hostname: Host of your driver server. Type: String. Default: localhost

port: Port your driver server is on. Type: Number. Default: 4444

path: Path to driver server endpoint. Type: String. Default: /

We’re only going to be looking at the port option right now, but I wanted to mention all three as they’re related and important to know about (which we’ll see when we get to the Selenium Standalone instructions.)
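
To see how the three options relate, it can help to picture the address they add up to. The function below is not WebdriverIO’s own code; it’s a hypothetical driverEndpoint helper that glues the pieces together the way the log output in this chapter suggests. The defaults it assumes are localhost, port 4444, and a path of /.

```javascript
// Hypothetical helper: combine hostname, port and path into one address
function driverEndpoint(options = {}) {
	const { hostname = 'localhost', port = 4444, path = '/' } = options;
	return 'http://' + hostname + ':' + port + path;
}

console.log(driverEndpoint()); // http://localhost:4444/
console.log(driverEndpoint({ port: 9515 })); // http://localhost:9515/
console.log(driverEndpoint({ path: '/wd/hub' })); // http://localhost:4444/wd/hub
```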

So to use a custom port, we pass it in as an option to the remote function:

Code snippet

const browser = await remote({
	port: 9515,
	capabilities: {
		browserName: 'chrome'
	}
});

Note that it’s a number, not a string (i.e., 9515 versus '9515'). If you try using a string, you will get an error of Error: Expected option "port" to be type of number but was string.
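
This matters most when the port comes from somewhere that only deals in strings, such as an environment variable. As a sketch, assuming a made-up DRIVER_PORT variable and a hypothetical parsePort helper, you could convert the value before handing it to remote:

```javascript
// Hypothetical helper: read a port from a string, falling back to a default
function parsePort(value, fallback = 4444) {
	const port = Number.parseInt(value, 10);
	// parseInt returns NaN for undefined or non-numeric input
	return Number.isNaN(port) ? fallback : port;
}

// DRIVER_PORT is a made-up variable name used only for this example
const port = parsePort(process.env.DRIVER_PORT);

console.log(typeof port); // number
console.log(parsePort('9515')); // 9515
console.log(parsePort(undefined)); // 4444
```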

If you still have your ChromeDriver instance running from before, leave it up and running (you can check http://localhost:9515/ to see if it gives you a response). If not, start an instance in a separate terminal window.

With our WebdriverIO settings updated and ChromeDriver ready to go, we can call our test script again.

Run node test.js one more time and validate that it all works as expected. The output should be similar to before:

Code snippet

2020-07-18T15:29:43.175Z INFO webdriverio: Initiate new session using the webdriver protocol
2020-07-18T15:29:43.183Z INFO webdriver: [POST] http://localhost:9515/session
2020-07-18T15:29:43.183Z INFO webdriver: DATA { capabilities: { alwaysMatch: { browserName: 'chrome' }, firstMatch: [ {} ] }, desiredCapabilities: { browserName: 'chrome' } }
2020-07-18T15:29:46.174Z INFO webdriver: COMMAND navigateTo("https://webdriver.io/")
2020-07-18T15:29:46.175Z INFO webdriver: [POST] http://localhost:9515/session/8c1bfcbb0617b87676343fe9c658fc93/url
2020-07-18T15:29:46.175Z INFO webdriver: DATA { url: 'https://webdriver.io/' }
2020-07-18T15:29:48.210Z INFO webdriver: COMMAND getTitle()
2020-07-18T15:29:48.210Z INFO webdriver: [GET] http://localhost:9515/session/8c1bfcbb0617b87676343fe9c658fc93/title
2020-07-18T15:29:48.524Z INFO webdriver: RESULT WebdriverIO · Next-gen browser and mobile automation test framework for Node.js
Title was: WebdriverIO · Next-gen browser and mobile automation test framework for Node.js
2020-07-18T15:29:48.525Z INFO webdriver: COMMAND deleteSession()
2020-07-18T15:29:48.525Z INFO webdriver: [DELETE] http://localhost:9515/session/8c1bfcbb0617b87676343fe9c658fc93

While there are a few differences from before, the important one is the first line. See how it says it’s using the WebDriver protocol? That’s how we can know things are working as we want.

Running Through Selenium Standalone

Note

Selenium Standalone is not the same thing as WebdriverIO Standalone mode. They simply share the same name to describe their “independent” nature.

If you already have your Selenium server running from before, great! If not, open up a new terminal window and run selenium-standalone start.

Aside from seeing the server running in your terminal, you can check that you have a Selenium instance up and running by visiting the following URL in your browser: http://localhost:4444/wd/hub

You should see a website looking a lot like this:

Preview of Selenium Standalone ‘hub’ page

Note

If you get a 404 error, something went wrong while starting your server, and you’ll need to resolve it before proceeding.

The next thing we need to do is configure WebdriverIO to use the Selenium server.

Unlike ChromeDriver, when you start Selenium, it runs on port 4444 by default. That means we can comment out or remove the port option we had for our ChromeDriver usage.

That said, Selenium waits for requests to come through the /wd/hub URL endpoint/path (hence http://localhost:4444/wd/hub being mentioned before). But if you recall from our options, WebdriverIO doesn’t have that as its default path option (which is just /).

To use Selenium, we’ll need to update that path setting to match where Selenium defaults to:

Code snippet

const browser = await remote({
	path: '/wd/hub',
	capabilities: {
		browserName: 'chrome'
	}
});

Running our test once more with node test.js, you should see similar output:

Code snippet

2020-07-18T15:33:51.348Z INFO webdriverio: Initiate new session using the webdriver protocol
2020-07-18T15:33:51.356Z INFO webdriver: [POST] http://localhost:4444/wd/hub/session
2020-07-18T15:33:51.356Z INFO webdriver: DATA { capabilities: { alwaysMatch: { browserName: 'chrome' }, firstMatch: [ {} ] }, desiredCapabilities: { browserName: 'chrome' } }
2020-07-18T15:33:54.807Z INFO webdriver: COMMAND navigateTo("https://webdriver.io/")
2020-07-18T15:33:54.808Z INFO webdriver: [POST] http://localhost:4444/wd/hub/session/8e71129f6b0b4cb3cd09bd17b06bd6ca/url
2020-07-18T15:33:54.808Z INFO webdriver: DATA { url: 'https://webdriver.io/' }
2020-07-18T15:33:57.275Z INFO webdriver: COMMAND getTitle()
2020-07-18T15:33:57.275Z INFO webdriver: [GET] http://localhost:4444/wd/hub/session/8e71129f6b0b4cb3cd09bd17b06bd6ca/title
2020-07-18T15:33:57.288Z INFO webdriver: RESULT WebdriverIO · Next-gen browser and mobile automation test framework for Node.js
Title was: WebdriverIO · Next-gen browser and mobile automation test framework for Node.js
2020-07-18T15:33:57.289Z INFO webdriver: COMMAND deleteSession()
2020-07-18T15:33:57.289Z INFO webdriver: [DELETE] http://localhost:4444/wd/hub/session/8e71129f6b0b4cb3cd09bd17b06bd6ca

Again, line one shows that we’re using the WebDriver protocol. And notice on line two that it posts to http://localhost:4444/wd/hub/session, using the path we provided.

If instead of all that output you see an error that includes RequestError: connect ECONNREFUSED 127.0.0.1:4444, this means your Selenium server wasn’t running. Start it back up and try again.
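
If you’d like the script itself to point you in the right direction, the catch handler can look for that text. This is a sketch; explainError is a hypothetical helper, and it relies only on the error message containing ECONNREFUSED, as in the message quoted above.

```javascript
// Hypothetical helper: turn a low-level connection error into a friendlier hint
function explainError(error) {
	const message = String(error && error.message);
	if (message.includes('ECONNREFUSED')) {
		return 'Could not reach the driver server. Is it running on the expected port?';
	}
	return message;
}

// Simulate the error described in the text
const refused = new Error('connect ECONNREFUSED 127.0.0.1:4444');
console.log(explainError(refused));

// In test.js, the final line could become:
// })().catch((e) => console.error(explainError(e)));
```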

Leaving It at That

This will be the end of our little test file. We’re not going to be updating it anymore, and will in fact be leaving this whole wdio-standalone folder behind.

Why? Because we’re moving on to a much better way of using WebdriverIO through its test runner. That’s coming up next.

1.2.4 Upgrading to the WDIO Test Runner

There are a plethora of test automation tutorials out there. Many of them cover the basics of installing and running a specific tool, but lack the depth to teach you how to test on a larger scale.

Here are several things that most tutorials omit (mostly for the sake of time):

How to manage settings across files
How to integrate your tests with other tools
How to extend these test tools with new features
How to debug tests when they inevitably fail

You probably wouldn’t think about these questions when first starting out (I know I didn’t). It isn’t until you’ve invested a fair amount of time that you’re going to start feeling some pain from managing all your tests.

Unfortunately, that pain can stop you from growing your test suite. Instead of writing new tests, you’re stuck writing functionality to get everything working together smoothly.

This is where WebdriverIO goes above and beyond many other test tools. Instead of just being a Node.js tool for running test commands, it provides over 25 different packages that fit specific needs.

Looking for a way to organize your code in a test framework like Mocha or Jasmine? That’s already set up for you.

Want to integrate your tests with a third-party cloud testing platform like Sauce Labs or BrowserStack? WebdriverIO has services for that.

Need to output the results of your tests through a reporter like Allure or Sumo Logic? There are packages for that as well.

WebdriverIO does all of this through a tool it calls the ‘test runner.’

Instead of using WebdriverIO in ‘standalone’ mode, we run everything through the test runner, which integrates all the various moving parts together.

Standalone mode (how we wrote our first test) isn’t bad; in fact, the test runner uses the standalone package to run everything. It’s just that the test runner adds a great deal of functionality on top of it.

It also simplifies a lot of the complexity introduced when trying to solve the problems I listed above.

Here are some other benefits of the test runner:

Implicit async/await usage: WebDriver commands are ‘asynchronous’ by nature. Your code needs a way to handle that. In JavaScript, you do that with the async & await keywords. Unfortunately, that concept can be a bit complex to understand. But if you aren’t familiar with the terms async & await, that’s perfectly fine; the test runner lets you not worry about it. This lowers the barrier to entry in writing your first real test (it’s one less thing to learn). We will cover what async/await means at a later point in time.
Extra hooks: Hooks allow you to run code during specific parts of your test flow. Fourteen separate hooks let you integrate code into different parts of your test run, giving you full control on how things execute.
Parallel test runs: UI tests can certainly be slow. By running your tests in parallel (i.e., multiple tests at one time), you’re able to reduce the time needed to run your test suite. The test runner takes care of the details.
Suites: Chances are, on a site large enough, you’re going to have multiple groups of functionality to test. With suites, you’re able to divide sections of your test suite and run them individually. This is especially useful when debugging or developing new tests.
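
To make a few of these benefits concrete, here is a rough sketch of how they tend to show up in a test runner configuration file. The option names (specs, suites, maxInstances and the before hook) follow the WebdriverIO configuration documentation, so double-check them against the version you install. The folder layout and suite names are invented for this example, and we’ll build a real configuration file shortly.

```javascript
const config = {
	// Every test file the runner knows about (hypothetical folder layout)
	specs: ['./test/**/*.js'],

	// Suites: named groups of test files that can be run on their own
	suites: {
		login: ['./test/login/*.js'],
		checkout: ['./test/checkout/*.js']
	},

	// Parallel test runs: how many browser sessions may run at the same time
	maxInstances: 2,

	capabilities: [{ browserName: 'chrome' }],

	// Extra hooks: this one runs before test execution begins
	before: function () {
		console.log('About to run a test file');
	}
};

// The test runner reads the settings from exports.config
exports.config = config;
```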

For all those reasons, and many more, the remaining portion of our examples will be carried out through the test runner. It makes the most sense for the majority of users.

Let’s Get Going

So, how do we go about using the test runner?

Well, let’s set up a new folder for writing our tests in. Outside the wdio-standalone folder you created, make a new one called wdio-testrunner:

Code snippet

mkdir wdio-testrunner && cd wdio-testrunner

From inside that folder, initialize a new NPM project:

Code snippet

npm init -y

That will create a package.json file that we’ll use for saving dependencies (among other things).

The test runner is part of the WDIO CLI package. CLI stands for “Command Line Interface,” which is a way of using a program controlled by text input in the terminal.

We install this interface using NPM:

Code snippet

npm install @wdio/cli

All the official WebdriverIO packages are namespaced under the @wdio scope. This makes it easier to distinguish between officially supported packages and third-party work contributed by the WebdriverIO community (the webdriver and webdriverio packages are officially supported, even though they don’t use the @wdio namespace).

One thing we won’t do is install the webdriverio package as we did before. That’s because this package is included when installing the CLI package. Unless you’re doing very custom work, you likely won’t need to install the main webdriverio package. The CLI tool is a wrapper around the main webdriverio functionality, so it’s your new go-to tool for using the overall WebdriverIO tool suite.

Okay, enough about that. Let’s get on with using this fancy new CLI tool.

Having installed the tool, you now have access to the wdio command in your terminal. This command lets you do a few things:

Run a setup utility to create a common configuration file
Run test suites using said configuration files
Run commands through a REPL, which is an interactive terminal utility for running one-off WebdriverIO commands

We’ll take a look at the third option later, but for now, let’s check out the first two.

Setting Up Our Configuration

In our original example, we defined a few options when initializing our WebdriverIO session. As a reminder, the code looked like:

Code snippet

const browser = await remote({
	// port: 9515, // used for chromedriver
	// path: '/wd/hub', // used for selenium standalone
	capabilities: {
		browserName: 'chrome'
	}
});

All of the information inside the remote function call has to do with configurations on how we want our WebDriver session to be set up. If we were to stick with using standalone mode, we’d need to either duplicate this configuration across all of our test files, or figure out a way to share it through a common file.

Well, the test runner figures you’d probably want to go the route of sharing the configuration, and takes the step to do that work for you. By convention (meaning you can override this), your configurations will be stored inside a file called wdio.conf.js. Using the configurations we set above, that file may look like this:

Code snippet

exports.config = {
	// port: 9515, // used for chromedriver
	// path: '/wd/hub', // used for selenium standalone
	capabilities: {
		browserName: 'chrome'
	}
};

The main difference from before is the exports.config part. That gives us the ability to share our configuration for all our tests.

One thing missing from the previous code sample is that there are a lot more options in the wdio.conf.js file. The test runner provides support for many different utilities (e.g., custom reporters and services), so the file is usually over 200 lines long. That’s a lot of options to set!

To help you with this, a configuration utility was provided through the CLI interface. This utility is run either when you specifically ask for it using wdio config, or if you try to run the wdio command and it can’t find an existing configuration file.

That’s enough talk, let’s give it a shot.

Since @wdio/cli is a command-line utility, it’s set up to install into a special .bin folder inside our node_modules folder. This gives us two ways to run the utility:

Code snippet

./node_modules/.bin/wdio

Code snippet

npx wdio

Either way works, and they both do basically the same thing. Personally, I prefer using the npx method, as it’s a bit easier to type. I’ll be giving my examples using the second way.

If you haven’t done so already, run either of those two commands we just looked at.

Even though we’re not passing in the config flag to that command (although you certainly can), it will notice that we don’t have a wdio.conf.js file available and ask to start the config utility for us (? Error: Could not execute "run" due to missing configuration. Would you like to create one? (y/N)).

Enter y and press enter, and you should see a screen like this:

Terminal Output from running `npx wdio` command with no config file available

This configuration utility will ask you a series of questions in regards to how you want to run your tests.

Stepping Through the Configuration Utility

Important: This section was written using the latest version of the configuration utility. However, this utility is regularly updated and the questions asked are frequently changed or improved based on feedback. This means that what’s written next may not match exactly what you see. I’ll try to keep this section up-to-date, but expect to see some differences.

The first question the utility asks us is Where is your automation backend located?. ‘Automation Backend’ refers to the computer that hosts either your Chrome DevTools-capable browser, or your WebDriver server.

There are many instances where you’d want to customize this, but for us, we’re going to be running it locally using Chrome DevTools, so we’ll stick with the first choice and hit ‘enter’ on ❯ On my local machine.

Next up, it asks Which framework do you want to use?. We have three options: Mocha, Jasmine and Cucumber. All are popular JavaScript test frameworks that have been around for years, but why do we need a test framework? Isn’t that the role of WebdriverIO?

No, not really. One of the great things about WebdriverIO is that it relies on existing solutions instead of trying to invent its own. This gives you a ton of features with proven code that the community has already tested through years of use.

In this case, the features we’re getting are improved organization and better error reporting, along with additional functionality such as “pre” and “post” test hooks. We’ll look at all that later, but back to our options.

While the Jasmine framework is more than capable of meeting our needs, all the examples in this book will be using Mocha. This is for two reasons:

Mocha is a very popular framework, meaning you’ll run into it more often.
Almost all my professional experience has been on projects using Mocha, so I’m much more familiar with how it works and how you can take advantage of it.

With that said, hit that enter button again to choose Mocha.

You should now be prompted with the question Do you want to run WebdriverIO commands synchronous or asynchronous? with the choices sync and async.

If you recall in our first test script, there were async and await keywords scattered throughout the file:

Code snippet

(async () => {
	const browser = await remote(...);
	await browser.url('https://webdriver.io');
	const title = await browser.getTitle();
	await browser.deleteSession();
})().catch((e) => console.error(e));

I also mentioned async/await earlier in this chapter. Why is this all needed?

When commands are issued to an automation server, they don’t happen immediately. There is a variable amount of time between sending the command and the browser actually running it. Usually it only takes a few milliseconds, but in some cases it can take a second or two.

What this means for our script is that we need to wait for that command to finish before moving on to the next step. That’s done through these await keywords (async is needed to tell Node.js that we’ll be using these keywords.)
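To make the waiting concrete, here’s a minimal sketch that stands in for a browser command with a timer. The fakeCommand helper and its delays are made up for this illustration and are not part of WebdriverIO; the only point is that await pauses the script until each simulated command reports back.

```javascript
// Hypothetical stand-in for a browser command: resolves after `delayMs`.
function fakeCommand(name, delayMs) {
	return new Promise((resolve) => {
		setTimeout(() => resolve(name + ' finished'), delayMs);
	});
}

// Each await pauses until the previous command has reported back, so the
// log follows the order the commands were written in, whatever their delays.
async function runInOrder() {
	const log = [];
	log.push(await fakeCommand('url', 30));
	log.push(await fakeCommand('getTitle', 10));
	return log;
}

// Without await, we only hold a pending Promise instead of the result.
function runWithoutWaiting() {
	const result = fakeCommand('getTitle', 10);
	return result instanceof Promise;
}

runInOrder().then((log) => console.log(log));
```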

While helpful, they clutter up the code of our test. Wouldn’t it be nicer if we didn’t have to type all that out, so that it looked more like this:

Code snippet

const browser = remote(...);
browser.url('https://webdriver.io');
const title = browser.getTitle();
browser.deleteSession();

That’s a lot cleaner and easier to scan. Well, that’s a feature the test runner gives you. It modifies the code you’re running to automatically include those await statements, without having to do the work yourself.

Going back to our options, if you wanted to, you could stick with the async style of writing all your await keywords yourself. But I’m lazy, so I’ll take any chance I can to do less work.

I’m going to assume you’re like me and want to go with WebdriverIO handling the async/await nature of our tests (although you probably have better reasons than “I’m lazy”). To do that, choose sync as your option and hit enter. We’ll still have a way to run custom async commands, but for now, we’ve enabled a much simpler way to write our tests.

Next, it asks Where are your test specs located? (./test/specs/**/*.js).

By default, they mark the location as the test/specs folder. However, you can change this, if you prefer. We haven’t created the folders yet, but will in a second, so go with the default by pressing enter. Again, you can customize this if you prefer, but the examples in the book assume you used this default path.

Just so you know, the ** and * in the path are part of what’s called a glob pattern. That’s a convention for describing where multiple files are located.

The ** section says to look in all the subfolders for files, so if we later organize our tests by feature (say we add a login or checkout folder), WebdriverIO will know to look in those subfolders as well as the main test/specs folder.

The * portion in *.js matches any file that ends with a .js extension. So it will match test.js and login.js but not test.txt or just login.
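If you’d like to see those two rules in action, here’s a rough sketch of a glob matcher. It’s a simplified illustration that only understands **/ and *, and it is not how WebdriverIO resolves spec files internally (that job is handled by a full glob library with many more features). The globToRegExp name is my own.

```javascript
// Simplified glob matcher for illustration only; it handles "**/" and "*".
function globToRegExp(pattern) {
	const source = pattern
		.replace(/[.+?^${}()|[\]\\]/g, '\\$&') // escape regex characters (not *)
		.replace(/\*\*\//g, '\u0000')           // temporary placeholder for "**/"
		.replace(/\*/g, '[^/]*')                // "*" stays within one path segment
		.replace(/\u0000/g, '(?:.*/)?');        // "**/" spans zero or more folders
	return new RegExp('^' + source + '$');
}

const specsPattern = globToRegExp('./test/specs/**/*.js');

console.log(specsPattern.test('./test/specs/test.js'));        // true
console.log(specsPattern.test('./test/specs/login/login.js')); // true
console.log(specsPattern.test('./test/specs/test.txt'));       // false
console.log(specsPattern.test('./test/specs/login'));          // false
```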

Okay, back to the questions. After that test file location, it asks Do you want WebdriverIO to autogenerate some test files? (Y/n). We’re going to say no here, as I’ll be walking you through everything from scratch.

After entering n for no and hitting enter, it asks “Are you using a compiler?”. Compilers are useful utilities for improved code support, but are more complicated than I want to get into. We’re going to stick with the default emphatic answer of No!.

Next it asks “Which reporter do you want to use?”. Here we can make multiple choices.

Reporters are utilities that display the results of our tests in a more consumable format. Some print directly to the terminal and others display as websites you visit. Most of the reporters in this list are intended for advanced usage. The two you’d be interested in right now are dot and spec. dot is the simpler of the two, printing out either red or green dots depending on whether the test passed or failed (I’ll let you guess which color means what). spec will print out the name of the test in a hierarchical fashion, giving you a lot more information on how the test ran.

Here are the two reporters showing the results of the same test run:

Terminal Output showing pass/failure reporting in ‘dot’ and ‘spec’ reporters

On top is dot, which is quite succinct. spec is much more verbose, giving us more detail into which specific test failed.

I much prefer spec over dot, as I need to know these details. You’ll probably want it as well, so leave it selected as the default choice.

We’re now asked if we want any services set up for us. Services are useful extensions to the main WebdriverIO functionality. There are many official services, a plethora of third-party ones, and you can even write your own.

There are two services we’d really be interested in installing right now. They are the chromedriver and selenium-standalone service.

We’ve talked about both of these before, so what’s going on with these services? Well, if we want to run ChromeDriver or Selenium Standalone, we have to manually start and stop the servers ourselves. What these two services do is hook into the WebdriverIO startup/shutdown sequence and automatically start/stop the server. This is really helpful long-term, and if you are going to use either ChromeDriver or Selenium Standalone, I highly recommend using the related service.

I’m going to choose ChromeDriver, since it’s a little bit faster than using Selenium Standalone and doesn’t require Java to be installed.

An important note here. You shouldn’t use both services at the same time. While you technically can, since they both provide the same functionality (running a WebDriver server), it doesn’t make sense to have them both running.

Another important note: You can shut down a ChromeDriver or Selenium Standalone instance running in your terminal from the previous chapter. Now that we have our service set up, it will take care of starting/stopping that for us.

To finish off, the last question asks what baseUrl we want. By setting this value, we can shorten our browser.url calls. We’ll talk about it soon, but for now just type https://webdriver.io and hit enter.

That completes the list of questions. Now WebdriverIO takes over and runs the NPM installs for the packages we requested, along with building out our configuration file.

If all went well, you should see this message:

Code snippet

Packages installed successfully, creating configuration file ...
Configuration file was created successfully!
To run your tests, execute:
$ npx wdio run wdio.conf.js

If you made a mistake and need to run the config generator again, you can either delete your wdio.conf.js file and run the command again, or manually request the config builder with npx wdio config. This second option will overwrite any existing wdio.conf.js file.

As many options as we covered just now, the config file actually sets quite a few more. We’ll take a look through this newly generated file next.

1.2.5 Reviewing the Standard WebdriverIO Configuration File

We just used the WebdriverIO test runner to help us create a configuration file to hold all of our settings. Let’s take a look at that file.

Open the newly created wdio.conf.js file in your text editor of choice. The file is saved in JavaScript format, so using a text editor with JavaScript syntax highlighting is a good idea.

In our file, the first thing we’ll check out is the exports.config line. If you’re familiar with Node.js, you’ll recognize the use of the exports global variable.

Because this is just a plain JavaScript file, it gives us the ability to run normal Node.js commands. For example, at the very top of the file, add a console.log command to show that we’re in fact using this file.

Code snippet

console.log('I am inside your configuration file, running your tests!') 
exports.config = { // rest of the file would be below

There are some added benefits to being able to run Node.js code. One common usage is to customize your WebdriverIO configuration based on environmental variables. Alternatively, you can handle that through custom command line arguments. This is a pretty in-depth topic though, so we’ll leave that alone for now.
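Just to give a taste of the idea, here’s a small sketch of reading settings from environment variables with sensible fallbacks. The BASE_URL and LOG_LEVEL variable names and the buildConfig helper are hypothetical choices for this example, not anything WebdriverIO defines.

```javascript
// Sketch: pick settings from environment variables, with fallbacks.
// BASE_URL and LOG_LEVEL are hypothetical names chosen for this example.
function buildConfig(env) {
	return {
		baseUrl: env.BASE_URL || 'https://webdriver.io',
		logLevel: env.LOG_LEVEL || 'info'
	};
}

// In wdio.conf.js, the result could be merged into the exported object, e.g.:
// exports.config = { ...buildConfig(process.env), /* other settings */ };
console.log(buildConfig(process.env));
```

With something like this in place, running BASE_URL=http://localhost:3000 npx wdio on macOS or Linux would point the same tests at a local server without editing the file.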

Let’s check out all the settings defined in our file. There are actually several more options here than just what we answered during our config step and it’s good to know what those are.

Runner

Code snippet

// ====================
// Runner Configuration
// ====================
//
// WebdriverIO allows it to run your tests in arbitrary locations (e.g., locally
// or on a remote machine).
runner: 'local',

The first setting is runner, which is based on our answer to the first question in the configuration. We’re going to skip this option as it’s not much use to us right now.

Specs and Exclude

Code snippet

// ==================
// Specify Test Files
// ==================
// Define which test specs should run. The pattern is relative to the directory
// from which `wdio` was called. Notice that, if you are calling `wdio` from an
// NPM script (see https://docs.npmjs.com/cli/run-script) then the current working
// directory is where your package.json resides, so `wdio` will be called from there.
//
specs: [
	'./test/specs/**/*.js'
],
// Patterns to exclude.
exclude: [
	// 'path/to/excluded/files'
],

Next is ‘specs’, which defines the folder path to our tests. We talked about this during our config setup, but notice that it’s stored as an array. The reason for that is so we can add multiple patterns to search for, or specific files we’d like to run.

Likewise, there’s an exclude option, which allows us to exclude files based on a pattern or specific path.

Both patterns are relative to the directory from which wdio was called.
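As a sketch of what that could look like, here’s a configuration that combines a pattern with one specific file and excludes a folder. The smoke and drafts paths are hypothetical and only there to show the shape of the arrays.

```javascript
exports.config = {
	// ...other settings unchanged
	specs: [
		// A glob pattern matching every .js file under test/specs
		'./test/specs/**/*.js',
		// A single, specific file (hypothetical path)
		'./test/smoke/homepage.js'
	],
	exclude: [
		// Skip anything in a drafts folder (hypothetical path)
		'./test/specs/drafts/**/*.js'
	]
};
```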

Let’s move on to the capabilities section, which groups the information needed by WebdriverIO to initiate our desired WebDriver session. There are two parts to it:

Max Instances

Code snippet

// First, you can define how many instances should be started at the same time. Let's
// say you have 3 different capabilities (Chrome, Firefox, and Safari) and you have
// set maxInstances to 1; wdio will spawn 3 processes. Therefore, if you have 10 spec
// files and you set maxInstances to 10, all spec files will get tested at the same
// time and 30 processes will get spawned. The property handles how many capabilities
// from the same test should run tests.
//
maxInstances: 10,

WebdriverIO can run multiple capabilities at the same time, saving time in your test runs.

Say you have five test files, and each test file takes exactly one minute to run.

Without this ability, the total time it would take your tests to run is five minutes. As each test file finishes, the next file is started.

If you set maxInstances to 3, WebdriverIO will start three separate sessions to run your separate files. The first three tests would all run at the same time, taking one minute to complete overall. Once those three tests finish, WebdriverIO will automatically start running the remaining two tests. In total, the test suite would take two minutes to run (one minute for the first three tests and an additional minute for the final two).

If you set maxInstances to 5, all five files would run at the same time, taking a total of one minute for all your tests to run.
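The arithmetic above can be captured in a tiny helper. This is a simplified model that assumes every file takes the same amount of time and ignores startup overhead; the estimateRunMinutes name is my own.

```javascript
// Simplified model: every file takes the same time, and files run in
// "waves" of at most `maxInstances` at once.
function estimateRunMinutes(fileCount, minutesPerFile, maxInstances) {
	const waves = Math.ceil(fileCount / maxInstances);
	return waves * minutesPerFile;
}

console.log(estimateRunMinutes(5, 1, 1)); // 5
console.log(estimateRunMinutes(5, 1, 3)); // 2
console.log(estimateRunMinutes(5, 1, 5)); // 1
```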

A word of warning: While running tests in parallel can certainly save time, it also requires you to have a powerful computer to process all the commands. If you find your CPU is maxing out when running all your tests, consider lowering this number to reduce the strain on your processor. Maxing out your CPU can definitely lead to failures in your test runner as the computer isn’t able to keep up with all the requests.

Capabilities

Code snippet

// If you have trouble getting all important capabilities together, check out the
// Sauce Labs platform configurator - a great tool to configure your capabilities:
// https://docs.saucelabs.com/reference/platforms-configurator
//
capabilities: [{
	// maxInstances can get overwritten per capability. So if you have an
	// in-house Selenium grid with only 5 firefox instances available you can
	// make sure that not more than 5 instances get started at a time.
	maxInstances: 5,
	//
	browserName: 'chrome',
	acceptInsecureCerts: true
	// If outputDir is provided WebdriverIO can capture driver session logs
	// it is possible to configure which logTypes to include/exclude.
	// excludeDriverLogs: ['*'], // pass '*' to exclude all driver session logs
	// excludeDriverLogs: ['bugreport', 'server'],
}],

Up next is our capabilities setting. Again, the value is an array, which means that we can run our tests in multiple browsers each time we use the test runner.

The configuration here looks similar to what we provided in the capabilities of our first test.js file. Your capabilities setting can get pretty complex, but for now we can leave it as is.

Note that within each capability you can overwrite the specs, exclude, and maxInstances options in order to group specific specs to a specific capability.
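Here’s a sketch of a two-browser setup with per-capability overrides. The chrome-only folder is a hypothetical path, and keep in mind that running Firefox needs a server that can drive it (Selenium Standalone, for example, since ChromeDriver only runs Chrome).

```javascript
exports.config = {
	// ...other settings unchanged
	maxInstances: 10,
	capabilities: [
		{
			browserName: 'chrome',
			maxInstances: 5
		},
		{
			browserName: 'firefox',
			// Overrides the global value for this capability only
			maxInstances: 2,
			// Hypothetical path: skip specs that only apply to Chrome
			exclude: ['./test/specs/chrome-only/**/*.js']
		}
	]
};
```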

Log Level

Code snippet

// ===================
// Test Configurations
// ===================
// Define all options that are relevant for the WebdriverIO instance here
//
// Level of logging verbosity: trace | debug | info | warn | error | silent
logLevel: 'info',

Next in our settings file is the test configurations section, which starts with logLevel. The available levels correspond to common log levels in code output, and the earlier in the list you choose (trace being the most verbose), the more log output you’ll get. This can be helpful for debugging later on. silent, on the other hand, doesn’t log anything when running tests, and can be useful for avoiding a lot of noise in your test run.

It defaults to info, which is a good mix between the two ends of the spectrum. Feel free to change this as you see fit.
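If the ordering of the levels feels abstract, this small sketch models it. It’s a simplified picture of level filtering in general, not WebdriverIO’s actual logger, and the shouldLog helper is a name I made up.

```javascript
// Levels ordered from most to least verbose, as listed in the config comment.
const LEVELS = ['trace', 'debug', 'info', 'warn', 'error', 'silent'];

// Simplified model: a message is printed when its level is at least as
// severe as the configured level. 'silent' prints nothing at all.
function shouldLog(messageLevel, configuredLevel) {
	if (configuredLevel === 'silent') return false;
	return LEVELS.indexOf(messageLevel) >= LEVELS.indexOf(configuredLevel);
}

console.log(shouldLog('debug', 'info'));   // false
console.log(shouldLog('error', 'info'));   // true
console.log(shouldLog('debug', 'trace'));  // true
console.log(shouldLog('error', 'silent')); // false
```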

Bail

Code snippet

// If you only want to run your tests until a specific amount of tests have failed
// use bail (default is 0 - don't bail, run all tests).
bail: 0,

The bail setting is useful if you want the test run to stop once a specific number of tests have failed. This can help save time when debugging tests.
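As a rough mental model (and only that, since the real test runner works across spec files and worker processes), bail behaves something like the loop below. The runWithBail helper and its list of outcomes are invented for this illustration.

```javascript
// Simplified model of bail: stop once `bail` failures have been seen.
// `outcomes` is an array of booleans (true = passed); 0 means never bail.
function runWithBail(outcomes, bail) {
	let failures = 0;
	let executed = 0;
	for (const passed of outcomes) {
		executed += 1;
		if (!passed) failures += 1;
		if (bail > 0 && failures >= bail) break;
	}
	return { executed, failures };
}

const outcomes = [true, false, true, false, true];
console.log(runWithBail(outcomes, 0)); // { executed: 5, failures: 2 }
console.log(runWithBail(outcomes, 1)); // { executed: 2, failures: 1 }
```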

Base URL

Code snippet

// Set a base URL in order to shorten url command calls. If your `url` parameter
// starts with `/`, the base url gets prepended, not including the path portion
// of your baseUrl. If your `url` parameter starts without a scheme or `/`
// (like `some/path`), the base url gets prepended directly.
baseUrl: 'http://localhost',

We’ve already talked about baseUrl, so we’ll jump past that setting as well.

“waitFor” Timeout

Code snippet

// Default timeout for all waitFor* commands.
waitforTimeout: 10000,

waitforTimeout defines how long waitFor commands should wait until erroring out. We haven’t covered these commands yet, so let’s leave this at the default value.
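To give a feel for what “wait until erroring out” means, here’s a generic polling sketch. It’s my own simplified model of a wait-with-timeout helper, not WebdriverIO’s implementation; the waitFor name, the polling interval, and the error message are all assumptions for the example.

```javascript
// Simplified model of a wait-with-timeout helper: poll `condition` every
// `intervalMs` until it returns true or `timeoutMs` has passed.
async function waitFor(condition, timeoutMs, intervalMs) {
	const deadline = Date.now() + timeoutMs;
	while (Date.now() < deadline) {
		if (await condition()) return true;
		await new Promise((resolve) => setTimeout(resolve, intervalMs));
	}
	throw new Error('condition not met within ' + timeoutMs + 'ms');
}

// Usage: the condition becomes true after roughly 50 milliseconds.
const start = Date.now();
waitFor(() => Date.now() - start >= 50, 1000, 10)
	.then(() => console.log('condition met'))
	.catch((err) => console.error(err.message));
```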

Connection Retry Options

Code snippet

// Default timeout in milliseconds for request
// if browser driver or grid doesn't send response
connectionRetryTimeout: 120000,
//
// Default request retries count
connectionRetryCount: 3,

The connectionRetryTimeout and connectionRetryCount options are useful to adjust if you’re having trouble connecting to your Selenium Grid. You should be good to leave these alone though.

Services

Code snippet

// Test runner services
// Services take over a specific job you don't want to take care of. They enhance
// your test setup with almost no effort. Unlike plugins, they don't add new
// commands. Instead, they hook themselves up into the test process.
services: ['chromedriver'],

Here’s an array of services we have chosen to run when using the test runner. These services provide added features. In this instance, we’ve got the ‘chromedriver’ service enabled (assuming you chose to install it). The overall option is an array, so yes, you can have multiple services running at once.

Do note that not all services work together. For example, you wouldn’t want to have the following services array:

Code snippet

services: ['selenium-standalone', 'chromedriver', 'sauce'],

All three of those services do the same thing: host a WebDriver server for you to run your tests against. By having all three services set, they’d have conflicts when starting up. Instead, you’d just choose one and go with it.

Framework

Code snippet

// Framework you want to run your specs with.
// The following are supported: Mocha, Jasmine, and Cucumber
// see also: https://webdriver.io/docs/frameworks.html
//
// Make sure you have the wdio adapter package for the specific framework installed
// before running any tests.
framework: 'mocha',

Next is the framework setting set to mocha, which is what we provided.

Reporters

Code snippet

// Test reporter for stdout.
// The only one supported by default is 'dot'
// see also: https://webdriver.io/docs/dot-reporter.html
reporters: ['spec'],

You’ll also see the reporters option set to ‘spec’. Since this is an array, you can pass in multiple reporters if you’d like.

Mocha Options

Code snippet

// Options to be passed to Mocha.
// See the full list at http://mochajs.org/
mochaOpts: {
	ui: 'bdd',
	timeout: 60000
},

The mochaOpts option is useful for passing configurations for Mocha to use. Here it defines using the bdd ui type, which says we want to write our tests in the Behavior-driven Development style.

A timeout setting of 60000 is provided, which says that after 60000 milliseconds of running (or 60 seconds), our test will time out. Some tests can take up to three minutes to run, so updating this value to 180000 can be useful if your tests take a little longer to complete.

There are more options you can define here for Mocha, and you can get a full list of options to set via the Mocha website.

Hooks

Finally, there are several hooks we can use to add functionality in the middle of the test process. This opens up a fair amount of potential to really add on to the default WebdriverIO test runner functionality, and we’ll look at an example of that later on.
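As a quick preview, a pair of hooks might look something like the sketch below. The log messages are just placeholders, and the exact arguments each hook receives can vary between WebdriverIO versions, so treat the commented-out examples in your own generated file as the reference.

```javascript
exports.config = {
	// ...other settings unchanged

	// Runs in each worker before its tests begin
	before: function (capabilities, specs) {
		console.log('Starting specs:', specs);
	},

	// Runs after each individual Mocha test
	afterTest: function (test) {
		console.log('Finished test:', test.title);
	}
};
```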

The End

That’s the end of our settings file. We’ll be jumping back in this file often to adjust settings and add new functionality.

For now though, let’s get our test file setup so we can try out all these new features.

1.2.6 Running the Example Test Runner Test

Now that we’ve created and reviewed our configuration file, it’s time to write a test to make use of it.

Since this is the first time we’re creating a test, we need to create our folders for the test files to go into.

During setup, we defined our specs path as ./test/specs/**/*.js. But we don’t have a test folder, nor a specs folder inside it.

Let’s create those folders by running the following command from our wdio-testrunner project path:

Code snippet

mkdir -p test/specs

On Windows that’s:

Code snippet

mkdir test\specs

That will create both folders for us. Now we can create our first test file:

Code snippet

touch ./test/specs/example.js

On Windows, touch will not work, so you can instead go to the test/specs/ folder and create a JavaScript file named ‘example.js’

Open that file in a text editor and paste in the example from the official WebdriverIO website:

test/specs/example.js

Code snippet

describe('webdriver.io page', () => {
	it('should have the right title', () => {
		browser.url('https://webdriver.io');
		expect(browser).toHaveTitle(
			'WebdriverIO · Next-gen browser and mobile automation test framework for Node.js'
		);
	});
});

There is one thing we want to change. Where it sets the url, we instead want to use the baseUrl we defined in our configuration file.

So change browser.url('https://webdriver.io'); to be browser.url('./');. When WebdriverIO runs that command, it will see that it isn’t a full website URL, and prepend our baseUrl from the config file.
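To get a feel for how a relative path combines with a base URL, here’s an approximation using Node’s built-in URL class. WebdriverIO applies its own rules (they’re described in the comments of the config file), so this sketch may differ from it in edge cases; the resolveAgainstBase helper is my own name.

```javascript
// Approximation using Node's built-in URL class, not WebdriverIO's own logic.
function resolveAgainstBase(path, baseUrl) {
	return new URL(path, baseUrl).href;
}

console.log(resolveAgainstBase('./', 'https://webdriver.io'));
// https://webdriver.io/
console.log(resolveAgainstBase('/docs/api', 'https://webdriver.io'));
// https://webdriver.io/docs/api
console.log(resolveAgainstBase('https://example.com/', 'https://webdriver.io'));
// https://example.com/
```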

Okay, let’s save the file and try running the test.

Running Through Chrome DevTools

Since the Chrome DevTools protocol only requires a supported browser to be installed, you don’t need to do anything else if you already have one. One thing to check: if you selected chromedriver from the services list during the wdio config step, go to your wdio.conf.js file and comment out the services line (this way chromedriver won’t start up when running your test).

Now, to run the tests, we call the testrunner in our terminal:

Code snippet

npx wdio

The test will run, you should see a browser pop-up, and successful test output that looks something like:

Terminal output running our test via Chrome DevTools protocol

Again, notice the line INFO webdriverio: Initiate new session using the devtools protocol. That lets us know we really are using the DevTools protocol.

Running via Selenium Standalone

Assuming you selected the Selenium Standalone service during our config run-through (and have it enabled in the services section of your wdio.conf.js file, services: ['selenium-standalone']), we don’t have to do anything to run our tests through Selenium Standalone. This is because the service takes care of starting and stopping the Selenium server for us.

The one thing we should check on is making sure the Selenium server we started up earlier isn’t still running. Check your terminal windows to make sure this isn’t the case (remember you can stop the server by sending the ctrl-c key command).

You can also go to http://localhost:4444/wd/hub and make sure it returns a “site can’t be reached” error.

Okay, with that out of the way, let’s run our tests by calling the WDIO testrunner inside our terminal:

Code snippet

npx wdio

Like before, after a few moments, test output will start appearing. You’ll probably see a browser window pop up. Just let it sit there and do its thing (this can make for a neat magic trick for someone unfamiliar with automation).

The test will take a few seconds to run, then shut everything down. Once finished, it’ll print out the final results, something like:

Terminal output running our test via Selenium Standalone

And remember, our selenium-standalone service took care of shutting down the server it started, so we don’t have to do anything.

If you want to try this out, but didn’t select the selenium-standalone service during the config steps, you can manually install the package by running:

Code snippet

npm install @wdio/selenium-standalone-service

Then add it to your services array and give it a shot.

Running with ChromeDriver

ChromeDriver also comes with its own WebdriverIO service, allowing us to automatically start and stop a chromedriver server during our test run.

If you didn’t select chromedriver from the services list during the config, installation and usage is much like selenium-standalone, beginning with our NPM install step.

You’ll need to install two packages:

Code snippet

npm install wdio-chromedriver-service chromedriver

Even though, in an earlier example, we installed the chromedriver package globally, we’re installing it locally so that it’s added to our package.json file.

Information on the packages we install locally automatically gets saved to a dependency section of our package.json file by NPM. This gives us the ability to re-install any missing dependencies later on by running npm install. This is helpful if you want to load your project from scratch.

With the installation done, we need to do two more things. Jump back into your wdio.conf.js file and skip down to the services section. Ensure that the only service defined is chromedriver (e.g., services: ['chromedriver']).

As I’ve mentioned before, you don’t want to have both selenium-standalone and chromedriver in your services at the same time as they would conflict with each other.

With that done, let’s talk about one more thing. You may recall during our first time using ChromeDriver that we had to update the port and path settings in our WebDriver session initialization. Well, to help us out, the ChromeDriver service configures this for us. Before you’d need to set this manually in your config, but now it’s all handled in the service.

With that all set up, let’s run our tests again with our wdio command:

Code snippet

npx wdio

The output should look almost identical to before:

Terminal output running our test via ChromeDriver

Notice the line INFO webdriver: [POST] http://localhost:9515/session. See how the port is set to 9515? That’s the ChromeDriver service managing it all for us!

Whether you use ChromeDriver or Selenium Standalone is entirely up to you. The benefit of ChromeDriver is that you don’t need to have Java installed to run, but the drawback is that you can only run your tests in Chrome.

On the other hand, Selenium Standalone installs and manages all the browser drivers for you, so you’re free to run most any browser installed on your computer.

Personally I use either of them depending on the situation. I usually lean on ChromeDriver when starting out, as it’s usually quicker to run, then switch to Selenium Standalone when I need to expand my browser coverage.

For the rest of the book, we’ll be using ChromeDriver.

Now that we’ve got our test running, let’s take some time to review the actual code.

Reviewing the Example Test Code

Just for a quick comparison, below is the sample code for standalone mode versus running through the test runner:

Standalone:

test.js

Code snippet

const { remote } = require('webdriverio');

(async () => {
	const browser = await remote({
		capabilities: {
			browserName: 'chrome'
		}
	});

	await browser.url('https://webdriver.io');

	const title = await browser.getTitle();
	console.log('Title was: ' + title);

	await browser.deleteSession();
})().catch((e) => console.error(e));

Test Runner:

test/specs/example.js

Code snippet

describe('webdriver.io page', () => {
	it('should have the right title', () => {
		browser.url('https://webdriver.io');
		expect(browser).toHaveTitle(
			'WebdriverIO · Next-gen browser and mobile automation test framework for Node.js'
		);
	});
});

Overall our test runner code is much more succinct. This is due to quite a few things:

No need for remote and setting up our browser object: This is all done via the configuration file and test runner
No need for async and await keywords: By running our tests in sync mode, we no longer need to add all that extra code.
No need to delete the session: The test runner manages both creating and ending the sessions, so we don’t manually have to do this.

Those are all the things that have been removed, but there are a couple new bits added as well.

There are these describe and it function calls. They are part of the Mocha framework, and help organize our tests into individual test cases. This is helpful for overall readability, plus we can use them to run or exclude certain cases from our test runs.

describe is used to group sets of tests by the feature they are testing.
it defines a specific test to run.

There are usually multiple it functions nested inside each describe, and sometimes describe functions are nested inside each other for a better defined test hierarchy.
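To see how that nesting turns into a hierarchy, here’s a toy stand-in for describe and it that simply records the full title of each test. This is not how Mocha is implemented; it’s a minimal sketch you can run with plain node to watch the structure build up. The test names are invented for the example.

```javascript
// Toy stand-ins for Mocha's describe/it, for illustration only.
const titleStack = [];
const collectedTests = [];

function describe(title, body) {
	titleStack.push(title);
	body(); // run the body so nested describe/it calls register themselves
	titleStack.pop();
}

function it(title, body) {
	// Record the test's full path through the describe hierarchy
	collectedTests.push([...titleStack, title].join(' > '));
}

describe('Login Page', function () {
	describe('with valid credentials', function () {
		it('should log the user in', function () {});
	});
	describe('with invalid credentials', function () {
		it('should show an error message', function () {});
		it('should keep the user on the login page', function () {});
	});
});

console.log(collectedTests.join('\n'));
```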

As you can see in our file, describe is called as a function, passing in two parameters.

Code snippet

describe('name of test section', function () { });

The first parameter is a name for the feature we are testing, in this case 'name of test section'. The second parameter is a function that contains all of the code we want to associate with this feature.

Inside this function we’ll be adding the it call. Similar to describe, it is a function call that takes two parameters:

Code snippet

it('name of individual test', function () { });

The first parameter is the name we give our test, in this case 'name of individual test'. I like to begin my test case names with should, so that it reads it('should do whatever'...
The second parameter is a function which contains our browser commands and assertions.

To bring it all together, here’s an example test file with just the structure in place:

Code snippet

describe('Login Page', function () {
	it('should allow you to log in using valid credentials', function () {
		// valid login code here
	});
	it('should not allow you to log in using invalid credentials', function () {
		// invalid login code here
	});
});
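To see how nesting produces a test hierarchy, here is a minimal sketch that uses toy stand-ins for describe and it. These stand-ins are assumptions written purely for illustration. They are not Mocha's real implementation, and the group and test names are hypothetical. All they do is record the full name each test would be reported under.

```javascript
// Toy stand-ins for Mocha's describe/it (illustration only, not Mocha's code).
const groupStack = [];      // names of the describe blocks we are currently inside
const collectedTests = [];  // full names of every test that was registered

function describe(name, body) {
    groupStack.push(name);
    body();                 // run the body so nested describe/it calls register
    groupStack.pop();
}

function it(name, body) {
    // A real framework would also store `body` and run it later.
    collectedTests.push([...groupStack, name].join(' > '));
}

describe('Login Page', function () {
    describe('with valid credentials', function () {
        it('should allow you to log in', function () {});
    });
    describe('with invalid credentials', function () {
        it('should not allow you to log in', function () {});
        it('should show an error message', function () {});
    });
});

console.log(collectedTests.join('\n'));
```

Each test name ends up prefixed by every describe it sits inside, which is roughly how a reporter can show results grouped by feature.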

1.2.7 Command Line Options (and Logging)

Now that we’ve gotten comfortable with the Test Runner, let’s dig into some advanced usage of the tool.

From the start, there are several command line overrides you can use to customize your test runs on a case-by-case basis.

To get a full listing of all options, pass in the help option when running npx wdio:

Code snippet

npx wdio --help

It will output the various settings you can update from the command line.

Terminal Output from `npx wdio --help` command

There are plenty of settings to configure, but most you won’t use. There are a few common settings you’ll tinker with on a regular basis though. Let’s take a look at them.

Config File

If you look at the first example near the bottom, it shows:

Code snippet

wdio run wdio.conf.js --suite foobar    Run suite on testsuite "foobar"

There are three parts: First, the wdio run command (note npx wdio and npx wdio run do the same thing, as wdio calls run by default).

Then, a path to a configuration file (i.e., wdio.conf.js, we’ll talk about this in just a second).

Finally, an option to set the suite to foobar. We can provide many options here, and we’ll get to that soon.

Back to the configuration file. By default, WebdriverIO will use the wdio.conf.js file (assuming you have one). If you want to have a different configuration file with different settings, this is how you can get WebdriverIO to use it.

Say I create a second configuration file called wdio.alternative.conf.js. I can use that file by running:

Code snippet

npx wdio wdio.alternative.conf.js

Now WebdriverIO will use that instead of wdio.conf.js.

As you get into more advanced usage of the framework, you’ll probably want to have custom configuration files. We won’t be covering that in this book though, as it’s a bit too advanced for our needs. So, let’s move on and explore those options.

`Run` Options

In the usage examples, you see two main options being passed: suite and spec. We can get the full list of options however by running npx wdio run --help:

Code snippet

Options:
  --version             Show version number  [boolean]
  --watch               Run WebdriverIO in watch mode  [boolean]
  --hostname, -h        automation driver host address  [string]
  --port, -p            automation driver port  [number]
  --path                path to WebDriver endpoints (default "/")  [string]
  --user, -u            username if using a cloud service as automation backend  [string]
  --key, -k             corresponding access key to the user  [string]
  --logLevel, -l        level of logging verbosity  [choices: "trace", "debug", "info", "warn", "error", "silent"]
  --bail                stop test runner after specific amount of tests have failed  [number]
  --baseUrl             shorten url command calls by setting a base url  [string]
  --waitforTimeout, -w  timeout for all waitForXXX commands  [number]
  --framework, -f       defines the framework (Mocha, Jasmine or Cucumber) to run the specs  [string]
  --reporters, -r       reporters to print out the results on stdout  [array]
  --suite               overwrites the specs attribute and runs the defined suite  [array]
  --spec                run only a certain spec file - overrides specs piped from stdin  [array]
  --exclude             exclude certain spec file from the test run - overrides exclude piped from stdin  [array]
  --mochaOpts           Mocha options
  --jasmineNodeOpts     Jasmine options
  --cucumberOpts        Cucumber options
  --help                Show help  [boolean]

That’s a lot of choices! Let’s look at the most useful ones…

Spec

The first option you’ll likely be using often is passing in a specific spec path.

This is quite helpful when you need to test out a specific file. Instead of having to test all the files in your test folder, you can single one out from the rest.

How about a couple of examples? Say you have the following test files:

Code snippet

/test/homepage.js
/test/search.js
/test/auth/login.js
/test/auth/logout.js
/test/auth/register.js

Five files total, with three of them being inside the test/auth folder.

Assuming the specs setting in your configuration file is set to ./test/**/*.js, normally you’d run all five tests on each run.

If you want to run just the homepage test, you could do any one of the following:

Code snippet

npx wdio --spec=./test/homepage.js
npx wdio --spec=homepage.js
npx wdio --spec=home

The spec option takes either the exact path to the file you want to test, or a filter by filename. So all three variations above work because ./test/homepage.js matches against all three.

Here’s a different example: I want to run just the login/logout tests. I could do any of the following:

Code snippet

npx wdio --spec=./test/auth/login.js --spec=./test/auth/logout.js
npx wdio --spec=login --spec=logout
npx wdio --spec=./test/auth/log
npx wdio --spec=log

Notice in the first two options, we define the spec option twice, allowing us to choose two different files specifically.
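If it helps to picture the filtering, here is a simplified sketch of the idea. It assumes a plain "path contains the filter text" check, which is my own approximation. WebdriverIO's actual matching logic may differ in its details, and the filterSpecs function is a hypothetical helper, not part of the framework.

```javascript
// Simplified model of the --spec filter (an assumption, not WebdriverIO's code).
const specFiles = [
    './test/homepage.js',
    './test/search.js',
    './test/auth/login.js',
    './test/auth/logout.js',
    './test/auth/register.js',
];

// Keep every file whose path contains at least one of the given filters.
function filterSpecs(files, filters) {
    return files.filter((file) => filters.some((filter) => file.includes(filter)));
}

console.log(filterSpecs(specFiles, ['home']));
console.log(filterSpecs(specFiles, ['login', 'logout']));
console.log(filterSpecs(specFiles, ['./test/auth/log']));
```

Passing several filters mirrors repeating the spec option on the command line: a file is selected as soon as any one of them matches.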

As mentioned before, the spec option is very helpful when debugging individual tests. We’ll be using it a lot going forward.

Bail

One option we briefly talked about while going through the configuration file is bail. This will tell WebdriverIO to stop running tests after a certain number of test failures.

In our configuration, we have it set to 0, which means that it will run all the tests no matter the number of failures. This is useful when we want to see how all of our tests run.

But when debugging a set of tests, it can be useful to stop running them if there’s a failure. Maybe you’re uncertain if all your tests will pass, and you just want to see if there are any failures at all.

By passing in --bail=1 as a command-line option, we can achieve that.

Code snippet

npx wdio --bail=1

Of course, we could set this to any number, but 1 is what will be used most, as you’ll want to get back to fixing your broken tests right away.
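The behavior of bail can be pictured with a small simulation. This is a toy model under my own assumptions, not WebdriverIO's runner code: runWithBail and the list of outcomes are hypothetical, and a real run involves workers and spec files rather than a simple loop.

```javascript
// Toy model of the bail setting (illustration only, not WebdriverIO's runner).
// `results` is a hypothetical list of outcomes, in the order tests would run.
function runWithBail(results, bail) {
    let failures = 0;
    const executed = [];
    for (const result of results) {
        executed.push(result.name);
        if (!result.passed) {
            failures += 1;
        }
        // bail = 0 means "never stop early"
        if (bail > 0 && failures >= bail) {
            break;
        }
    }
    return { executed, failures };
}

const outcomes = [
    { name: 'homepage', passed: true },
    { name: 'search', passed: false },
    { name: 'login', passed: true },
    { name: 'logout', passed: false },
];

console.log(runWithBail(outcomes, 0)); // all four tests are executed
console.log(runWithBail(outcomes, 1)); // stops right after the first failure
```

With bail set to 1, the later tests never run, which is exactly why it gets you back to fixing the first broken test sooner.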

Base URL

Overriding the baseUrl can be helpful for times when you need to test the same site on a different server. Most often this occurs when you’re testing a server on your local computer versus the test server. It can also be useful for one-time tests of special servers spun up to run specific code.

In our settings, we defined our baseUrl as https://webdriver.io. And in our test, we used that URL by changing our browser.url call.

Let’s say we want to test the old version of the WebdriverIO website. That URL is http://v4.webdriver.io. So to pass it in via the command line, we’d run:

Code snippet

npx wdio --baseUrl=http://v4.webdriver.io

When we run that, we’re going to get an assertion error. That’s because it went to the old site, which had a different page title from the new one. At least we know our test correctly catches errors!

Terminal output showing error message if `baseUrl` is incorrect

I use the baseUrl override only every so often, but it’s certainly handy to have around.
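As a rough illustration of why the same test can hit two different servers, the sketch below resolves a path against each base URL using Node's built-in URL class. This is an approximation I'm assuming for illustration. WebdriverIO applies its own resolution rules, and resolveUrl is a hypothetical helper.

```javascript
// Approximation of base URL resolution using Node's built-in URL class.
// WebdriverIO applies its own rules; this only illustrates the general idea.
function resolveUrl(path, baseUrl) {
    return new URL(path, baseUrl).href;
}

const testPath = '/'; // the kind of path a test might pass to browser.url

console.log(resolveUrl(testPath, 'https://webdriver.io'));
console.log(resolveUrl(testPath, 'http://v4.webdriver.io'));
```

The test code stays the same in both cases. Only the base changes, which is what makes the command-line override convenient.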

Log Level

Let’s take a look at another option. We’ve run across the logLevel a couple times, which defines how much console output to show when running our tests.

When you’re debugging your tests, it can be helpful to see all the logs that WebdriverIO outputs. Other times you may not want any output at all. You can tweak this on a run-by-run basis by setting the logLevel option.

So if I want to get as much output as possible, I’d do:

Code snippet

npx wdio --logLevel=trace

If I didn’t want any logs, I’d do:

Code snippet

npx wdio --logLevel=silent

Let’s try out that first option. As the tests run, you should see logs appearing, which are describing what’s currently happening. For example, the start of our logs would show:

Code snippet

2020-07-24T22:50:20.066Z INFO @wdio/cli:launcher: Run onWorkerStart hook
2020-07-24T22:50:20.067Z INFO @wdio/local-runner: Start worker 0-0 with arg: --logLevel=trace
[0-0] 2020-07-24T22:50:20.425Z INFO @wdio/local-runner: Run worker command: run
[0-0] 2020-07-24T22:50:20.431Z DEBUG @wdio/local-runner:utils: init remote session

As the test continues, old messages will “scroll” off the top of the log, while new messages appear at the bottom.

Notice that each log message has a ‘type’. The first message in the example output above is INFO and the last is DEBUG. These types are defined inside the core WebdriverIO code, and generally denote different types of messages.

The log level is a hierarchical setup, having five levels:

Trace
Debug
Info
Warn
Error

When setting the log level to trace, you’ll get all the trace messages, plus all the types below it (so debug, info, warn, and error).

If you were to choose warn, you’d only get those messages, plus any error messages.
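The hierarchy can be expressed as a simple ordering check. The following is a minimal sketch of that idea, not the logger WebdriverIO actually uses. The shouldLog and visibleLevels helpers are hypothetical names, and silent is included as the "show nothing" setting from the command-line choices.

```javascript
// Log levels ordered from most verbose to least verbose.
const LEVELS = ['trace', 'debug', 'info', 'warn', 'error', 'silent'];

// A message is shown when its level is at least as severe as the configured level.
function shouldLog(configuredLevel, messageLevel) {
    if (configuredLevel === 'silent') {
        return false;
    }
    return LEVELS.indexOf(messageLevel) >= LEVELS.indexOf(configuredLevel);
}

// List the message types that would appear for a given setting.
function visibleLevels(configuredLevel) {
    return LEVELS.filter((level) => level !== 'silent' && shouldLog(configuredLevel, level));
}

console.log(visibleLevels('trace'));
console.log(visibleLevels('warn'));
console.log(visibleLevels('silent'));
```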

After our tests have completed, WebdriverIO prints out the entirety of messages in a single section. Here’s what that looks like:

Code snippet

2020-07-24T22:50:19.909Z DEBUG @wdio/utils:initialiseServices: initialise service "chromedriver" as NPM package
2020-07-24T22:50:19.933Z INFO @wdio/cli:launcher: Run onPrepare hook
Starting ChromeDriver 84.0.4147.30 (48b3e868b4cc0aa7e8149519690b6f6949e110a8-refs/branch-heads/4147@{#310}) on port 9515
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
2020-07-24T22:50:20.066Z INFO @wdio/cli:launcher: Run onWorkerStart hook
2020-07-24T22:50:20.067Z INFO @wdio/local-runner: Start worker 0-0 with arg: --logLevel=trace
[0-0] 2020-07-24T22:50:20.425Z INFO @wdio/local-runner: Run worker command: run
[0-0] 2020-07-24T22:50:20.431Z DEBUG @wdio/local-runner:utils: init remote session
2020-07-24T22:50:20.433Z INFO webdriverio: Initiate new session using the ./protocol-stub protocol
[0-0] RUNNING in chrome - /test/specs/example.js
[0-0] 2020-07-24T22:50:20.576Z DEBUG @wdio/utils:initialiseServices: initialise service "chromedriver" as NPM package
[0-0] 2020-07-24T22:50:20.592Z DEBUG @wdio/local-runner:utils: init remote session
[0-0] 2020-07-24T22:50:20.592Z INFO webdriverio: Initiate new session using the webdriver protocol
[0-0] 2020-07-24T22:50:20.599Z INFO webdriver: [POST] http://localhost:9515/session
[0-0] 2020-07-24T22:50:20.599Z INFO webdriver: DATA { capabilities: { alwaysMatch: { browserName: 'chrome', acceptInsecureCerts: true }, firstMatch: [ {} ] }, desiredCapabilities: { browserName: 'chrome', acceptInsecureCerts: true } }
[0-0] 2020-07-24T22:50:22.453Z INFO webdriver: COMMAND navigateTo("https://webdriver.io/")
[0-0] 2020-07-24T22:50:22.454Z INFO webdriver: [POST] http://localhost:9515/session/32eec5f507ed6f86473e3c4b685fd96b/url
[0-0] 2020-07-24T22:50:22.454Z INFO webdriver: DATA { url: 'https://webdriver.io/' }
[0-0] 2020-07-24T22:50:23.711Z INFO webdriver: COMMAND getTitle()
[0-0] 2020-07-24T22:50:23.711Z INFO webdriver: [GET] http://localhost:9515/session/32eec5f507ed6f86473e3c4b685fd96b/title
[0-0] 2020-07-24T22:50:24.119Z INFO webdriver: RESULT WebdriverIO · Next-gen browser and mobile automation test framework for Node.js
[0-0] 2020-07-24T22:50:24.124Z INFO webdriver: COMMAND deleteSession()
2020-07-24T22:50:24.124Z INFO webdriver: [DELETE] http://localhost:9515/session/32eec5f507ed6f86473e3c4b685fd96b
2020-07-24T22:50:24.292Z DEBUG @wdio/local-runner: Runner 0-0 finished with exit code 0
[0-0] PASSED in chrome - /test/specs/example.js
2020-07-24T22:50:24.297Z INFO @wdio/cli:launcher: Run onComplete hook

Let’s take a brief detour to walk through this activity. The first thing WebdriverIO does is initialize any services we requested. In this instance, we’re using the chromedriver service, so we see log output for that.

Next, it runs any onPrepare hooks we have defined. We don’t have any defined, but the ChromeDriver service does. That’s why you see output stating that ChromeDriver is starting up. This is how services “hook” into the WebdriverIO flow to add functionality.

Following that, WebdriverIO starts “workers” for our tests. These are sub-processes spun up that our test will run in. The point of doing this is to allow for multiple tests to run at the same time (they’d all be different “workers”).

It then sends the run command to the worker (i.e., INFO @wdio/local-runner: Run worker command: run), letting it know it should get to work. The worker itself then initializes the chromedriver service, in case it needed to run anything inside of a specific worker (which it doesn’t).

The next step is to get a browser running for use. Each running browser instance is called a “session”. To get one, WebdriverIO sends a POST request (INFO webdriver: [POST] http://localhost:9515/session) with the capability data. ChromeDriver receives this request and initializes the session with the provided data. Combined, the two logs are:

Code snippet

[0-0] 2020-07-24T22:50:20.599Z INFO webdriver: [POST] http://localhost:9515/session
[0-0] 2020-07-24T22:50:20.599Z INFO webdriver: DATA {
	capabilities: {
		alwaysMatch: { browserName: 'chrome', acceptInsecureCerts: true },
		firstMatch: [ {} ]
	},
	desiredCapabilities: { browserName: 'chrome', acceptInsecureCerts: true }
}

You’ll notice this matches the capabilities settings in our config file for the most part, but has a couple of different options specified. These are WebdriverIO defaults used to start a normal browser session. We could override them via the capabilities object in our config file if we so desired, but we don’t need to right now, so we’ll leave them out of our config.

Understanding the relationship between WebdriverIO and WebDriver/Selenium is helpful, so I want to take a little bit of extra time to review it. WebdriverIO doesn’t actually run the browser automation; WebDriver (or whatever WebDriver endpoint we’re using) takes care of all of that.

What WebdriverIO does provide is a JavaScript interface for sending commands to run. It does this through REST API calls, which means that it sends an HTTP request to specific endpoints on the WebDriver server (e.g., ChromeDriver). Basically, WebdriverIO and WebDriver have a common language they share (defined in the official WebDriver spec) to send data back and forth. WebdriverIO sends commands to the WebDriver server for it to run, then the server sends the results of those commands back.
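To make that shared language a little more concrete, here is a sketch that maps a few commands to the HTTP method and endpoint each one uses, following the endpoints that appear in the log output. Nothing is actually sent. The buildRequest helper and the session ID value are hypothetical, and this is not how WebdriverIO is implemented internally.

```javascript
// Maps a few commands to the HTTP requests used to talk to a WebDriver server.
// Nothing is sent here; the function only describes each request.
const DRIVER_URL = 'http://localhost:9515'; // ChromeDriver's address in the logs

function buildRequest(command, sessionId, payload) {
    switch (command) {
        case 'newSession':
            return { method: 'POST', url: `${DRIVER_URL}/session`, body: payload };
        case 'navigateTo':
            return { method: 'POST', url: `${DRIVER_URL}/session/${sessionId}/url`, body: payload };
        case 'getTitle':
            return { method: 'GET', url: `${DRIVER_URL}/session/${sessionId}/title` };
        case 'deleteSession':
            return { method: 'DELETE', url: `${DRIVER_URL}/session/${sessionId}` };
        default:
            throw new Error(`Unknown command: ${command}`);
    }
}

const sessionId = 'example-session-id'; // hypothetical; the driver assigns the real one

console.log(buildRequest('navigateTo', sessionId, { url: 'https://webdriver.io/' }));
console.log(buildRequest('getTitle', sessionId));
```

Commands that send information (like a URL) carry a request body, while commands that only read information (like the page title) do not.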

Take the url command (logged as navigateTo) in the next few lines. WebdriverIO sends a request to the ChromeDriver instance. In the request, it passes along information about the URL for the browser to go to. After ChromeDriver receives and processes this request, it returns the results of the command execution (which is empty). WebdriverIO doesn’t output this information, as there isn’t anything to show.

Overall, the logs for this look like:

Code snippet

[0-0] 2020-07-24T22:50:22.453Z INFO webdriver: COMMAND navigateTo("https://webdriver.io/")
[0-0] 2020-07-24T22:50:22.454Z INFO webdriver: [POST] http://localhost:9515/session/32eec5f507ed6f86473e3c4b685fd96b/url
[0-0] 2020-07-24T22:50:22.454Z INFO webdriver: DATA { url: 'https://webdriver.io/' }

In the next set of logs, WebdriverIO requests the page title from the browser. No data is sent to ChromeDriver, as there isn’t any information to send. Instead, ChromeDriver returns data back to us, namely, the title of the page. You see this in the “result” log output and, subsequently, in the console output we sent in our test:

Code snippet

[0-0] 2020-07-24T22:50:23.711Z INFO webdriver: COMMAND getTitle()
[0-0] 2020-07-24T22:50:23.711Z INFO webdriver: [GET] http://localhost:9515/session/32eec5f507ed6f86473e3c4b685fd96b/title
[0-0] 2020-07-24T22:50:24.119Z INFO webdriver: RESULT WebdriverIO · Next-gen browser and mobile automation test framework for Node.js

Finally, WebdriverIO closes our browser by sending a DELETE request for our session endpoint. Once it gets the successful close message, it closes down the worker and runs the onComplete hook. With that, the test is complete.

That was a lot of logs. What if we want to ignore all this output and only show the basics (plus any console.log messages we may have added)? Let’s see what the log output looks like when running it with logLevel set to silent:

Code snippet

Starting ChromeDriver 84.0.4147.30 (48b3e868b4cc0aa7e8149519690b6f6949e110a8-refs/branch-heads/4147@{#310}) on port 9515
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
[0-0] RUNNING in chrome - /test/specs/example.js
[0-0] PASSED in chrome - /test/specs/example.js

There’s still some output, but not much. This is helpful if you have console.log messages that you want to be able to see without extra noise.

Other Options

That sums it up for the most important options. There are other settings you can configure, but covering them would be too in-depth and situation-specific for this book. Feel free to experiment, and remember, you can get all the options by running npx wdio run --help.
