Watir-Webdriver: Control the Browser
Watir-WebDriver (Watir is short for Web Application Testing in Ruby) is a Ruby gem which allows you to automate your browser (make it click a button, submit a form, wait for some text to appear before continuing, and so on). With its help, a real user can be simulated, allowing you to automate the full stack testing of your web application.
Be aware that Watir (classic) and Watir-WebDriver (both separate Ruby gems) are not the same thing! Watir only supports Internet Explorer, while Watir-WebDriver supports Chrome, Firefox and Safari as well. Think of Watir-WebDriver as Watir 2.0, or as Watir (classic) + WebDriver + some additional features. WebDriver was started by Google to allow browser automation tools to get closer to simulating real user behavior. Even better, all major browser automation frameworks have implemented it.
Selenium 2.0, which is a major part of Watir-Webdriver, also describes itself with this simple formula: Selenium 1.0 + WebDriver = Selenium 2.0.
Let’s (Automatically) Fill Out This Ruby Survey!
Your job is to go to this web page and fill it out automatically with Watir-WebDriver. (You can also download it from here, unzip it, and replace the ‘browser.goto’ argument with the local path to the form.html file.)
To get started, first install Watir-WebDriver with gem install watir-webdriver. Also, make sure you have Firefox installed (we’ll be using it for this example).
Save the following into a new watir_script.rb file and run it:
require 'watir-webdriver'
browser = Watir::Browser.new :firefox # should open a new Firefox window
browser.goto 'http://nitrowriters.com/form/form.html' # or type the local path to your downloaded copy

browser.text_field(:id => 'my_text_field').set 'Yes!'
browser.textarea(:class => 'element textarea medium').set 'It was a long time ago, I do not remember'
browser.radio(:name => 'familiar_rails', :value => '1').click # yes, I'm very familiar
sleep 2 # puts the entire program to sleep for 2 seconds, so you can see the change
browser.radio(:name => 'familiar_rails', :value => '3').click # actually, just a bit...
browser.text_field(:name => 'favorite_1').set 'Yukihiro' # the creator of Ruby
browser.text_field(:id => 'favorite_2').set 'Matsumoto' # is my favorite Ruby person!
browser.checkbox(:index => 1).click # I like the TDD culture
browser.checkbox(:index => 2).click # And Matz!
sleep 2 # puts the entire program to sleep for 2 seconds, so you can see the change
browser.checkbox(:index => 1).click # Oh well, I like only Matz..
browser.select_list(:id => 'usage').select 'Less than a year'
browser.select_list(:id => 'usage').select_value '2' # Changed my mind
# Here I entered C:/watir.txt because I had such a file inside my C: directory. Please be sure
# to enter a valid path to a file, or your script will report 'No such file or directory' error
browser.file_field.set 'C:/watir.txt' # Change this path to any path to a local file on your computer
puts browser.p(:id => 'my_description').text
I highly recommend you don’t continue with this article until you’ve run this script successfully. There’s nothing more motivating than imagining the possibilities of how you can automate your browser after seeing your first Watir-WebDriver script in action. Read on after you’re done.
How a Real User Interacts With a Web Page
Go and open that web page again. Fill it in manually. Put any values you want in it. Finished? Good. Now, ask yourself: “How did I interact with this page? What did I do?” If you observe yourself carefully, you do 2 things when trying to interact with this or any other web page on the Internet:
a) Visually identify the part of the page you want to interact with (a text box, a link, the title of a web page and so on).
b) Perform some action on that part of the page. Most of the time, you click on that part, but sometimes you also enter text. These 2 things are what you do 99% of the time: click (with your mouse) and type things (with your keyboard).
You cannot do b) without a). If you don’t visually identify what element you want to click on or type something in, then it’s pointless.
Previously, we’ve mentioned that Watir-WebDriver’s primary job is to simulate a real user. We know how a real web user interacts with a real web page (by observing ourselves filling out the form manually), so let’s take a look at how the above Watir-WebDriver script works.
Clean & Predictable Syntax
Do a quick analysis of the first 2 lines of the code:
require 'watir-webdriver'
browser = Watir::Browser.new :firefox
By running just the first 2 lines, a new browser window is launched. The second line is the equivalent of a real user opening a Firefox window.
Line 3 starts giving orders to that newly launched browser window. The script tells it to go to a specific web page, which is the equivalent of a real web user typing something into the browser address bar and hitting ‘Enter’:
browser.goto 'http://nitrowriters.com/form/form.html'
Things should start becoming clear starting on line 5:
browser.text_field(:id => 'my_text_field').set 'Yes!'
Remember a) and b) on how a real user interacts with a web page? Here’s where the Watir-WebDriver syntax shines. A real surfer would first identify the ‘part’ of a specific page to interact with, followed by performing some action on that ‘part’. How do we tell Watir-WebDriver, in computer terms, where that part is located on the web page? Some browser automation tools use screen coordinates (we’ll later see why this is a very unreliable approach). Watir-WebDriver, however, uses HTML elements, relying on a basic truism:
An HTML web page, whether it’s the SitePoint homepage or the form page we’re working with, is essentially just a collection of HTML elements.
What any user, including Watir-WebDriver, does is interact with those elements. Click on buttons (which are HTML elements), enter text in a text field (which is also an HTML element), click on links (links are HTML elements too), and so on. Whatever action you perform on a page, it’s done on some HTML element which is part of a collection we call a web page. The creators of Watir-WebDriver, knowing this, decided to create a fairly intuitive syntax to allow you to simulate almost anything a real user would do on a particular webpage. The syntax structure is:
[browser-instance].[html-element-tag-name](with specific attributes).[action]
You can see this starting at line 5:
browser.text_field(:id => 'my_text_field').set 'Yes!' # Dear Watir, please find the text field HTML element with the ID attribute of 'my_text_field' and type 'Yes!' into it
browser is the name of the browser instance variable created on line 2. This variable can be named anything:
happy_browser = Watir::Browser.new :firefox
as long as you propagate the rename to each subsequent line:
happy_browser.goto 'http://nitrowriters.com/form/form.html'
text_field(:id => 'my_text_field') is the syntax Watir-WebDriver uses to identify text fields (you can see the syntax Watir-WebDriver uses for various HTML elements here). Inside the parentheses, you have key-value pairs where the key is a particular attribute and the value is the value of that attribute.
You can specify more than 1 key-value pair:
browser.radio(:name => 'familiar_rails', :value => '1').click
set is the action to perform on that HTML element. In the text_field case, set means to set the text to ‘Yes!’.
Your New Best Friend: Right-Click Inspect
In order to fundamentally understand how the script works, it’s a good exercise to go through the thought process of how it was made in the first place.
By now, you should understand what the first 4 lines do: open a new browser window, ready to take our commands!
The next line is:
browser.text_field(:id => 'my_text_field').set 'Yes!'
browser is just the name of the variable representing the browser. What about the second part, text_field(:id => 'my_text_field')?
I right-clicked on the element I wanted to interact with (the text box under “Are you a big Ruby fan?”) and then clicked on Inspect (depending on the browser, this could be called ‘Inspect element’ or just ‘Inspect’). The resulting inspection window shows:
<input id="my_text_field" name="my_text_field" class="element text medium" maxlength="255" value="" type="text">
An input tag of type “text” can be found in this table mapped to the text_field method in Watir-WebDriver. Cool, but what if there were more than 1 text_field element on the page? If you just wrote browser.text_field.set 'Yes!' and there were 3 text elements, then Watir-WebDriver would automatically select the first such element. Often, this is not what we want. We need a way to distinguish HTML tags so Watir-WebDriver can always select the right one. This is where HTML attributes come in.
If the element you want to work with has an ID attribute, you’re lucky! According to the W3C, the value of an element ID must be unique within the HTML document. In our case, the input tag has an ID, so I’ve used that. Finally, I call the set method on that element, which performs some action on it, in this case typing text into it (we’ll talk more about actions later).
Your final job is to find a unique way to identify a particular HTML element on a particular page. You don’t want Watir-WebDriver to select the wrong element or not find the element at all (in which case, your entire program would crash).
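One way to avoid the crash mentioned above is to ask whether the element is on the page before acting on it. Watir elements respond to exists?, so a small helper can check first. The following is only a sketch: set_if_present and FakeField are hypothetical names, and FakeField is a stand-in object so the snippet runs without a browser.

```ruby
# Hypothetical helper: types a value into an element only if it is on the page.
# Works with any object that responds to exists? and set (Watir elements do).
def set_if_present(element, value)
  return false unless element.exists?

  element.set(value)
  true
end

# Stand-in for a Watir text field (an assumption for this sketch),
# so the example runs without launching a browser.
FakeField = Struct.new(:present, :value) do
  def exists?
    present
  end

  def set(text)
    self.value = text
  end
end

found   = FakeField.new(true, nil)
missing = FakeField.new(false, nil)

puts set_if_present(found, 'Yes!')   # true, and found.value is now 'Yes!'
puts set_if_present(missing, 'Yes!') # false, nothing was typed
```

With a real browser, the call would look something like set_if_present(browser.text_field(:id => 'my_text_field'), 'Yes!').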
Let’s go to the next line:
browser.textarea(:class => 'element textarea medium').set 'It was a long time ago, I do not remember'
Same concept. The difference between text_field and textarea is that the latter supports multiple lines. You could replace textarea with text_field in this line and the code would work, but Watir-WebDriver would issue this warning:
"Locating textareas with '#text_field' is deprecated. Please, use '#textarea' method instead.".
browser.radio(:name => 'familiar_rails', :value => '1').click
sleep 2
browser.radio(:name => 'familiar_rails', :value => '3').click
Here, I’ve used 2 attributes to distinguish the radio buttons. Instead of using click as the action method, you can use set, which sounds more user-friendly but does the same thing. You could also use the set? method, which returns true or false depending on whether that radio button has been selected:
browser.radio(:name => 'familiar_rails', :value => '1').click
browser.radio(:name => 'familiar_rails', :value => '1').set? #=> true
browser.radio(:name => 'familiar_rails', :value => '3').set # same as click
browser.radio(:name => 'familiar_rails', :value => '1').set? #=> false
Here, the script interacts with some checkboxes:
browser.checkbox(:index => 1).click
browser.checkbox(:index => 2).click
sleep 2 # puts the entire program to sleep for 2 seconds
browser.checkbox(:index => 1).click
If, say, there are 10 checkbox elements on the page with the same class attribute of my_checkbox, doing something like browser.checkbox(:class => 'my_checkbox').click would select the first one it encounters.
What if we want to select the second one? Use the :index attribute. The first line in the above snippet, browser.checkbox(:index => 1).click, will click on the second checkbox element on the page. The :index attribute accepts integers, where 0 is the first element. The second line in the above snippet, for example, will click on the third checkbox element it finds.
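To make the “first match wins” and zero-based :index rules concrete, here is a toy model in plain Ruby. It is not how Watir-WebDriver is implemented; the locate method, the CHECKBOXES list and the cb_0 style ids are all made up for illustration. Each element is represented as a hash of its attributes.

```ruby
# Toy model only: each "element" is a hash of its HTML attributes.
CHECKBOXES = [
  { id: 'cb_0', class: 'my_checkbox' },
  { id: 'cb_1', class: 'my_checkbox' },
  { id: 'cb_2', class: 'my_checkbox' }
].freeze

# Returns the element matching every key-value pair in the selector.
# An optional :index key picks the nth match, counting from 0.
def locate(elements, selector)
  attributes = selector.reject { |key, _| key == :index }
  matches = elements.select do |element|
    attributes.all? { |key, value| element[key] == value }
  end
  matches[selector.fetch(:index, 0)]
end

puts locate(CHECKBOXES, class: 'my_checkbox')[:id]           # cb_0 (first match)
puts locate(CHECKBOXES, class: 'my_checkbox', index: 1)[:id] # cb_1 (second match)
puts locate(CHECKBOXES, index: 2)[:id]                       # cb_2 (third element)
p locate(CHECKBOXES, id: 'nope')                             # nil (nothing matched)
```

The last call shows the other failure mode: when no element matches, there is nothing to act on.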
What does the fourth line do? It unchecks the second element (remember, in the first line we clicked on the same element, meaning we “checked” it). There’s a clearer way to do this in Watir-WebDriver:
browser.checkbox(:index => 1).set
browser.checkbox(:index => 2).set
sleep 2 # puts the entire program to sleep for 2 seconds
browser.checkbox(:index => 1).clear
Like with radio buttons, set does the same thing as click here, since the boxes start out unchecked. clear, however, checks if the checkbox is “checked”, unchecking it as needed. Also, you can use set? to see if the checkbox is already checked.
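The difference between click and set/clear is easiest to see when an action is repeated. The toy class below is not Watir’s checkbox; ToyCheckbox is a hypothetical name, and the sketch assumes that set only clicks when the box is unchecked and clear only clicks when it is checked, which is my understanding of how Watir-WebDriver behaves.

```ruby
# Toy checkbox, not Watir's class: models only the checked/unchecked state.
class ToyCheckbox
  def initialize
    @checked = false
  end

  # click flips the state, whatever it currently is
  def click
    @checked = !@checked
  end

  # set always leaves the box checked
  def set
    click unless set?
  end

  # clear always leaves the box unchecked
  def clear
    click if set?
  end

  def set?
    @checked
  end
end

box = ToyCheckbox.new
box.click
box.click
puts box.set? # false: two clicks cancel each other out

box.set
box.set
puts box.set? # true: set is safe to repeat

box.clear
box.clear
puts box.set? # false: so is clear
```

This is why set and clear tend to make a script’s intent easier to read than a series of clicks.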
Select Lists & Files
Let’s take a look at the following part of our code:
browser.select_list(:id => 'usage').select 'Less than a year'
browser.select_list(:id => 'usage').select_value '2'
select_list corresponds to a select tag and its “actions” are a bit different than radio buttons or checkboxes. Each select tag has option tags inside which are the options in the list. Each option tag usually has a “value” attribute and text containing the name of the option. The above code shows 2 ways you can select a specific option: either by the actual option name or by the “value” attribute.
There is also a file upload box in our form. The HTML for it is:
<input id="give_me_a_file" name="give_me_a_file" class="element file" type="file"/>
Working with upload boxes is dead simple with Watir-WebDriver. All you do is this:
browser.file_field.set 'C:/watir.txt' # change this to the path of a local file on your computer
Warning: Enter a valid path in the set argument or your program will crash.
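If you would rather fail with a clear message than with a crash in the middle of a run, you can check the path yourself before handing it to file_field.set. This is only a sketch: upload_path! is a hypothetical helper, and the temporary file is there just so the example runs on any machine.

```ruby
require 'tempfile'

# Hypothetical helper: returns the absolute path if the file exists,
# otherwise raises with a readable message before the browser is involved.
def upload_path!(path)
  full_path = File.expand_path(path)
  raise ArgumentError, "No such file: #{full_path}" unless File.file?(full_path)

  full_path
end

# Demonstration with a temporary file, so the sketch runs anywhere.
Tempfile.create('watir_upload') do |file|
  puts upload_path!(file.path)
end

begin
  upload_path!('definitely_missing_file.txt')
rescue ArgumentError => e
  puts e.message
end
```

In the form script, the upload line would then become something like browser.file_field.set upload_path!('C:/watir.txt').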
As previously mentioned, real users mainly perform 2 actions on a web page: clicking and entering text.
Common sense helps a lot here. You can’t type text on a button element, for example, but you can click it. We’ve seen how you can use set and clear with checkboxes and radio buttons. If you want to simply click on a textarea you could do:
browser.textarea(:class => 'element textarea medium').click
But a more user-friendly way is:
browser.textarea(:class => 'element textarea medium').focus
Quick tip: When using only 1 attribute to select an element, you’ll often see this when analyzing other people’s code:
browser.textarea(:class, 'element textarea medium').focus
Don’t use this, however, as one of the Watir-WebDriver maintainers announced they could remove it soon.
Types of Actions
Performing actions on HTML elements isn’t all about interacting with the element. In object oriented programming, you have setters and getters. Getters get something from an object while setters change the object.
Imagine every HTML element as an object and every action you can do on that HTML element as either a getter or a setter method. You can set the text for a text field element, for example. You can also get the class attribute value for that object.
The last line in our form code uses a “getter” method:
puts browser.p(:id => 'my_description').text
Unlike the previous examples, here we tell the browser to get us the text inside the p tag with the id of “my_description”. “For Watir demonstration purposes only.” is the output in this case.
You can get any attribute for a particular element. The syntax is:
[browser-instance].[HTML-element](:with-some => attributes).[the attribute name of the element] #=> the attribute value
Take the same p element from above and, this time, get the value of the id attribute:
puts browser.p(:text => 'For Watir demonstration purposes only.').id #=> output: 'my_description'
So far, we’ve explored “getter” action methods. As for “setter” actions, let’s use a bit of common sense: What is a universal “action” you can do on every single HTML element?
It’s click. You can do browser.[every-HTML-element](:with => every attribute).click.
Each HTML element also has its own specific actions. Go here and enter any HTML element name, “radio” for example. Click on that, and under “Instance Method Summary” you should see the particular methods or actions it can perform. In this example, they include the set and set? methods we used earlier. click is also found lower in the list, inherited from the Element class, confirming the ability to use click on any HTML element.
Unlike Firefox, if you want to use Watir-WebDriver with Chrome, Internet Explorer, or Safari, you’ll have to download the “WebDriver” editions of these browsers (as well as having the actual browsers installed on your machine). The “WebDriver edition” of a browser is a single file you need to place in your operating system load path. I put the file in my [ruby-installation-folder]/bin folder, which is already in the operating system path. I encourage you to run our form example at the beginning of this article with Chrome and IE/Safari if you’re on Windows/Mac.
In case you’re confused how this “web-driver” file fits in the big picture, think of it this way: You have all kinds of hardware on your computer. Take your graphics card. Your graphics card will be useless unless you have a way to “connect” to it by installing a driver. Think of the browser as the graphics card and the web driver file as the driver file allowing Watir-WebDriver to connect to the actual browser and “drive” it.
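Before launching a non-Firefox browser, it can help to confirm that the driver file really is somewhere on your operating system path. The sketch below does that in plain Ruby. find_on_path is a hypothetical helper, and the file name chromedriver is an assumption about what the Chrome driver executable is called on your machine.

```ruby
# Returns the full path of the first executable called `name` found in the
# given PATH-style string, or nil if there is none.
def find_on_path(name, path_string = ENV.fetch('PATH', ''))
  # On Windows, executables carry extensions listed in PATHEXT (.EXE, .BAT, ...).
  extensions = ENV['PATHEXT'] ? ENV['PATHEXT'].split(';') : ['']
  path_string.split(File::PATH_SEPARATOR).each do |dir|
    extensions.each do |ext|
      candidate = File.join(dir, "#{name}#{ext}")
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end
  nil
end

driver = find_on_path('chromedriver') # assumed file name for the Chrome driver
puts driver ? "Found driver at #{driver}" : 'chromedriver is not on the PATH yet'
```

If the driver is not found, move the file into a folder that is already on the path (such as the Ruby bin folder mentioned above) and run the check again.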
Vs Record and Playback Tools
There are many “Record and Playback” testing tools where you basically “record” what you’re doing on a webpage and save your actions. Later on, those tools will “play back” what you’ve done. The problem with this type of software is that it often relies on screen coordinates and proprietary, limited scripting languages. If there’s even a slight change in the design of the page, the whole playback will (likely) cease to work and you’re forced to re-record everything.
Watir-WebDriver doesn’t work this way. It uses Selenium-WebDriver on its back-end and, just like Selenium, it relies not on screen coordinates but on HTML elements. Before doing anything, you must tell Watir-WebDriver the exact HTML element you want to work on.
As for Watir-WebDriver vs. Selenium WebDriver, you’ll notice that Watir-WebDriver has a more object-oriented and consistent syntax than Selenium. You need to know Selenium, though, in order to fully master Watir-WebDriver. After all, Selenium-WebDriver is what is powering Watir-WebDriver in the background. If you know Selenium, you can also perform some advanced configurations.
Only the Beginning! Isn’t it Exciting?
Once you understand the foundation of Watir-WebDriver, it’s easy to continue learning it. Things like interacting with AJAX elements shouldn’t be hard to understand once you grasp the basic syntax and the philosophy behind Watir-WebDriver.
If you’re looking for more material, 2 great books that helped me are
There’s also Web Application Testing in Ruby (free) by one of the Watir-WebDriver main contributors, containing great information like setting up the webdriver editions of each browser, etc. The more you learn about Watir-WebDriver, the more tempted you’ll be to explore its beautiful syntax and elegance.