Model Web Pages with the Page Object Pattern

We interact with web pages every day. At a low level, that means clicking on HTML elements or entering text into form fields with input devices. What’s the higher-level purpose of doing those things? It’s getting something done: completing a Google search, submitting a contact form, or rating something. The Page Object pattern is a great, object-oriented way to keep your code clean while accomplishing those higher-level things as your application grows.

Key Takeaways

The Page Object Pattern enhances code reusability and maintainability by modeling web pages as classes, thus encapsulating the interaction logic away from the main test scripts.
Utilizing the `page-object` Ruby gem simplifies the implementation of the Page Object Pattern, allowing for straightforward installation and integration with Watir-Webdriver for browser automation.
Defining web elements in the Page Object class using methods like `text_field` and `button` automatically generates corresponding methods for interacting with these elements, reducing the need for repetitive code.
The Page Object Pattern significantly eases updates and maintenance by centralizing changes to web element definitions, which is especially beneficial during UI redesigns or updates.
Real-world application of the Page Object Pattern can lead to cleaner, more organized code that is easier to read and manage, especially when dealing with complex or frequently changing web interfaces.

Fundamentals of the Page Object Pattern

Why use the Page Object pattern? Two words: code reuse. Let me explain.

Suppose you’re making an application that will go to a search engine and perform a search using a particular phrase. To do this, you will need to:

Go to the web page
Locate the HTML element where you can input text and enter your keyword (the text box)
Find the Search button and click it

If you’re familiar with Watir-Webdriver/Selenium, you know you can accomplish this easily with three lines. If you are not familiar with Watir/Selenium, please read my Watir article and then come back to this article. I’ll wait.

Suppose you needed to do each of these three actions several times in your code. What can you do? Sure, you can wrap your code in a method, and you’re done, right? But what if your application is getting more complex? Suppose that, instead of clicking the Search button, you want to click “I’m feeling Lucky” after entering your text. Or maybe, for some reason, first click the Search button twice and then enter the text and click Search again, doing that several times in your code.

You can end up with several methods containing several references to some HTML element. When the HTML element changes (due to a page redesign, it is no longer in class X but Y), you’ll need to make changes in several places in the code. Now, imagine something ten times more complicated than a search page, containing 20 different elements, which you can combine in 7 different ways. Things get really messy, really fast.

The Page Object pattern solves this problem by helping you model a single web page as a class, exposing its “services” (search for a keyword, search using “I’m Feeling Lucky”) as public methods. A web-based email page might have methods to compose an email, read the inbox, or save the message as a draft. As a caller, you don’t want to bother with how it’s doing that under the hood.
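To make the idea concrete without a real browser, here is a minimal sketch in plain Ruby. FakeBrowser and its type/click methods are hypothetical stand-ins that only record what they are asked to do; they are not part of Watir or page-object. The point is that the locators live in one place and callers only see the page’s services.

```ruby
# Hypothetical stand-in for a browser driver: it only records actions.
class FakeBrowser
  attr_reader :log

  def initialize
    @log = []
  end

  def type(locator, text)
    @log << [:type, locator, text]
  end

  def click(locator)
    @log << [:click, locator]
  end
end

# A hand-rolled page object: locators are defined once, services are public methods.
class HandRolledSearchPage
  SEARCH_BOX    = { id: 'yschsp' }.freeze
  SEARCH_BUTTON = { class: 'ygbt' }.freeze

  def initialize(browser)
    @browser = browser
  end

  # Service: search for a keyword.
  def search_for(keyword)
    @browser.type(SEARCH_BOX, keyword)
    @browser.click(SEARCH_BUTTON)
  end

  # Service: the odd "click twice, then search" flow, reusing the same locators.
  def impatient_search(keyword)
    2.times { @browser.click(SEARCH_BUTTON) }
    search_for(keyword)
  end
end

fake_browser = FakeBrowser.new
page = HandRolledSearchPage.new(fake_browser)
page.impatient_search('ruby')
puts fake_browser.log.length
```

If the button’s class changes after a redesign, only the SEARCH_BUTTON constant needs to be edited; both services keep working unchanged.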

If you’re dealing with a simple application, you can get away with plain Watir/Selenium, and using the Page Object pattern may be overkill. (I’m using a simple case throughout this article to illustrate how the whole gem works.) Patterns like Page Object weren’t invented for simple cases. They were created to keep your code manageable as it gets larger.

page-object: A Ruby Gem

The page-object gem is a very straightforward implementation of this pattern. To install it, just type gem install page-object and you’re done. Make sure you also have Watir-Webdriver installed, because we’ll be using it for this article.

Previously, we mentioned that the Page Object pattern allows you to model a web page, exposing its “services” as public methods. When you want to model a specific web page using the Page Object gem, you’ll go through these 3 steps:

Create the page by defining a class and including PageObject as a module.
Describe the page by specifying the HTML elements you’ll use to interact with it.
Use the page by initializing the class as an object and calling its methods.

Create a Search Page

Let’s implement these 3 steps using a simple Search Page. To get started, we’ll create a class and include the Page Object module in it. Try running this code to see if everything is working.


require 'page-object'
class SearchPage
  include PageObject
  # some code to declare the HTML elements
end

What’s the whole point of including the PageObject module? Every module provides methods, and this is no exception. It will also include some metaprogramming magic that we’ll explore later.
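As a rough idea of what that “magic” can look like, the sketch below shows a common Ruby idiom: a module that, when included, also adds class-level methods to the including class. TinyPageObject and describe_as are made-up names for illustration, and this is not the gem’s actual source.

```ruby
# Made-up module illustrating the include-and-extend idiom.
module TinyPageObject
  # Ruby calls this hook whenever a class includes the module.
  def self.included(base)
    base.extend(ClassMethods)
  end

  # These become class-level methods, usable directly in the class body.
  module ClassMethods
    def describe_as(text)
      define_method(:description) { text }
    end
  end

  # Regular instance method provided by the module.
  def greet
    "This is the #{description}"
  end
end

class DemoPage
  include TinyPageObject
  describe_as 'search page' # class-level call, in the style of text_field or button
end

puts DemoPage.new.greet
```

A single include therefore gives the class both instance methods (greet) and declarations it can call in its own body (describe_as).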

Describe the Single Search Page

The web was simple back in 2006. This Yahoo Search page was no exception. Let’s try and describe it by giving it a name and pointing the location to it.

Before getting started with using the Page Object on a web page, ask yourself: “What am I trying to do on this page?”, and then: “What HTML elements will I need to get those things done?”

We’re starting simple with our search page, just using it to search for a specific keyword. To accomplish that, we need to define a text box and the search button.

This is a useful reference where you can see helpful instructions for defining most elements. Try doing it yourself first. My result looks like this:


class SearchPage
  include PageObject
  text_field(:search_box, id: 'yschsp')
  button(:search_button, class: 'ygbt')
end

text_field and button are simply method calls. When defining an HTML element in a Page Object (via methods like those two), there are two main things you need to do:

Name the element (we named our two elements search_box and search_button, respectively.) You do that by passing the name as a symbol for the first argument.
Provide instructions on how to locate it. What is it that uniquely identifies that element on the page? Often, it’s just one thing like the id (yschsp) or the CSS class (ygbt). But what if this wasn’t the case? Suppose there was another element on the page that had the same class as the button and no identifier. In that case, specify another attribute to make it clear what to locate.

Here’s the code I would use:


 button(:search_button, class: 'ygbt', value: 'Search the Web')

Yes, you can specify more than one attribute to help locate an HTML element. Our button doesn’t need this modification though because the ygbt class isn’t being used to identify anything else on the page.
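The effect of adding a second attribute can be sketched in plain Ruby. The elements array below is invented sample data standing in for a page’s inputs, and locate is a hypothetical helper, not how page-object or Watir actually searches the DOM.

```ruby
# Invented sample data: each hash stands for one element's attributes.
elements = [
  { class: 'ygbt', value: 'Search Images' },
  { class: 'ygbt', value: 'Search the Web' },
  { class: 'other', value: 'Search the Web' }
]

# Hypothetical helper: keep only elements matching every attribute in the locator.
def locate(elements, locator)
  elements.select do |element|
    locator.all? { |attribute, expected| element[attribute] == expected }
  end
end

# With one attribute the match is ambiguous; with two it is unique.
puts locate(elements, class: 'ygbt').length
puts locate(elements, class: 'ygbt', value: 'Search the Web').length
```

Each extra attribute narrows the set of candidates, which is why a second one helps when the first is shared.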

Elements Defined, Now What?

This is where the magic happens! By defining our text field and button, we got seven generated methods for free: four for the text field and three for the button. For example, after typing this:


text_field(:search_box, id: 'yschsp')

Page Object will create four separate methods:


search_box              # returns the value in the text field
search_box=             # sets the value of the text field
search_box_element      # returns a reference to the actual HTML element
search_box?             # returns true if the text field exists on the page

The whole point of defining and modeling HTML elements with calls to text_field and button is to create the above methods. Those newly created methods allow us to work with and manipulate the actual element.
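To see how a single declaration can produce several methods, here is a simplified imitation of the getter, setter and element methods using define_method. MiniAccessors and FakeTextField are hypothetical names, the locator is ignored, and the real gem delegates to a browser driver instead.

```ruby
# Hypothetical stand-in for a text field on a page.
class FakeTextField
  attr_accessor :value

  def initialize
    @value = ''
  end
end

# Made-up macro module: one declaration produces three instance methods.
module MiniAccessors
  def text_field(name, _locator)
    # name_element: returns the element object (created once per page instance).
    define_method("#{name}_element") do
      @elements ||= {}
      @elements[name] ||= FakeTextField.new
    end

    # name: reads the current value of the element.
    define_method(name) { send("#{name}_element").value }

    # name=: writes a new value to the element.
    define_method("#{name}=") { |text| send("#{name}_element").value = text }
  end
end

class MiniSearchPage
  extend MiniAccessors
  text_field(:search_box, id: 'yschsp') # locator is unused in this sketch
end

mini_page = MiniSearchPage.new
mini_page.search_box = 'I love SitePoint!'
puts mini_page.search_box
puts MiniSearchPage.instance_methods(false).sort.inspect
```

The last line lists the methods that the one text_field call defined on the class.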

For more information on what methods get created when you define an element, consult this Wiki page. In the next section, we’ll play around with them a bit.

Use Your Page Object

Page Object needs a Watir/Selenium browser instance to function. The page-object gem doesn’t give instructions to the browser directly (for more information, see this article). I’ll be using Watir throughout this article. Let’s define a new browser instance first:


require 'watir-webdriver'
browser = Watir::Browser.new :chrome

We forgot to add one essential thing to the definition of our SearchPage. Can you guess which one? Sure, we have all the elements defined to manipulate the page. But what page exactly? Where is that page located? To tell Page Object the URL where the page can be found, call page_url directly inside the SearchPage class with an argument specifying the actual URL:


page_url('https://web.archive.org/web/20060204194204/http://search.yahoo.com/')

If you’re not sure how all of this comes together, the complete code is below.

We now have all of the necessary elements to get started! To start using the Page Object class, just create an instance of it and specify the browser instance you want to use as a parameter:


search_page = SearchPage.new(browser)

You cannot use your Page Object class without connecting it to a browser. It must have something to act upon.

Here’s the complete code. I added a few sleep methods so you can clearly see what’s happening. See if you understand it fully and try running it on your computer before continuing with this article:


require 'page-object'
require 'watir-webdriver'
class SearchPage
  include PageObject
  page_url('https://web.archive.org/web/20060204194204/http://search.yahoo.com/')
  text_field(:search_box, id: 'yschsp')
  button(:search_button, class: 'ygbt')
end
browser = Watir::Browser.new :chrome
search_page = SearchPage.new(browser) ; sleep 5
search_page.goto; sleep 5
search_page.search_box = 'I love SitePoint!'; sleep 5
search_page.search_button

Since we’re dealing with an Archive.org page here, you won’t get the actual search results once your script presses the “Search the web” button. I encourage you to modify it for the original [https://search.yahoo.com/] and get it working after finishing this article.

You already know where the search_box= and search_button methods came from, but what about goto? goto is also a generated method. It was created when we called page_url to specify the URL.

text_field, page_url and button are called accessors. Once you invoke them directly inside your class, page-object automatically creates relevant methods you can use when you create an object, as we saw above.

Using page_url creates the goto method that takes you to the page. Using text_field creates four different methods. A button definition creates three.

Most of the time, you’ll want to goto the page immediately when you create the Page Object instance (search_page), though. This is easily accomplished by adding another argument to the constructor:


search_page = SearchPage.new(browser, true)

If the first argument was the browser instance, the second is whether to visit the page specified in page_url immediately (by calling goto for you in the background).

Grouping Towards a Common Purpose

So far, I have been demonstrating how Page Object works. The code above could, in fact, be written easily in plain Watir, and it would be approximately the same length. The magic of this pattern comes to life when we group everything together in a method towards a common purpose (looking up a keyword). Let’s add this method to the SearchPage class definition:


def look_for(keyword)
  self.search_box = keyword; sleep 5
  self.search_button
end

And then do:


search_page = SearchPage.new(browser, true)
search_page.look_for('I love SitePoint')

This is much cleaner than the raw Watir version. A note on why it’s a good idea to include self inside look_for: many of the methods generated by page-object are assignment-type methods, like search_box=. When you call these methods inside an instance method like look_for without self in front, Ruby treats the line as an assignment to a new local variable, not as a method call. For example, try predicting what this code will print to the console:


def hello=(arg) # assignment method
  puts arg
end
def hi(arg) # ordinary method
  puts arg
end
def integrate
  hello = 'article' # it's supposed to call the assignment method
  hi('blog')
  self.hello = 'ruby'
end
integrate

The output will be ‘blog’ and ‘ruby’. ‘article’ will not print at all, because the first line of integrate only creates a local variable named hello.

Uh, Oh…Our Search Page Changed

Let’s fast forward our search page to 2013, when Yahoo Search had a major redesign. Here’s the URL. Put this link as a new argument to page_url and try running your code. You should get this:


unable to locate element, using {:class=>"ygbt", :tag_name=>"button"} (Watir::Exception::UnknownObjectException)

With the new design, Yahoo changed its search button, no longer identifying it with ygbt as the class name. The class is now called sbb-sd. To make our code work again, just change the line where you define the button in your class:


button(:search_button, class: 'sbb-sd')

I want you to imagine what the process would be if you were not using the Page Object pattern. What if you had 5 ordinary methods, each using this button for a specific purpose? You’d have to change your code 5 times! Sure, you could implement your own pattern, but see how much more complex it is to do manually. Also, this is using Watir, the simplest possible automation tool using Ruby, one of the languages with the cleanest syntax. Imagine using Selenium and Java to implement your own Page Object pattern.

Here’s the complete code that includes the new argument for page_url and the look_for method:


require 'page-object'
require 'watir-webdriver'
class SearchPage
  include PageObject
  page_url('https://web.archive.org/web/20130307071919/http://search.yahoo.com/')
  def look_for(keyword)
    self.search_box = keyword; sleep 5
    self.search_button
  end
  text_field(:search_box, id: 'yschsp')
  button(:search_button, class: 'sbb-sd')
end
browser = Watir::Browser.new :chrome
search_page = SearchPage.new(browser, true)
search_page.look_for('I love SitePoint')

Tips on Using Page Object

Here are some useful tips that might come in handy as you use this gem:

Accessing the Browser Instance

You can access the browser instance from your PageObject class either via @browser or using its getter attribute browser, for example:


def look_for(keyword)
  self.search_box = keyword; sleep 5
  self.search_button
  puts browser.text
end

Navigating the Browser Without the Browser Instance

The page object itself gives you plenty of methods to go forward, go back, refresh, clear cookies, and so on, so you don’t need to reach for the browser instance. I’ve added a 2-second pause between each action so you can see what’s going on more clearly:


def look_for(keyword)
  self.search_box = keyword
  self.search_button
  back; sleep 2
  forward; sleep 2
  clear_cookies; sleep 2; puts 'Cookies cleared!'
  puts current_url; sleep 2
  puts element_with_focus; sleep 2
  refresh; sleep 2
  save_screenshot('screenshot.jpeg'); sleep 2
end

Handling Alerts

Page Object allows you to skip an alert box, getting only its value. Take this page, for example, and try to click the “Try it now” button. You should get “Hello from JavaScript!”. Now, try this code:


class AlertPage
  include PageObject
  page_url('https://web.archive.org/web/20150618033134/http://javascripter.net/faq/alert.htm')
  def get_alert_text
    message = alert do
      self.try_it_now
    end
    p message
  end
  button(:try_it_now, value: 'Try it now')
end
browser = Watir::Browser.new :chrome
alert_page = AlertPage.new(browser, true)
alert_page.get_alert_text

Notice how the alert box doesn’t appear at all! For doing the same with confirm/prompts, see this article.

Watir has pretty nice methods for dealing with alert boxes & Ajax which I find to be more convenient.

After Initialization

Do not define your own initialize if you want some code to execute when you create your object. PageObject makes use of this method, and things will break if you override it. Instead, use initialize_page. Try adding this method to your SearchPage class to see what I mean:


def initialize_page
  puts 'I am being initialized!'
end
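The reason behind this advice can be sketched in plain Ruby. HookedPage below is a made-up module that owns the constructor and calls an optional hook, which is roughly the shape of the arrangement described above; it is not the gem’s actual code.

```ruby
# Made-up module that owns the constructor and offers a hook instead.
module HookedPage
  attr_reader :browser

  def initialize(browser)
    @browser = browser
    # Call the optional hook only if the including class defined it.
    initialize_page if respond_to?(:initialize_page)
  end
end

class GoodPage
  include HookedPage

  # The hook runs after the module has finished its own setup.
  def initialize_page
    @ready = true
  end

  def ready?
    @ready == true
  end
end

class BrokenPage
  include HookedPage

  # Overriding initialize without calling super skips the module's setup.
  def initialize(browser)
    @ready = true
  end
end

puts GoodPage.new(:fake_browser).browser.inspect
puts BrokenPage.new(:fake_browser).browser.inspect
```

In this sketch, GoodPage keeps its browser reference while BrokenPage silently loses it, which is the kind of breakage the hook is meant to prevent.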

How to Learn More

Here are more useful resources where you can learn about Page Object:

The gem’s Wiki
The author’s blog (the comments also contain a lot of useful material)
This article :) Hope you found it useful!

Give the Page Object pattern a try the next time you need to use a web page in your Ruby code. I think it will save you much time and aggravation.

Frequently Asked Questions (FAQs) about the Page Object Model

What is the main advantage of using the Page Object Model in web development?

The Page Object Model (POM) is a design pattern that makes your test code easier to maintain and reduces duplication. It does this by encapsulating information about the elements on your application’s user interface page into an object within your test script. This means that if the UI changes, the fix need only be applied in one place. This makes your test code cleaner and easier to understand.

How does the Page Object Model differ from other design patterns?

Unlike other design patterns, the Page Object Model focuses on the UI. It provides a simple API to the test cases, hiding the details of the UI structure and the locators used. This makes the test cases easier to write and maintain, as they don’t need to know about the UI’s structure.

Can the Page Object Model be used with any testing framework?

Yes, the Page Object Model is a design pattern, not a framework. It can be used with any testing framework that supports object-oriented programming, such as Selenium, JUnit, TestNG, etc.

How does the Page Object Model improve the readability of test cases?

By encapsulating the UI details in the page objects, the test cases become more focused on the workflow of the application under test. This makes them easier to read and understand, as they are not cluttered with UI details.

What is the role of the Page Factory in the Page Object Model?

The Page Factory is a class provided by Selenium that supports the Page Object Model. It provides methods to initialize the elements of the page object and provides a way to access the page’s elements without exposing the underlying implementation details.

How does the Page Object Model support parallel testing?

Since the Page Object Model encapsulates the UI details in page objects, these objects can be used by multiple test cases at the same time. This makes it easier to write and run tests in parallel, improving the efficiency of your testing process.

Can the Page Object Model be used for mobile application testing?

Yes, the Page Object Model can be used for mobile application testing. It can be used with any testing framework that supports object-oriented programming, including Appium for mobile applications.

How does the Page Object Model handle dynamic elements?

The Page Object Model can handle dynamic elements by using dynamic locators. These locators can be parameterized to find elements based on their changing attributes.
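As a small illustration of a parameterized locator, the sketch below builds the locator from an argument at call time. ResultsPage, its method names and the selector strings are all hypothetical and do not refer to any real page.

```ruby
# Hypothetical page class whose locators depend on runtime values.
class ResultsPage
  # Returns a locator for the nth search result (1-based).
  def result_locator(position)
    { css: "ol.results li:nth-child(#{position}) a" }
  end

  # Returns a locator for a table row whose data-id is only known at runtime.
  def row_locator(record_id)
    { xpath: "//tr[@data-id='#{record_id}']" }
  end
end

results_page = ResultsPage.new
puts results_page.result_locator(3)[:css]
puts results_page.row_locator(42)[:xpath]
```

A page object method can then pass such a locator to whatever driver it wraps, so the changing part of the selector stays in one place.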

How does the Page Object Model support data-driven testing?

The Page Object Model supports data-driven testing by separating the test data from the test logic. The test data can be stored in external sources like Excel, CSV, or a database, and can be fed into the test cases using the page objects.

How does the Page Object Model improve the maintainability of test cases?

By encapsulating the UI details in the page objects, any changes in the UI need only be made in the page objects. This makes the test cases more robust and easier to maintain, as they are not affected by changes in the UI.