Scaling Your Web Application: VPS vs PaaS

Andy Hawthorne

You have deployed your Rails application to your VPS, or to Heroku. Traffic starts to arrive as interest in your application builds. Next, it goes viral. Then your application crashes…

Much has been said about the scalability, or otherwise, of Rails. Conventional wisdom suggests that when high traffic brings an application down, the culprit is usually a lack of hardware resources rather than the web application itself. So, what options are there to scale your app on a VPS? And what about Heroku? Let’s see…

First, What Can Rails Do To Help?

Best advice suggests that we should turn on page caching for our Rails apps wherever we can. We have an article here on Rubysource that provides plenty of information about HTTP caching.

The bottom line is that page caching reduces server load by cutting short the normal request/response cycle, which runs like this:

  • Request from the client
  • Apache passes the request to, say, Mongrel
  • Mongrel runs the request through Rails and sends the response to Apache
  • Response is sent to the client

Mongrel can handle around 20-50 requests per second, which is good for 2 million hits a day, or thereabouts. With page caching switched on, Apache can serve cached pages (usually from the public directory of our app) without passing the request on to Mongrel at all.
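As a quick sanity check on those figures, the arithmetic can be sketched in a few lines of Ruby. The 20-50 requests per second range comes from the text above; spreading traffic evenly across 24 hours is a simplifying assumption, and real traffic is far burstier than that.

```ruby
# Back-of-the-envelope conversion from requests per second to hits per day.
# Assumes perfectly even traffic over 24 hours, which real sites never see.
SECONDS_PER_DAY = 24 * 60 * 60

def hits_per_day(requests_per_second)
  requests_per_second * SECONDS_PER_DAY
end

[20, 50].each do |rps|
  puts "#{rps} req/s is about #{hits_per_day(rps)} hits per day"
end
```

At the low end that works out to roughly 1.7 million hits a day, and at the high end to a little over 4.3 million, so "2 million or thereabouts" is a conservative reading of the range.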

There are three caching options that we can use with Rails:

1. Page Caching

With page caching switched on, the web server (Apache or nginx) can avoid the Rails stack entirely. That will only work in some scenarios. It couldn’t be used for pages that need authentication, for example, because every visitor is handed the same static file without any Rails code running.
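To make the mechanism concrete, here is a minimal plain-Ruby sketch of the idea. It is a simulation, not Rails’ own implementation: the `StaticFirstServer` class and the temporary directory standing in for `public/` are hypothetical. In a Rails 3 controller, the feature itself is switched on with a declaration such as `caches_page :index`.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for a web server sitting in front of a Rails app.
class StaticFirstServer
  attr_reader :app_calls

  def initialize(public_dir)
    @public_dir = public_dir
    @app_calls = 0
  end

  # Serve a cached file if one exists; otherwise fall through to the app
  # and write its output to disk so the next request can skip the app.
  def get(path)
    cached_file = File.join(@public_dir, "#{path}.html")
    return File.read(cached_file) if File.exist?(cached_file)

    html = render_with_app(path)
    FileUtils.mkdir_p(File.dirname(cached_file))
    File.write(cached_file, html)
    html
  end

  private

  # Stand-in for the full Rails stack; counted so we can see how often it runs.
  def render_with_app(path)
    @app_calls += 1
    "<html><body>Rendered #{path}</body></html>"
  end
end

Dir.mktmpdir do |public_dir|
  server = StaticFirstServer.new(public_dir)
  2.times { server.get("posts") }
  puts "App handled #{server.app_calls} of 2 requests"
end
```

Only the first request reaches the application; the second is answered straight from the file on disk, which is exactly why no authentication check can run.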

2. Action Caching

Action caching solves the problem mentioned above by sending every request through the Rails stack. Action Pack runs any before filters first, and only then serves the cached copy. That is a powerful feature because it means that authentication can run before a page is served from the cache.
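The ordering is the important part, so here is a rough plain-Ruby sketch of it. The `CachedAction` class, its in-memory hash and the "logged in means a non-nil user" check are all hypothetical simplifications; in a Rails 3 controller you would combine `before_filter` with `caches_action` instead.

```ruby
# Hypothetical stand-in for a controller action with a before filter.
class CachedAction
  attr_reader :render_calls

  def initialize
    @cache = {}
    @render_calls = 0
  end

  # The filter runs on every request; only the rendering is cached.
  def call(path, user)
    return [401, "Please log in"] unless authenticated?(user)

    @cache[path] ||= render(path)
    [200, @cache[path]]
  end

  private

  # Simplified check: any non-nil user counts as logged in.
  def authenticated?(user)
    !user.nil?
  end

  def render(path)
    @render_calls += 1
    "Rendered #{path}"
  end
end

action = CachedAction.new
action.call("/members/news", "alice")  # authenticated, rendered and cached
action.call("/members/news", "bob")    # authenticated, served from the cache
status, _body = action.call("/members/news", nil) # rejected before the cache is read
puts "Rendered #{action.render_calls} time(s); anonymous request got #{status}"
```

The page is rendered once for all logged-in members, while an anonymous request never gets as far as the cache.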

3. Fragment Caching

Fragment Caching allows a section of a view to be enclosed in a cache block, and served out of the cache store for each request.

For example, if part of a view contains a list of blog posts, you might want to cache that list:

[gist id=”3702298″]
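As a rough illustration of what such a cache block does (a sketch of the idea, not the contents of the gist above), the following plain-Ruby script caches a rendered list of posts. The `FragmentCache` class, the `posts/list` key and the sample post titles are hypothetical; in a Rails view the same effect comes from wrapping the markup in the `cache` helper.

```ruby
require "erb"

# Hypothetical fragment store; Rails would use its configured cache store.
class FragmentCache
  attr_reader :misses

  def initialize
    @store = {}
    @misses = 0
  end

  # Return the cached fragment for `key`, building it once on a miss.
  def fetch(key)
    return @store[key] if @store.key?(key)

    @misses += 1
    @store[key] = yield
  end
end

# The fragment of the view that lists the blog posts.
POST_LIST = ERB.new(<<~HTML)
  <ul>
  <% posts.each do |post| %>
    <li><%= post %></li>
  <% end %>
  </ul>
HTML

def render_post_list(cache, posts)
  cache.fetch("posts/list") { POST_LIST.result_with_hash(posts: posts) }
end

cache = FragmentCache.new
posts = ["Scaling Rails", "Caching basics"]
2.times { render_post_list(cache, posts) }
puts "Fragment built #{cache.misses} time(s) for 2 renders"
```

Note that the cached markup keeps being served even if the list of posts changes, so a real application also needs to expire the fragment when a post is added or edited.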

It is also possible to cache multiple fragments by setting the :action_suffix option.

It’s good practice to utilise these options generally, but when we start to investigate scaling options, you’ll see that there is another reason too.

Scaling – Hardware Considerations

There are two aspects to consider when looking at scalability:

  1. Capacity – To achieve effective load balancing, the best method is to increase capacity by adding hardware. As load increases, adding extra hardware will help to distribute the load and maintain stability for end users.
  2. Redundancy – In this context redundancy means that, if a server fails, it doesn’t bring down the whole system. Instead, it just reduces the capacity by the amount that it added in the first place.

This combination of capacity adjustment and redundancy is usually achieved through load balancing.
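A toy round-robin balancer shows how the two ideas fit together. This is a sketch under stated assumptions: the `RoundRobinBalancer` class and the node names are hypothetical, and every node is assumed to contribute the same share of capacity.

```ruby
# Hypothetical round-robin balancer over a pool of equally sized backend nodes.
class RoundRobinBalancer
  def initialize(nodes)
    @nodes = nodes
    @healthy = nodes.dup
    @next_index = 0
  end

  # Take a failed node out of rotation.
  def mark_down(node)
    @healthy.delete(node)
  end

  # Fraction of the original capacity that is still available.
  def capacity
    @healthy.size.to_f / @nodes.size
  end

  # Pick the next healthy node in turn.
  def next_node
    raise "no healthy backends" if @healthy.empty?

    node = @healthy[@next_index % @healthy.size]
    @next_index += 1
    node
  end
end

balancer = RoundRobinBalancer.new(%w[app1 app2 app3])
puts 3.times.map { balancer.next_node }.inspect
balancer.mark_down("app2")
puts 4.times.map { balancer.next_node }.inspect
puts "Remaining capacity: #{(balancer.capacity * 100).round}%"
```

When one of three nodes fails, requests keep flowing to the other two and capacity drops to about two thirds, rather than the whole system going down.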

Scaling an App Deployed on a VPS

So, you have deployed your app on a Virtual Private Server (VPS), something like Linode or Webbynode. Scaling options are available for most VPS hosting services. Linode has a comprehensive set of load balancing options, so let’s see what can be done.

Getting Started

Linode uses what it calls NodeBalancers, which listen on a public IP address for incoming connections. You can apply sets of rules that determine which backend node (there can be more than one) each connection gets sent to. NodeBalancers can also assess the incoming request and make decisions based on what it contains.

There is one complication though. If your app uses sessions, requests from a given client will need to keep going to the same node. Linode’s NodeBalancers solve this problem because they can be configured to ensure the same client lands on the same backend node.
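One common way to get that behaviour is to pick the backend from a hash of the client’s address. The sketch below illustrates the general technique only and makes no claim about how NodeBalancers implement it; the `backend_for` helper and the private IP addresses are made up for illustration.

```ruby
require "zlib"

# Hypothetical backend pool; the addresses are made up for illustration.
BACKENDS = %w[192.168.0.11 192.168.0.12 192.168.0.13].freeze

# Map a client IP to a backend deterministically, so repeat requests from
# the same client land on the same node while the pool is unchanged.
def backend_for(client_ip, backends = BACKENDS)
  backends[Zlib.crc32(client_ip) % backends.size]
end

client = "203.0.113.7"
puts "First request:  #{backend_for(client)}"
puts "Second request: #{backend_for(client)}"
```

The trade-off is that the mapping shifts whenever nodes are added or removed, so some clients will lose their session when the pool changes.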

An Important Caveat

Most VPS hosts will offer some form of scaling, perhaps not to the same depth that Linode does, but it will be there. Be careful of cost though. For example, for NodeBalancers to work you need a static IP address, which adds an admittedly small cost to your monthly bill. The real hit comes from the NodeBalancer itself: it costs an extra $19.95 a month to switch one on, on top of your normal monthly fee.

The Steps For Setting Up a NodeBalancer

Assuming that you have purchased a static IP address, and added a NodeBalancer to your account, you can now configure it. The basic steps are:

  1. You choose a port for the Balancer to listen on. Port 80 is fine to pick up regular web traffic.
  2. In the configuration screen, there are a number of options, such as session stickiness to handle the session problem described earlier. You can also set a check interval which determines the number of seconds between health check probes. These options are clearly explained on the configuration screen.
  3. You then point the backend node to the private IP address for your web server.
  4. After everything has updated, you should be able to go to the NodeBalancer’s IP address and see your web app as before.
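The check interval mentioned in step 2 is easier to picture with a small sketch. This is a hypothetical model rather than Linode’s implementation: the `HealthChecker` class is invented for illustration, and `probe` is any callable that reports whether a backend answered, so the example runs without a network.

```ruby
# Hypothetical health checker: `probe` is any callable that returns true
# when a backend answers, so the sketch runs without a network.
class HealthChecker
  def initialize(backends, probe)
    @backends = backends
    @probe = probe
  end

  # Run one round of probes and return the backends that answered.
  def healthy_backends
    @backends.select { |backend| @probe.call(backend) }
  end

  # Probe repeatedly, sleeping `check_interval` seconds between rounds.
  # The caller's block receives the healthy list and can `break` to stop.
  def watch(check_interval)
    loop do
      yield healthy_backends
      sleep check_interval
    end
  end
end

probe = ->(backend) { backend != "192.168.0.12" } # pretend one node is down
checker = HealthChecker.new(%w[192.168.0.11 192.168.0.12], probe)
puts checker.healthy_backends.inspect
```

A shorter interval spots a failed node sooner at the cost of more probe traffic, which is the trade-off that setting controls.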

That’s the procedure to add one extra backend node. If your app needs more, then you will need to purchase another one and follow the same routine.

You can see that this could get expensive quite quickly, but it is also worth pointing out that you would need to be experiencing some serious traffic to consider buying extra NodeBalancers.

What About Heroku?

Heroku offers some nice scaling options that come down to two choices:

  • Web dynos
  • Worker dynos

Web dynos allow your app to handle more concurrent HTTP requests, while worker dynos allow you to process more background jobs concurrently. Heroku also offers various database solutions to assist with caching of queries. If you are experiencing a slowdown that comes from data-related processing, adding more web dynos could make the problem worse, because more concurrent requests will be hitting the same database. Optimising queries would be a first step to solving that problem.

You can apply scaling options from the command line. For example, to apply web dynos you simply do:

heroku ps:scale web=2 

Or to apply workers as well as web processes:

heroku ps:scale web=2 worker=1 


This gets expensive quite quickly too. For example, to have 2 web dynos and 1 worker, it will cost you $71 per month. Add a 400MB database cache, and that’s a further $50 per month.

Again, that cost might be considered minimal if you have a production web application that is serving its purpose well.


If you have deployed to a VPS, scaling your app is made easy via the manager or admin control panel that VPS services normally provide. It is really a matter of adding the extra resources and paying for them. The configuration options are well explained (certainly in Linode’s case), so increasing resources is not the technical issue it once was.

Cloud-based services such as Heroku and Webbynode also make it easy to scale up (or down), based on what your traffic monitoring is telling you. So the only challenge now is to drive that traffic in!
