Rack for Middlewares
In the previous article we thinly disguised a Rack tutorial as a comparison of PHP and Ruby. The aim of the article was to point out that most PHP developers start out on a pretty well thought out learning trail. We head to “hello world”, breeze by some basic HTML interlaced with PHP, ascend to separating views and business logic then end up at using a suitable community driven framework. However, when it comes to learning Ruby we forget all that, grab Rails and start learning.
It says a lot about Rails (and/or Ruby) that so many developers pick up the language along the way. A lot of great developers have spent hours abstracting the real grunt work away in Rails, and as amazing as that is, we sometimes need to get our hands dirty before we can cement our understanding.
I made the point that Ruby, on its own, can be painful when developing for the web. The consequence is that we bypass the good learning curve we have with PHP. Unlike PHP, we cannot just lace HTML with ERB, put it on a web server and expect it to work. However, we can use Rack and achieve some of the immediacy we get with PHP scripts while abstracting away enough of the HTTP protocol to keep us sane.
More Than Mongrels and Unicorns
The previous article concluded by mentioning the mystical phrase “middleware”. So, just what is a middleware? There is a great explanation on Stack Overflow, but in this context we are going to use the 50% correct explanation. It’s a piece of code that sits between the server and application allowing us to filter certain responses and requests.
To give you a bit more understanding, Rails, Sinatra, Merb etc. are all built on Rack. Rack, apart from giving us a generic interface for our server (Thin, Mongrel, Unicorn etc.), is used to create a stack-like structure of components that forms the final application. We can inject mini Rack applications into this stack to perform desired tasks.
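To make the stack idea concrete, here is a minimal sketch in plain Ruby that does not need the rack gem. The names Shout and PathLogger are made up for illustration, and the body is assumed to be a simple Array of strings. Each layer wraps the next one and decides what to do with the request on the way in and the response on the way out.

```ruby
# A stand-in for the final application: any object that responds to #call
# and returns [status, headers, body] follows the Rack convention.
app = lambda do |env|
  [200, { "Content-Type" => "text/plain" }, ["hello"]]
end

# Hypothetical middleware: upcases every chunk of the body on the way out.
class Shout
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers, body.map(&:upcase)]
  end
end

# Hypothetical middleware: records the requested path on the way in.
class PathLogger
  attr_reader :seen

  def initialize(app)
    @app = app
    @seen = []
  end

  def call(env)
    @seen << env["PATH_INFO"]
    @app.call(env)
  end
end

# Stack them by hand: the request passes PathLogger, then Shout, then app.
logger = PathLogger.new(Shout.new(app))
status, headers, body = logger.call({ "PATH_INFO" => "/gists" })
```

The outermost object is what the server talks to, so the order of wrapping is the order in which the request travels down the stack.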
Get the Gist?
We are going to build upon this knowledge to create our first Rack middleware. The aim of the middleware is to capture responses from our application that contain GitHub Gist URLs and automatically convert the plain text of each URL to a nicely formatted, embedded gist, as eloquently described in this GitHub blog post.
To get started, we will create a skeletal Rack application. As you will no doubt remember, we need a call method that accepts the environment hash, and it seems sensible to capture whatever application is using our middleware.
module Rack
  class Gist
    def initialize(app)
      @app = app
    end

    def call(env)
      status, headers, response = @app.call(env)
      [status, headers, response]
    end
  end
end
As you can see, I have namespaced the Gist application with a Rack module. This lets us call it via Rack::Gist and, of course, finalizes the name of this app. So rename the parent directory from super_awesome_ninja_rack_middleware to rack_gist now!
Like all Rack applications, our app responds to a call method. At the moment, this method simply captures the status, headers and body. As we are dealing with a response in this application, body has been renamed to response for clarity.
We only want to parse HTML responses, so let’s make sure of that by examining the headers.
module Rack
  class Gist
    def initialize(app)
      @app = app
    end

    def call(env)
      status, @headers, response = @app.call(env)
      if html?
        # do something
      end
      [status, @headers, response]
    end

    private

    def html?
      @headers["Content-Type"].include? "text/html"
    end
  end
end
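The html? check above assumes a "Content-Type" header is always present and spelled with that exact casing. As a hedged alternative, the standalone helper below (html_response? is a hypothetical name, not part of the middleware) looks the header up case-insensitively and treats a missing header as "not HTML" instead of raising.

```ruby
# Hypothetical helper: true when the headers describe an HTML response.
# Finds the header regardless of casing and tolerates a missing value.
def html_response?(headers)
  _name, value = headers.find { |name, _value| name.to_s.downcase == "content-type" }
  value.to_s.include?("text/html")
end

html_response?({ "Content-Type" => "text/html; charset=utf-8" }) # true
html_response?({ "content-type" => "text/html" })                 # true
html_response?({ "Content-Type" => "application/json" })          # false
html_response?({})                                                # false
```

Whether this extra care is needed depends on the applications the middleware will wrap; for an app that always sets the header, the simpler check does the job.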
Now, we want to check through the response for any gists. Since it’s a pretty straightforward task, we will use a regular expression.
module Rack
  class Gist
    def initialize(app)
      @app = app
    end

    def call(env)
      status, @headers, response = @app.call(env)
      if html?
        parsed_response = ""
        # Append every chunk so a multi-part body is not truncated to its last part
        response.each do |r|
          parsed_response += r.gsub(/(https:\/\/|)gist\.github\.com\/(\d+)/) do |gist|
            gist = "https://" + gist unless gist.start_with? "https://"
            "<script src=\"#{gist}.js\"></script>"
          end
        end
        response = [parsed_response]
      end
      [status, @headers, response]
    end

    private

    def html?
      @headers["Content-Type"].include? "text/html"
    end
  end
end
We are simply looking for the gist.github.com URL, adding the https:// protocol as required, then wrapping the gist URL in the appropriate script tags before re-assigning the response.
Done. Well, not quite.
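The substitution can be tried on its own, outside any middleware. The sketch below pulls it into a hypothetical helper called embed_gists and a constant called GIST_URL (both names are assumptions for illustration). It only matches numeric gist ids, as in the article, so gist URLs that carry a user name or a non-numeric id would pass through untouched.

```ruby
# An optional "https://" prefix, the gist host, a slash, then a numeric id.
GIST_URL = %r{(?:https://)?gist\.github\.com/\d+}

# Hypothetical helper that swaps each gist URL for its embed script tag.
def embed_gists(html)
  html.gsub(GIST_URL) do |gist|
    gist = "https://" + gist unless gist.start_with?("https://")
    %(<script src="#{gist}.js"></script>)
  end
end

bare     = embed_gists("<p>gist.github.com/12345</p>")
prefixed = embed_gists("<p>https://gist.github.com/12345</p>")
# Both yield: <p><script src="https://gist.github.com/12345.js"></script></p>

untouched = embed_gists("<p>No gists here</p>")
# Text without a gist URL comes back unchanged.
```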
Where Did it All Go Wrong?
So we got pretty far without having to know much about which app will use this middleware or about the HTTP protocol and, of course, we have not written a single test.
I am not talking about unit tests. Unit tests alone may not capture the fault I have in mind if we know little about HTTP or the Rack specification.
Let’s think about the behaviour of our middleware: it is capturing the response, possibly modifying it, and passing it back in the normal flow. Looking at the Rack specification, we see that the header should have a content length. This is to satisfy the HTTP specification, which expects the header to contain the correct content length. Obviously, if the middleware modifies the response, the content length will now be incorrect.
Rack ships with the Rack::Lint module, allowing us to verify that our application meets the contract in question while in development. Sure enough, if we add tests to the middleware, we receive the following error.
Rack::Lint::LintError:
Content-Length header was 81, but should be 108
This is a cheap fix. Rack is great at providing little, but enough, of everything we need. This time it is Rack::Utils. We can include the Utils module, then call bytesize and set the header to the new length.
include Rack::Utils
#
# ...
#
@headers['Content-Length'] &&= bytesize(parsed_response).to_s
The method bytesize is nothing special. Check the source and you will find it simply returns the size in bytes of the string passed to it, literally string.bytesize (or string.size on Rubies that lack it). But I like to keep all things in Rack, and bytesize is specific to our domain, so let’s not just use parsed_response.size.
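To see why the header needs updating, and why bytes rather than characters are what count, here is a small sketch. It calls String#bytesize directly instead of going through Rack::Utils, so it runs without the rack gem (newer Rack releases may not ship that helper at all). The sample strings and header hashes are made up for illustration.

```ruby
original  = "<p>gist.github.com/12345</p>"
rewritten = %(<p><script src="https://gist.github.com/12345.js"></script></p>)

headers = {
  "Content-Type"   => "text/html",
  "Content-Length" => original.bytesize.to_s
}

# The body grew, so the old Content-Length is now stale.
stale = headers["Content-Length"].to_i != rewritten.bytesize

# &&= only assigns when the header is already present, so we never
# add a Content-Length to a response that did not have one.
headers["Content-Length"] &&= rewritten.bytesize.to_s

no_length = { "Content-Type" => "text/html" }
no_length["Content-Length"] &&= rewritten.bytesize.to_s

# Characters and bytes differ as soon as the text is not plain ASCII.
word = "caf\u00e9"
chars = word.size      # number of characters
bytes = word.bytesize  # number of bytes in UTF-8
```

Content-Length counts bytes on the wire, which is why a character count would be wrong for any response containing multibyte characters.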
Another useful tool in the bat (“Holy Rack, Batman!”) belt is the Rack Test gem. This compact little gem provides helpers for things like setting request headers for our tests, following redirects, and managing sessions. I find it helps me focus more on the test than the setup.
In the Wild
We have our middleware all ready to roll out. How do we do that, exactly? With Rails, as of v3, Rack middleware has become part of the core, making injecting a middleware super simple. If we want to apply it to the response of a single controller, we pop the source in our vendor or lib directories and use it in that controller.
class GistController < ApplicationController
  use Rack::Gist

  def index
    # views with embedded gists
  end
end
You should be familiar with the use method here; that is plain old Rack in action.
Of course, if our middleware is applicable to the entire application, we can add it in the config section of config/application.rb.
module GistTest
  class Application < Rails::Application
    # other config
    config.middleware.use Rack::Gist
  end
end
Rack::Gist is now part of our middleware stack and runs on each request. Voila.
One other thing I like to do is package my middlewares as gems (well, they are supposed to be re-usable). The source, plus tests, can be found on the usual suspect.
Where to Next?
So far, we have talked about Rack and how it forms the foundations for the main Ruby frameworks we know and love. It’s more than just a server interface for our applications. We can harness its capabilities to add low level functionality in a far simpler, more portable way than any helper could.
I wanted to show a middleware that actually gave us something other than request stats. We could take it further and handle cases where the gist is in an anchor tag, using gems like nokogiri or hpricot. I suppose there is even the case to just do all this in JavaScript.
Regardless, what we accomplished is bread and butter for most useful Rack middlewares. A great, inspirational resource for your next Rack app is CodeRack. Pick some projects there and examine the source.
Hopefully, in the future, before you go off adding complexity to your application for certain responses (ReCaptcha springs to mind), you will think about how you could push the problem down the chain into a middleware. Most of the time it just makes more sense.