Introduction to Event Machine
Event Machine is an awesome library that brings event-based, non-blocking IO to Ruby.
Instead of saying “write to this file” and then waiting for that procedure to end, you say “write to this file, and call this function when you’re done”, and your program moves right along. The advantage is that there is no waiting involved, and concurrency becomes much easier.
Event Machine runs a single-threaded event loop (the reactor) that watches your sockets and timers and calls your code whenever something happens on one of them. It also keeps a thread pool on the side for work that really has to block, which you reach through EM.defer. Between the two, Event Machine makes it very easy to develop scalable, concurrent applications.
Basically, in very simplified (maybe too simplified) terms, it is a bit like a Node.js for Ruby.
With a bit of learning, Event Machine is an incredibly powerful tool to develop heavy-duty (e.g. real-time) services.
Anywhere you need a server that handles many connections at once, writes to the filesystem and moves on, or kicks off work at set times, Event Machine works excellently.
So, let’s get started with Event Machine.
As we all know, Ruby has the wonderful RubyGems, which makes everything incredibly simple. Just plug this into a terminal: gem install eventmachine
If it chokes on native extensions, make sure you have build tools (e.g. gcc) installed, and read the error message to check whether you’re missing any required libraries.
The humble echo server (a server that sends back the exact same text you send it) is a bit like the “Hello, world!” of socket code – it gets you up to speed with the setup.
Here it is with Event Machine:
That looks kind of scary, but, once broken down, it is quite digestible. We’re creating a class Echo, which is a child of EM::Connection. Then, we override the method receive_data, which is a callback method. Callback methods are invoked when something happens; in this case, receive_data is called when the server receives data.
receive_data is passed the data as an argument, and we then use the method send_data (part of EM::Connection) to send that data back to the client over the same connection.
Then, we use the EM.run method. This starts Event Machine, which is basically a loop that fires events. It takes a block, which is run once the loop has started.
EM.start_server starts an Event Machine server, using the Echo class to handle each incoming connection.
To test it out, we can connect to localhost on port 1337 with any plain TCP client, for example telnet localhost 1337.
Type something in, hit enter, and, you should see it echo right back at you.
Now, much more interestingly, open up two terminals. Type in the same command as you did above (i.e. connect to the echo server) in each. Type something in on both connections, and you should get a response on both.
The server can handle multiple connections at once!
Because the reactor juggles every connection inside one event loop, Event Machine has magically made everything concurrent, without you writing a single thread.
If you’ve written threading code, you know how difficult that used to be.
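There are many other methods like receive_data to override (each of which is fired when a certain event occurs), the main ones being:
connection_completed – called once an outgoing connection (one opened with EM.connect) has been established
post_init – called as soon as the connection object has been set up, before any data is received; for an outgoing connection this can be before the connection has actually completed
unbind – called after the connection has been closed, by either side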
They aren’t very difficult to use; in that regard, the Event Machine documentation is awesome. What can be rather difficult, however, is understanding how the reactor (i.e. event-based) model works, which is where most mistakes happen.
You should never, ever have blocking code in event callbacks.
Blocking code is any call that might not return immediately; opening and reading a file is a good example.
If you put such a call in a callback, you’re defeating the purpose of an event-based system (often called the reactor pattern), because the whole program waits for it to finish. This is precisely what we are trying to avoid by using the reactor pattern.
Usually, for such tasks, Event Machine has its own, non-blocking version that hooks into the Event Machine runtime and fires an event once it’s done.
Event Machine also handles timers. You can assign callbacks to set time frames.
Here’s an example:
This time, we don’t even need a class, because we don’t have a connection. The code is quite self-explanatory: Add a periodic timer to the event loop and it runs a block every time the event is fired. The block simply prints “time elapsed” to the screen.
Running this should give an output of “time elapsed” every second.
But, what if we wanted to stop this after ten seconds? Here’s the code:
This time around, we’re using a new facility called add_timer. Instead of periodically firing events for time elapses, it only fires the event once, i.e. when the first 10 seconds elapse.
Say you want to do something in the background of EM; something that doesn’t affect the server’s clients immediately.
For that, you can use a defer. With a defer, you can push off a task into the background, so that it won’t affect the running of Event Machine.
An example will clear it up:
(Note: you might have to change the /var/log/kernel.log path depending on your system. The path just needs to point to some large file.)
IO.readlines is a blocking call, i.e. we would have to wait for it to end before moving on. But, using EM.defer, we are able to keep our periodic timer just the way it should be.
Spawned processes are a concurrency mechanism just like defers, but they have some special qualities.
A defer runs as soon as we call EM.defer with a block; a spawned process, on the other hand, hangs around and waits for a message before it executes.
So, the spawned process executes when it is notified with a message.
Think about this for a second. We’re working with a complete concurrency toolkit here: passing messages, handling timers, running work in background threads and handling multiple connections, and it’s this simple. For example, the spawned process example would have taken days of work in C – Event Machine, I applaud you.
I hope that you enjoyed learning about Event Machine and that the knowledge will come in handy.
I’d love to hear your thoughts about the article in the comment section below.