Deepstream: an Open-source Server for Building Realtime Apps

Wolfram Hempel
Open Source Week

It’s Open Source Week at SitePoint! All week we’re publishing articles focused on everything Open Source, Free Software and Community, so keep checking the OSW tag for the latest updates.

Elton, the deepstream mascot

Realtime apps are getting really popular, but they’re also hard to build. Wolfram Hempel introduces deepstream, an open-source server he co-founded to make data-sync, request-response and publish-subscribe a whole lot easier.

The Rise of Realtime Apps

Realtime is eating the world! Or at least it’s taking bigger and bigger bites. Whether it’s collaborative editing in Google Docs, chatting via Facebook Messenger, financial trading on the move, IoT controls, live dashboards or multiplayer gaming, users increasingly expect to see changes as they happen.

Even traditionally static sites like social networks or forums are starting to abandon the refresh button and instead stream updates directly into your feed.

But as popular as realtime apps are, they’re also hard to build. While it’s possible for smaller projects or POCs to introduce realtime features just by adding a pinch of Socket.io, large-scale use cases require a fundamentally different architecture. Concepts like concurrent connections, failover, streaming data-consistency, persistence, encryption and permissioning all have to be woven into the fabric that powers this new generation of apps.

One industry where I learned this only too well is investment banking. The servers that power the myriad of flashing screens on the world’s trading floors are monolithic beasts, complicated and breathtakingly expensive. But they are fast. Very fast. And they’ve got something else right: they use a concept called “data-sync”.

Realtime Concepts

If you’ve already built a realtime app, chances are you’ve used a pattern called “publish-subscribe” or “pub-sub” for short: subscribers listen for events on a channel and others publish these events. It’s an efficient mechanism for many-to-many communication that’s supported by a wide range of technologies and services such as Socket.io or SocketCluster on the open source side, or Pusher, PubNub or Ably in the PaaS space.

But there’s one crucial thing that pub-sub can’t provide: state. Pretty much every app has some state — data that needs to be created, read, updated and deleted, but pub-sub only delivers one-off messages that vanish immediately. A common workaround is the use of events as update notifications which in turn prompt the client to retrieve the latest state via traditional request-response. But that’s complicated, prone to inconsistencies and, most importantly, slow.

This has led to the increasing move towards data-sync, an approach that combines statefulness with realtime updates. Data is persistent as well as kept in sync between connected clients and backend processes.
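To make the difference concrete, here is a minimal in-memory sketch in plain JavaScript. It doesn’t use deepstream at all: EventBus and SyncedValue are hypothetical names made up for this illustration. A late subscriber to the event bus misses everything that was published earlier, while a late subscriber to the synced value receives the current state straight away.

```javascript
// Illustrative only: hypothetical classes, not part of deepstream.

// Pub-sub: messages go to whoever is listening right now, then vanish.
class EventBus {
  constructor() {
    this.listeners = new Map()
  }
  subscribe(name, callback) {
    if (!this.listeners.has(name)) this.listeners.set(name, [])
    this.listeners.get(name).push(callback)
  }
  emit(name, data) {
    for (const callback of this.listeners.get(name) || []) callback(data)
  }
}

// Data-sync: the latest value is kept, so late subscribers still get it.
class SyncedValue {
  constructor() {
    this.value = undefined
    this.callbacks = []
  }
  set(value) {
    this.value = value
    for (const callback of this.callbacks) callback(value)
  }
  subscribe(callback) {
    this.callbacks.push(callback)
    if (this.value !== undefined) callback(this.value)
  }
}

const bus = new EventBus()
const busSeen = []
bus.emit('price', 10)                         // nobody is listening yet
bus.subscribe('price', p => busSeen.push(p))
bus.emit('price', 11)

const price = new SyncedValue()
const syncSeen = []
price.set(10)                                 // stored, even without subscribers
price.subscribe(p => syncSeen.push(p))        // receives the current state first
price.set(11)

console.log(busSeen)   // [ 11 ]
console.log(syncSeen)  // [ 10, 11 ]
```

A real data-sync server adds persistence, networking and conflict handling on top of this idea, none of which the sketch attempts.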

Technologies that support this are much rarer. On the open-source side there used to be the now discontinued horizon.io; in the PaaS space there’s Google’s Firebase.

To fill this gap, we started deepstream.io. Our aim was to create an open-source server with the same performance and versatility that financial trading or multiplayer gaming systems deliver, but in an open and extendable way that makes it easy to use for any kind of app.

Deepstream.io

Deepstream is a new type of server that handles realtime data at scale. It’s installed much like a database or an HTTP server. End users and backend services connect to it via lightweight SDKs that come in a range of different programming languages, such as JS/Node, Java/Android, or Swift/ObjC.

Deepstream Mascot Elton typing

It provides data-sync as well as pub-sub and classic request-response and caters for a wide range of functional requirements such as failover, permissioning, encryption, consistency and conflict resolution.

Deepstream is designed to thrive in open-source ecosystems, and comes with a range of connectors for popular databases, caches or message buses.

But most importantly: it’s scalable, reliable and very fast.

Using Deepstream

All of this sounds well and good, but how does it actually work? Let’s touch on the main points quickly.

Installation

Deepstream comes as a Mac or Windows executable, as a yum or apt package, or as a Docker image, all of which can be found on the install page.

various deepstream distributions

Configuration

Every aspect of the server can be configured in a file called config.yml, located either in /etc/deepstream/conf/ on Linux or in the conf directory on Windows or Mac.

Starting the server

The server is started either by running deepstream start on the command line or by double clicking the executable.

Getting a client SDK

Connecting to deepstream requires an SDK for the given programming language. For browsers and Node, for example, this can be installed via npm install deepstream.io-client-js.

Connecting to the server

The simplest way to connect is by calling var ds = deepstream('localhost:6020').login(). The examples that follow use this ds variable. There’s a lot more to add to this line, such as passing client options, authentication parameters or waiting for a callback after login, but let’s just leave it at that.
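For the curious, here is one way waiting for that login callback might look. Treat it as a sketch under assumptions rather than the canonical approach: connect is a hypothetical helper, the credentials object is a placeholder, and the (success, data) callback shape is assumed from the JS client. The SDK’s factory function is passed in as an argument, so the helper itself has no hard dependency on the package.

```javascript
// Sketch only: wraps the login call in a Promise.
// `createClient` is whatever the SDK exports, for example
// require('deepstream.io-client-js').
function connect(createClient, url, credentials) {
  return new Promise((resolve, reject) => {
    const client = createClient(url)
    // Assumed callback shape: (success, data)
    client.login(credentials, (success, data) => {
      if (success) resolve(client)
      else reject(new Error('deepstream login failed'))
    })
  })
}

// Possible usage (the username is a placeholder):
// connect(require('deepstream.io-client-js'), 'localhost:6020', { username: 'jane' })
//   .then(ds => { /* start using ds.record, ds.event, ds.rpc */ })
```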

Using data-sync

Deepstream’s data-sync uses a concept called “records” — JSON documents that can be manipulated and observed and are synced across all connected clients as well as persisted on the backend. This sounds more complicated than it is. Records are identified by a unique name and created or loaded on the fly:

// ds is the connected deepstream client
const pizzaGuy = ds.record.getRecord( 'driver/14' )

The value of a record can be set like so:

pizzaGuy.set({
  name: 'John Doe',
  position: { x: 4234, y: 2454 },
  speed: 22
})

… or partially, like so:

pizzaGuy.set( 'position.x', 4244 )

Similarly, other clients can subscribe to changes to the entire record:

pizzaGuy.subscribe(( data )=>{
  //...
})

… or to a path within it:

pizzaGuy.subscribe( 'position', updateMapMarker )
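If you want a feel for how path-based updates and subscriptions fit together, here is a small local stand-in written in plain JavaScript. It has no networking and no persistence, and LocalRecord is a hypothetical name, so read it as a sketch of the idea rather than of deepstream’s internals.

```javascript
// Illustrative only: a local stand-in for a record.
class LocalRecord {
  constructor() {
    this.data = {}
    this.subscriptions = []   // entries of the form { path, callback }
  }

  // Read the whole document, or the value at a dotted path like 'position.x'.
  get(path) {
    if (path === undefined) return this.data
    return path.split('.').reduce(
      (node, key) => (node == null ? undefined : node[key]),
      this.data
    )
  }

  // set(value) replaces the whole document; set(path, value) updates one field.
  set(pathOrValue, value) {
    const before = this.subscriptions.map(s => JSON.stringify(this.get(s.path)))
    if (arguments.length === 1) {
      this.data = pathOrValue
    } else {
      const keys = pathOrValue.split('.')
      const last = keys.pop()
      let node = this.data
      for (const key of keys) {
        if (typeof node[key] !== 'object' || node[key] === null) node[key] = {}
        node = node[key]
      }
      node[last] = value
    }
    // Notify only the subscribers whose part of the document changed.
    this.subscriptions.forEach((s, i) => {
      const after = this.get(s.path)
      if (JSON.stringify(after) !== before[i]) s.callback(after)
    })
  }

  // subscribe(callback) watches everything; subscribe(path, callback) one path.
  subscribe(pathOrCallback, callback) {
    if (typeof pathOrCallback === 'function') {
      this.subscriptions.push({ path: undefined, callback: pathOrCallback })
    } else {
      this.subscriptions.push({ path: pathOrCallback, callback })
    }
  }
}

const driver = new LocalRecord()
const positions = []
const speeds = []
driver.subscribe('position', p => positions.push(p.x))
driver.subscribe('speed', s => speeds.push(s))

driver.set({ name: 'John Doe', position: { x: 4234, y: 2454 }, speed: 22 })
driver.set('position.x', 4244)   // only the 'position' subscriber fires

console.log(positions)  // [ 4234, 4244 ]
console.log(speeds)     // [ 22 ]
```

The second set call changes nothing under speed, so that subscriber stays quiet, which is the point of subscribing to a path.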

Using events

Events are deepstream’s pub-sub mechanism. They provide ephemeral many-to-many messaging. Every client can subscribe to an event:

ds.event.subscribe( 'something-happened', data => {})

… or emit it:

ds.event.emit( 'something-happened', { size: 'big' })

Using RPCs

Remote procedure calls are deepstream’s mechanism for request-response communication. Deepstream routes requests between provider and requestor and manages failover, retries and data serialisation.

Processes can register themselves as RPC providers:

// the provider callback receives the request data and a response object
ds.rpc.provide( 'add-two', ( input, response ) => {
  response.send( input + 2 )
})

… and request them:

ds.rpc.make( 'add-two', 5, ( err, result ) => { /* result = 7 */})
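The routing and failover part is easier to picture with a toy version. The sketch below is an in-memory stand-in, not deepstream’s implementation: RpcRouter is a hypothetical class, everything runs in one process, and the 'NO_RPC_PROVIDER' error string is only illustrative. It shows a request being passed to the next provider when the first one declines it.

```javascript
// Illustrative only: hypothetical in-memory router.
class RpcRouter {
  constructor() {
    this.providers = new Map()   // rpc name -> array of handler functions
  }

  provide(name, handler) {
    if (!this.providers.has(name)) this.providers.set(name, [])
    this.providers.get(name).push(handler)
  }

  make(name, data, callback) {
    const candidates = [...(this.providers.get(name) || [])]
    const tryNext = () => {
      const handler = candidates.shift()
      if (!handler) return callback('NO_RPC_PROVIDER', null)
      handler(data, {
        send: result => callback(null, result),
        reject: tryNext   // this provider declines, so try the next one
      })
    }
    tryNext()
  }
}

const router = new RpcRouter()
router.provide('add-two', (data, response) => response.reject())
router.provide('add-two', (data, response) => response.send(data + 2))

let answer
router.make('add-two', 5, (err, result) => { answer = result })
console.log(answer)  // 7
```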

Authentication and permissioning

Deepstream offers a range of different strategies to authenticate incoming connections, such as via config files or HTTP webhooks. Once a connection is authenticated, every incoming request is checked against permission rules written in a realtime permission language called Valve.

record:
  #an auctioned item
  auction/item/$sellerId/$itemId:

    #everyone can see the item and its price
    read: true

    #only users with canBid flag in their authData can bid
    #and bids can only be higher than the current price
    write: "user.data.canBid && data.price > oldData.price"

    #only the seller can delete the item
    delete: "user.id == $sellerId"
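To see what these three rules allow and deny, here they are restated as plain JavaScript functions. This is only an illustration: Valve evaluates the rule strings on the server, and the auctionRules object, the users and the prices below are all made up for the example.

```javascript
// Plain-JavaScript restatement of the three rules, for illustration only.
const auctionRules = {
  // everyone can read
  read: () => true,
  // bidders only, and only with a higher price
  write: (user, data, oldData) =>
    Boolean(user.data.canBid) && data.price > oldData.price,
  // only the seller named in the record path can delete
  delete: (user, sellerId) => user.id === sellerId
}

const bidder = { id: 'anna', data: { canBid: true } }
const visitor = { id: 'bob', data: {} }

console.log(auctionRules.write(bidder, { price: 120 }, { price: 100 }))   // true
console.log(auctionRules.write(bidder, { price: 90 }, { price: 100 }))    // false
console.log(auctionRules.write(visitor, { price: 120 }, { price: 100 }))  // false
console.log(auctionRules.delete(bidder, 'anna'))                          // true
```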

Adding connectors

It’s easy to add databases such as Mongo, Rethink or Postgres, caches like Redis, Memcached or Hazelcast, or messaging systems such as RabbitMQ or Kafka to deepstream using connectors. All connectors are installed via the command line, for example by running the following:

deepstream install cache redis

Putting It All Together

To summarize, deepstream is a universal, scalable and performant realtime server that’s usable as a backend for use cases ranging from CRUD applications to demanding messaging apps, realtime dashboards or even multiplayer games. It’s robust, secure and provides all the features necessary to run large-scale realtime apps in production.