Using OrientDB with JRuby

Glenn Goodrich
Ruby Editor

This is my second post in the series about OrientDB and Ruby. The first post did not really cover anything Rubyish, which is what this article will remedy.

When I found OrientDB, I found the Language Bindings page and was encouraged that there were 3 Ruby entries in that table: OrientDB-JRuby, OrientDB Client, and OrientDB4R. At the time, none of them had been updated recently.

OrientDB4R has since been updated to OrientDB 1.5, but it only uses the REST API. We want a JRuby solution for performance reasons, so the REST API is not really an option.

OrientDB Client uses OrientDB 1.0.0.rc2 and hasn’t been updated for a year.

OrientDB-JRuby (by Adrian Madrid) sat at OrientDB 1.3.0 (OrientDB is currently about to release 1.6.0) and had not been updated in months. It is a simple JRuby wrapper around the OrientDB Java API. If you haven’t ever used JRuby to wrap a Java library (I hadn’t), then it’s as simple as requiring the jars and BOOM!, you can call the methods in Javaland from Ruby. The GitHub wiki has a great introduction on just how simple it is.
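To make that concrete, here is a minimal sketch that is not taken from the gem: it calls java.util.ArrayList, a class from the Java standard library, straight from Ruby. The jruby? guard and the java_names helper are hypothetical names for this illustration, and on any interpreter other than JRuby the helper just returns nil.

```ruby
# True only when running on JRuby, where Java classes are reachable.
def jruby?
  RUBY_PLATFORM == 'java'
end

# Builds a java.util.ArrayList and converts it back to a Ruby array.
# Returns nil on non-JRuby interpreters, where Java is not available.
def java_names
  return nil unless jruby?

  require 'java'
  list = java.util.ArrayList.new # a plain Java object, created from Ruby
  list.add('Bob')
  list.add('Mary')
  list.to_a                      # Java collections behave like Ruby enumerables
end

names = java_names
puts names ? names.inspect : 'Not running on JRuby, nothing to show.'
```

Wrapping a third-party library follows the same pattern, except that you require its jar files first so the classes are on the load path.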

Upgrading OrientDB-JRuby to 1.5.0 (we actually moved it to 1.5.1 yesterday :)) was as simple as replacing the 1.3.0 jar files with their later versions, along with some of the dependencies of the Java API. Let’s take a brief moment to talk about some of those dependencies.

Tinkerpop, Blueprints, Pipes, and Gremlin

You probably think I just named the Teletubbies, right? Nope. Tinkerpop is a collection of Java libraries that touts itself as “Open source software in the graph space”. It includes:

  • Blueprints – A “property graph” model interface. This is like JDBC, but for graph databases. Blueprints supports the other Tinkerpop libraries.
  • Pipes – Pipes provides a pipeline approach to querying the graph database. Pipes provide the objects used to transform and filter graph queries and traversals. It is, in effect, used by Gremlin to provide a powerful way to query the graph.
  • Gremlin – Gremlin is “a graph traversal language”. Think of Gremlin as a fluent interface DSL for walking the graph. Most graph queries are traversals (“start at this vertex and walk the relationships based on some filter”) and Gremlin is an attempt to standardize the language.

There are other libraries in the Tinkerpop stack, but we do not use them so I am not going to discuss them. Also, there are Tinkerpop implementations for Neo4j, Dex, and other graph databases.

As part of upgrading the OrientDB-JRuby gem, I also had to replace the various Tinkerpop jars (OrientDB-JRuby used the Blueprints and Pipes jars already) and add the Gremlin jars. The addition of Gremlin immediately gave the OrientDB-JRuby gem the ability to create powerful query traversals, which we will see in a bit.

Finally, it’s worth noting that OrientDB has its own Java API, separate from the Tinkerpop implementation. The OrientDB-JRuby gem also wraps this API, and it provides some functional bits that are not available in the OrientDB Tinkerpop implementations. As much as possible, I will stick to the OrientDB Tinkerpop APIs, but if I have to drop into the core API, I’ll make sure you know it.

Prepare Your Demo Area

For this demo, you’ll need to install the orientdb gem from the GitHub repository. I’d recommend creating an orientdb-sitepoint directory. Also, this gem requires JRuby, so use RVM or rbenv and install JRuby 1.7.4. Make sure you’re in your orientdb-sitepoint directory and create a Gemfile with the following:

source "https://rubygems.org"

gem 'orientdb', github: 'aemadrid/orientdb-jruby'

A quick bundle install will bring down OrientDB-JRuby.

You’ll need to have an OrientDB server running, so refer to my first article for how to set it up.

OrientDB-JRuby Source

As I said, OrientDB-JRuby simply wraps the OrientDB core and Tinkerpop APIs. If you browse the source on Github, you can see some of the convenience methods it supplies, along with some of the Ruby constants it creates based on the Java namespaces. A look at the constants.rb file, for example, demonstrates the Ruby constants provided by the gem.

module OrientDB
  CORE = com.orientechnologies.orient.core
  CLIENT = com.orientechnologies.orient.client

  ClusterType            = CORE.storage.OStorage::CLUSTER_TYPE
  DocumentDatabase       = CORE.db.document.ODatabaseDocumentTx
  DocumentDatabasePool   = CORE.db.document.ODatabaseDocumentPool
  DocumentDatabasePooled = CORE.db.document.ODatabaseDocumentTxPooled
  GraphDatabase          = CORE.db.graph.OGraphDatabase
  OTraverse              = CORE.command.traverse.OTraverse
  Document               = CORE.record.impl.ODocument
  IndexType              = CORE.metadata.schema.OClass::INDEX_TYPE
  OClassImpl             = CORE.metadata.schema.OClassImpl
  LocalStorage           = CORE.storage.impl.local.OStorageLocal
  LocalCluster           = CORE.storage.impl.local.OClusterLocal
  PropertyImpl           = CORE.metadata.schema.OPropertyImpl
  RecordList             = CORE.db.record.ORecordTrackedList
  RecordSet              = CORE.db.record.ORecordTrackedSet
  Schema                 = CORE.metadata.schema.OSchema
  SchemaProxy            = CORE.metadata.schema.OSchemaProxy
  SchemaType             = CORE.metadata.schema.OType
  SQLCommand             = CORE.sql.OCommandSQL
  SQLSynchQuery          = CORE.sql.query.OSQLSynchQuery
  User                   = CORE.metadata.security.OUser
  RemoteStorage          = CLIENT.remote.OStorageRemote

  #Blueprints
  BLUEPRINTS = com.tinkerpop.blueprints

  #Gremlin
  Gremlin = com.tinkerpop.gremlin.java

  OrientGraph = BLUEPRINTS.impls.orient.OrientGraph
  Conclusion = com.tinkerpop.blueprints.TransactionalGraph::Conclusion

  ...
end

As we’re using the gem to access the Java API, you’ll see how we use these constants.
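The aliasing trick itself is plain Ruby and easy to try without any Java involved. The sketch below uses a made-up Vendor namespace in place of a Java package path, and a hypothetical ShortNames module playing the role of the gem’s OrientDB module; none of these names come from the gem.

```ruby
# A deliberately long namespace, standing in for a Java package path.
module Vendor
  module Storage
    module Impl
      class DocumentDatabase
        def name
          'demo'
        end
      end
    end
  end
end

# Short aliases, in the spirit of the gem's constants.rb.
module ShortNames
  IMPL = Vendor::Storage::Impl
  DocumentDatabase = IMPL::DocumentDatabase
end

db = ShortNames::DocumentDatabase.new
puts db.name

# Both constants point at the very same class object.
puts ShortNames::DocumentDatabase.equal?(Vendor::Storage::Impl::DocumentDatabase)
```

An alias is just another constant pointing at the same class, so it costs nothing at runtime and saves a lot of typing.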

Create a Database

We’ll create a database for this demo called ‘sitepoint-ruby-demo’. So, fire up orientdb-console, which is a script supplied by the gem. This will launch an IRB session with OrientDB already required.

We are going to use the ‘remote’ protocol for accessing our OrientDB server, which means we have to use the OServerAdmin object to create the database. This is part of the OrientDB core Java API.

jruby-1.7.4 :001 > sa = OrientDB::CLIENT::remote::OServerAdmin.new("remote:localhost")
 => #<Java::ComOrientechnologiesOrientClientRemote::OServerAdmin:0x4473e7ae>

jruby-1.7.4 :001 > sa.connect("user", "pass") # REPLACE user, pass with your OrientDB user and pass
 => #<Java::ComOrientechnologiesOrientClientRemote::OServerAdmin:0x4473e7ae>

jruby-1.7.4 :001 > sa.create_database("sitepoint-ruby-demo", "graph", "local")
 => #<Java::ComOrientechnologiesOrientClientRemote::OServerAdmin:0x4473e7ae>

We should now have a database to use for our demo today. You can verify that the database was created by going to the OrientDB Studio web application and verifying that ‘sitepoint-ruby-demo’ is in the dropdown list.

(Note: You’ll likely need to change the user and pass)

Now, connect to that database:

jruby-1.7.4 :001 > g = OrientDB::OrientGraph.new("remote:localhost/sitepoint-ruby-demo", "user", "pass")
  => #<Java::ComTinkerpopBlueprintsImplsOrient::OrientGraph:0x25f1ce01>

OrientDB::OrientGraph is the Blueprints API graph database connection.

Create Vertex and Edge Types

Recalling how OrientDB works, every graph database has V and E tables that represent superclasses for Vertex and Edge, respectively. While you can store records in these tables, it often makes more sense to subclass V or E to represent domain objects.

person_type = g.create_vertex_type("Person")
=> #<OrientDB::OClassImpl:Person super=V>

You can see that it automatically subclasses V for this vertex type. Adding properties is simple, with one gotcha: The OrientGraph object, by default, always has an open transaction. OrientDB won’t let you add properties in the middle of an open transaction, so we have to drop down to the “raw” connection and stop the transaction.

g.raw_graph.transaction.close
=> nil

person_type.add("Name", :string)
=> #<OrientDB::OClassImpl:Person super=V Name=STRING>

jruby-1.7.4 :022 > name_prop = person_type.get_property("Name")
 => #<OrientDB::Property:Name type=string indexed=false mandatory=false not_null=false>
jruby-1.7.4 :023 > name_prop.set_mandatory(true)
 => #<OrientDB::Property:Name type=string indexed=false mandatory=true not_null=false>
jruby-1.7.4 :024 > name_prop.set_not_null(true)
 => #<OrientDB::Property:Name type=string indexed=false mandatory=true not_null=true>

Our Person now has a Name. We’ll add age and gender properties.

jruby-1.7.4 :025 > person_type.add("Age", :int)
 => #<OrientDB::OClassImpl:Person super=V Name=STRING Age=INTEGER>
jruby-1.7.4 :026 > person_type.add("Gender", :string)
 => #<OrientDB::OClassImpl:Person super=V Name=STRING Age=INTEGER Gender=STRING>

Now, we need Mom and Dad.

jruby-1.7.4 :030 > dad = g.add_vertex("class:Person", {name: 'Bob', age: 40, gender: 'M'})
=> #<Java::ComTinkerpopBlueprintsImplsOrient::OrientVertex:0x617e9189>

jruby-1.7.4 :016 > dad.id.to_s
=> "#11:-2"

We’ve added a temporary (meaning, not persisted) vertex for Dad. Remember that OrientDB has a somewhat unique ID scheme that consists of a ‘#’ followed by cluster id and record id separated by a colon. Dad’s id indicates that the default cluster id for the Person subclass is 11 and that he is not yet persisted because his record id is negative (-2).
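If you want to reason about these ids in your own code, the format is simple enough to parse by hand. The following is a minimal sketch in plain Ruby; RecordId and its methods are hypothetical names for this illustration and are not part of the gem, which hands you Java id objects instead.

```ruby
# A tiny value object for OrientDB-style record ids such as "#11:-2".
RecordId = Struct.new(:cluster, :position) do
  # Parses "#<cluster>:<position>"; raises ArgumentError on anything else.
  def self.parse(text)
    match = /\A#(\d+):(-?\d+)\z/.match(text)
    raise ArgumentError, "not a record id: #{text.inspect}" unless match

    new(match[1].to_i, match[2].to_i)
  end

  # A negative position marks a record that has not been persisted yet.
  def temporary?
    position < 0
  end

  def to_s
    "##{cluster}:#{position}"
  end
end

unsaved = RecordId.parse('#11:-2')
saved   = RecordId.parse('#11:0')

puts "#{unsaved} temporary? #{unsaved.temporary?}"
puts "#{saved} temporary? #{saved.temporary?}"
```

Here '#11:-2' is Dad’s id from above, while '#11:0' is simply an example of what an id in the same cluster looks like once the record position is no longer negative.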

jruby-1.7.4 :020 > g.commit
=> nil

jruby-1.7.4 :021 > dad.id.to_s
=> "#11:0"

Committing the graph persists Dad, giving him a proper id. Now, it’s Mom’s turn.

jruby-1.7.4 :030 > mom = g.add_vertex("class:Person", {name: 'Mary', age: 40, gender: 'F'})
=> #<Java::ComTinkerpopBlueprintsImplsOrient::OrientVertex:0x617e9189>

jruby-1.7.4 :020 > g.commit
=> nil

As an exercise for you, dear reader, create the kids based on the following list:

  • Brother: Bobby, Age 10
  • Sister: Jane, Age 12

Having Relations

With the members of our little family intact, it’s time to define how they relate to each other. We could follow the same ‘create type’ route for creating edge types, defining the properties on the edge, etc. But, that would be boring and not show you how OrientDB will create types on the fly.

jruby-1.7.4 :026 > marriage = dad.add_edge("Family", mom, "Family", nil, {relation_name: 'spouse'})
Sep 01, 2013 11:49:52 AM com.orientechnologies.common.log.OLogManager log
WARNING: Committing the active transaction to Committing the active transaction to create the new type 'Family' as subclass of 'E'. The transaction will be reopen right after that. To avoid this behavior create the classes outside the transaction. To avoid this behavior do it outside the transaction
 => #<Java::ComTinkerpopBlueprintsImplsOrient::OrientEdge:0x477e4626>

jruby-1.7.4 :026 > g.commit
=> nil

jruby-1.7.4 :027 > marriage.id.to_s
=> "#12:0"

Mom and Dad are now married. (throws rice)

I am sure you noticed that WARNING when we added the edge. That is OrientDB yelling at you that it is changing the schema, which it will only do once for the “Family” edge type.

Also, the more astute amongst you may be asking why we only added the edge to Dad. Does that mean that Dad is married to Mom but Mom is not married to Dad? Well, that is a good question, and its answer boils down to how you and your code want to treat it. OrientDB (and most graph databases) are “Directed Property Graphs”, meaning that EVERY edge has a start, an end, and (as a result) a direction. You cannot have an edge that does not have a direction.

This leads us to a choice. We can either add a second Family edge, from Mom to Dad, with ‘spouse’ for the relation_name property value OR ignore the direction of the relationship in our code. I am going to ignore it here, and presume that any OUT or IN Family edge is bi-directional. So, if Dad is married to Mom, then Mom is married to Dad.
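Ignoring direction in code can be as small as checking both ends of each edge. Here is a minimal sketch in plain Ruby, with no database involved; the Edge struct, the EDGES list, and the spouses_of helper are all hypothetical names that only model the idea.

```ruby
# A directed edge: it always has an out (start) vertex and an in (end) vertex.
Edge = Struct.new(:out_v, :in_v, :relation_name)

# Only one edge is stored, pointing from Bob to Mary.
EDGES = [Edge.new('Bob', 'Mary', 'spouse')].freeze

# Finds spouses while ignoring direction: match the person on either end
# of the edge and return whoever is on the other end.
def spouses_of(person, edges = EDGES)
  edges
    .select { |e| e.relation_name == 'spouse' && [e.out_v, e.in_v].include?(person) }
    .map { |e| e.out_v == person ? e.in_v : e.out_v }
end

puts spouses_of('Bob').inspect
puts spouses_of('Mary').inspect
```

With a single stored edge, spouses_of('Bob') returns ["Mary"] and spouses_of('Mary') returns ["Bob"]. The trade-off is that every query has to remember to look in both directions, which is exactly what the traversal later in this article does.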

With that covered, you now have more homework: Add Mom and Dad as ‘parent’ to Bobby and Jane, then make Bobby and Jane ‘sibling’. I’ll wait. When you’re done, check the records in the Family class in OrientDB Studio.

With that, our family is complete.

Querying

At the beginning of this article, if you can remember that far back, I mentioned Gremlin and Pipes. Even though OrientDB has a query language that is SQL-ish in nature, I am going to focus on using the Tinkerpop query tools. I feel like that is where OrientDB is going, so it makes the most sense.

In real life, you’ll want to test out both query types for performance reasons. The core query language is, sometimes, faster.

The Gremlin Wiki has some good examples of what Gremlin queries look like using the API. Read through that page, then come back here to query our family.

In order to start a Gremlin query, you’ll need a GremlinPipeline:

gp = OrientDB::Gremlin::GremlinPipeline.new(g)
=> #<Java::ComTinkerpopGremlinJava::GremlinPipeline:0x5554ca01>

Remember, this is a Pipeline. It is chainable and (until you iterate/evaluate it) has state through the chain. Each pipe is added to the pipeline end-to-end, so if you want to start over, you need a new pipeline.
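If the “state through the chain” part sounds abstract, a toy version makes it visible. The sketch below is plain Ruby and only mimics the chaining behaviour; TinyPipeline and its filter and transform methods are made-up names, not the Gremlin API, and unlike this toy a real pipeline is consumed when it is evaluated.

```ruby
# A toy pipeline: each call appends a step and returns self, so calls chain.
# Nothing runs until #to_a is called.
class TinyPipeline
  attr_reader :steps

  def initialize(source)
    @source = source
    @steps = []
  end

  # Adds a filter step that keeps items for which the block is truthy.
  def filter(&block)
    @steps << [:select, block]
    self
  end

  # Adds a transform step that maps each item through the block.
  def transform(&block)
    @steps << [:map, block]
    self
  end

  # Evaluates every accumulated step, in order, against the source.
  def to_a
    @steps.reduce(@source) { |items, (op, block)| items.public_send(op, &block) }
  end
end

numbers = [1, 2, 3, 4, 5, 6]

pipe = TinyPipeline.new(numbers).filter(&:even?)
pipe.transform { |n| n * 10 } # the same pipeline object now has two steps

fresh = TinyPipeline.new(numbers).filter(&:odd?) # starting over needs a new pipeline

puts pipe.to_a.inspect
puts fresh.to_a.inspect
```

Because every step mutates the same object, pipe keeps its even-number filter no matter what you add afterwards. The only way to ask a different question is to build fresh.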

Get Spouses

Once we have our GremlinPipeline, we have to give it a starting point. If we want to get all spouses, then our starting point is all vertices that are in the Person class. Next, we want to traverse the Family edges for each vertex and only keep the ones that have ‘spouse’ for relation_name. Finally, since we are ignoring the direction of Family, grab both vertices for each qualifying edge.

jruby-1.7.4 :078 > gp = OrientDB::Gremlin::GremlinPipeline.new(g)
 => #<Java::ComTinkerpopGremlinJava::GremlinPipeline:0x4e790cea>

jruby-1.7.4 :079 > spouses = gp.v.has("@class", "Person").outE("Family").has("relation_name", "spouse").bothV.iterate
 => [#<Java::ComTinkerpopBlueprintsImplsOrient::OrientVertex:0x42d0a46b>, #<Java::ComTinkerpopBlueprintsImplsOrient::OrientVertex:0x2c4c9e88>]

jruby-1.7.4 :080 > spouses.map{|v| v.get_property("name")}
 => ["Bob", "Mary"]

Let’s break that down.

  • gp is our pipeline
  • v sets our starting point to all vertices.
  • has("@class", "Person") adds a PropertyFilterPipe that checks each vertex’s @class system attribute for Person.
  • outE("Family") adds an OutEdgesPipe that traverses any Family edge coming OUT of any vertex still in the pipeline.
  • has adds another PropertyFilterPipe that filters the edges for relation_name == 'spouse'.
  • bothV adds a BothVerticesPipe, which collects both vertices from any edge still in the pipeline. Remember, our relation is bi-directional.
  • iterate evaluates the pipeline at that point. Calling to_a at that point would do the same thing.

After that, we just loop over the results and grab the name property to see who is married.
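To see the same steps without a database in the way, here is a rough model of that traversal using plain Ruby collections. The hash layout, the VERTICES and FAMILY_EDGES constants, and the spouse_names helper are assumptions made for this sketch; the real vertices and edges are Java objects, and the comments only indicate which pipeline step each line stands in for.

```ruby
# In-memory stand-ins for the demo graph.
VERTICES = {
  'bob'  => { klass: 'Person', name: 'Bob' },
  'mary' => { klass: 'Person', name: 'Mary' }
}.freeze

FAMILY_EDGES = [
  { label: 'Family', out_v: 'bob', in_v: 'mary', relation_name: 'spouse' }
].freeze

# Edges with the given label that leave the given vertex id.
def out_edges(vertex_id, label)
  FAMILY_EDGES.select { |e| e[:label] == label && e[:out_v] == vertex_id }
end

def spouse_names
  VERTICES
    .select { |_id, v| v[:klass] == 'Person' }    # has("@class", "Person")
    .keys
    .flat_map { |id| out_edges(id, 'Family') }    # outE("Family")
    .select { |e| e[:relation_name] == 'spouse' } # has("relation_name", "spouse")
    .flat_map { |e| [e[:out_v], e[:in_v]] }       # bothV
    .map { |id| VERTICES[id][:name] }             # get_property("name")
end

puts spouse_names.inspect
```

Only Bob has an outgoing Family edge, so the single spouse edge is found once and both of its ends are collected, giving ["Bob", "Mary"].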

Looking at the Pipes javadoc shows just how many Pipes there are. Things can get very powerful (and complicated) in a hurry.

Getting used to querying with Gremlin takes a bit of time and persistence. This is a very simple example. In my next article, I’ll show some more involved queries.

Homework: Figure out the pipeline to get all siblings.

Wrap it Up

There’s your whirlwind tour of using the OrientDB-JRuby gem. Obviously, you can do much more, but this is a good start.

For my next article about OrientDB and Ruby, I am planning to cover another gem called Oriented, which adds another level of abstraction to what OrientDB-JRuby offers. Until then, start applying the graph concepts to your work. I bet you’ll find it fits many more domains than you think.
