Introduction to MongoDB

As a PHP developer you are probably used to seeing applications and articles using MySQL or some other relational database management system (RDBMS). But in the past few years, a new kind of database has gained adoption in the software development community. Many of these databases store document objects rather than strictly defined records and relationships, and the family as a whole has been nicknamed “NoSQL.”

There are many implementations of the NoSQL concept, and MongoDB is one of the best known and most widely used. I think it’s one of the most interesting NoSQL databases currently available, and many consider it one of the easiest to use, which has helped it gain widespread adoption.

In this article I’ll introduce you to NoSQL with MongoDB. You’ll learn how to install the MongoDB extension for PHP, and how to add, update, and retrieve document objects. If you’re used to working with RDBMSs like MySQL or PostgreSQL, you’ll find some of the concepts of working with MongoDB a bit strange, but you’ll soon grow to love the flexibility and power that MongoDB gives you!

About MongoDB

MongoDB is a document-oriented database, and each document has its own structure. Unlike an RDBMS, in which each record must conform to the structure of its table, each document in MongoDB can have a different structure; you don’t have to define a schema for documents before saving them in the database.

MongoDB groups document objects into collections. You can think of a collection as the counterpart of a table in an RDBMS; the difference, as I said before, is that a collection won’t force you to define a schema before you can store something in it.
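
To make the schema-free idea concrete, here is a small sketch of two documents that could sit side by side in the same collection. The $linkPost variable and its url and tags fields are made up for illustration. Both documents are plain PHP arrays at this point, and nothing is sent to a server.

```php
<?php
// A blog post shaped like the ones used later in this article.
$post = array(
    "title"   => "cat with a hat",
    "content" => "once upon a time a cat with a hat ...",
);

// A hypothetical second document with a different set of fields.
$linkPost = array(
    "title" => "MongoDB home page",
    "url"   => "http://www.mongodb.org/",
    "tags"  => array("nosql", "database"),
);

// List the keys that the two documents have in common.
$sharedKeys = array_keys(array_intersect_key($post, $linkPost));
echo "Shared keys: ", implode(", ", $sharedKeys), "\n";
```

In an RDBMS these two shapes would normally need two tables, or one table with nullable columns.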

With MongoDB, you can embed a document inside another one, which is really useful for cases where there is a one-to-one relationship. In a typical RDBMS you’d need to create two tables and link them together with a foreign key to achieve the same result. MongoDB doesn’t support joins, which some people see as a con. But if you organize your data so that things which are read together are stored together, you’ll often find you don’t need joins, which is a pro since a single query can return everything you need and performance can be very high.
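
As a rough sketch of embedding, the array below nests an author document inside a post. The author sub-document and its name and email fields are assumptions made for this example, not part of any fixed schema.

```php
<?php
// A post with its author embedded as a nested document (hypothetical fields).
$post = array(
    "title"   => "cat with a hat",
    "content" => "once upon a time a cat with a hat ...",
    "author"  => array(
        "name"  => "Shreef",
        "email" => "shreef@example.com",
    ),
);

// The embedded document travels with its parent, so no second lookup is needed.
echo $post["author"]["name"], "\n";
```

In an RDBMS the author would typically live in a separate table referenced by a foreign key. Here it is stored and retrieved as part of the post, and queries can address the nested field with dot notation such as author.name.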

It’s worth mentioning that the aim of MongoDB and NoSQL isn’t to kill off the RDBMS. RDBMSs are still a very good solution for most of the development world’s needs, but they do have their weaknesses, most notably the need to define a rigid schema for your data, which is one problem NoSQL tries to solve.

Installation

MongoDB is easy to set up. You can download the compressed file of the latest version from its website and unpack it on your server. Then, create a directory for the database and run the mongod binary with the path to your database directory.

shreef@indigo:~$ tar zxf mongodb-<version>.tgz
shreef@indigo:~$ mv mongodb-<version> mongodb
shreef@indigo:~$ mkdir mongodb/data
shreef@indigo:~$ mongodb/bin/mongod --dbpath mongodb/data &

Then, you’ll need to install the MongoDB extension for PHP from PECL.

shreef@indigo:~$ sudo pecl install mongo

Enable the extension in your php.ini and restart Apache if necessary.

extension=mongo.so
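
If you want to confirm that PHP actually picked up the extension, a quick check along these lines may help. It is only a sketch; it relies on PHP’s standard extension_loaded(), phpversion() and php_ini_loaded_file() functions and assumes the extension registers itself under the name mongo, as the php.ini line above suggests.

```php
<?php
// Show which php.ini this PHP binary is reading.
$iniFile = php_ini_loaded_file();
echo "php.ini in use: ", ($iniFile === false ? "(none)" : $iniFile), "\n";

// Report whether the "mongo" extension is loaded.
$loaded = extension_loaded("mongo");
if ($loaded) {
    echo "mongo extension loaded, version ", phpversion("mongo"), "\n";
} else {
    echo "mongo extension is not loaded; check the extension= line in php.ini\n";
}
```

Keep in mind that the command-line PHP and the web server’s PHP can read different php.ini files, so it is worth running the check in both places.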

Using MongoDB

Using PHP’s MongoDB extension is easy. I’ll show you how to perform the basic operations you’ll use the most, and highlight some common pitfalls you may come across.

Connecting

The first thing you’ll need is to establish a connection to the MongoDB server, which is as simple as creating a new instance of the Mongo class. By default, the new Mongo object will try to connect to the MongoDB server running on your localhost using port 27017. You can change this by passing a connection string when you instantiate the object.

<?php
// connects to localhost on port 27017 by default
$mongo = new Mongo();

// connects to 192.168.25.190 on port 50100
$mongo = new Mongo("mongodb://192.168.25.190:50100");
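
One pitfall here is that the constructor throws a MongoConnectionException when it cannot reach the server. The helper below is a minimal sketch of handling that case; connectOrNull() is a hypothetical name invented for this example, and the class_exists() guard is only there so the script also runs on a machine without the extension.

```php
<?php
/**
 * Hypothetical helper: return a Mongo connection, or null on failure.
 */
function connectOrNull($uri)
{
    if (!class_exists("Mongo")) {
        // The extension is not loaded, so there is nothing to connect with.
        return null;
    }
    try {
        // The constructor connects immediately by default.
        return new Mongo($uri);
    } catch (MongoConnectionException $e) {
        echo "Could not connect: ", $e->getMessage(), "\n";
        return null;
    }
}

$mongo = connectOrNull("mongodb://localhost:27017");
echo $mongo === null ? "no connection\n" : "connected\n";
```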

Next you’ll want to select the database you’re going to use. If you want to access the database named blog then just select it as if it were a property of $mongo.

<?php
$db = $mongo->blog;

$db now holds a MongoDB object representing the blog database.

Selecting the collection that you want to save your objects in is very similar to how you selected the database; select the collection posts by accessing it as a property of the MongoDB object.

<?php
$collection = $db->posts;

$collection now holds a MongoCollection object representing the posts collection.

If no database or collection exists with the name you’ve provided, MongoDB will create a new one with that name once you insert your first document object. Alternatively, you can force MongoDB to create the collection earlier using the createCollection() method.

<?php
$collection = $db->createCollection("posts");

Inserting Documents

To insert a new object into the posts collection you use the MongoCollection object’s insert() method, passing in an array of data to be saved.

<?php
$document = array(
    "title" => "cat with a hat",
    "content" => "once upon a time a cat with a hat ...");
$collection->insert($document);

You can see that $document is just a simple associative array here. After you call insert(), the array gains a new key, _id, whose value is a MongoId object.

Array
(
    [title] => cat with a hat
    [content] => once upon a time a cat with a hat ...
    [_id] => MongoId Object
        (
            [$id] => 4ea2213af7ede43c53000000
        )
)

_id is the primary key for the document and must be unique to each document object in the collection. Generally it’s a good idea to let MongoDB assign the key since then you can be sure there won’t be any collisions.

If you try to use the same array again with insert(), you’d expect to receive a MongoCursorException since the ID already exists in the collection. Go ahead and try to insert the same document twice…

<?php
$document = array(
    "title" => "cat with a hat",
    "content" => "once upon a time a cat with a hat ...");
$collection->insert($document);
$collection->insert($document);

Hrm, what’s that? No exception was thrown?

You didn’t get an error because you are not doing a safe insert. By default, the MongoDB extension for PHP sends write operations without waiting for the server to confirm that the data was successfully saved. This lets you move on to the next task immediately, which can be very convenient, but for important data you may want to be sure the document object was really saved and no errors happened. You can achieve this by passing an array as the second parameter of insert() with the key safe set to true.

Try it again, but this time pass the second parameter to insert().

<?php
$insertOpts = array("safe" => true);
$collection->insert($document, $insertOpts);
$collection->insert($document, $insertOpts);

Now you’ll get the exception I mentioned earlier since you are performing a safe insert and another object with the same ID already exists in the collection.
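
In real code you will usually want to catch that exception rather than let it end the script. The function below is one possible sketch; insertSafely() is a hypothetical helper name, and it expects to be given the MongoCollection object selected earlier.

```php
<?php
/**
 * Hypothetical helper: perform a safe insert and report whether it worked.
 * $collection is expected to be a MongoCollection, as selected earlier.
 */
function insertSafely($collection, array $document)
{
    try {
        // "safe" makes the driver wait for the server's answer.
        $collection->insert($document, array("safe" => true));
        return true;
    } catch (MongoCursorException $e) {
        // Raised when the server reports a failed write, e.g. a duplicate _id.
        echo "Insert failed: ", $e->getMessage(), "\n";
        return false;
    }
}
```

Calling insertSafely($collection, $document) twice with a document that carries the same explicit _id should return true the first time and false the second time.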

Updating Documents

If you want to modify an existing document object, you can use the save() method.

<?php
$document["author"] = "Shreef";
$collection->save($document);

If the passed array has no _id key, the document will be inserted and a primary key will be assigned. If the array already has an _id key then the document object will be updated. Like insert(), save() also accepts an optional second parameter.

The save() method is not the only way to update an existing object; you can also use the update() method. update() takes the criteria of the object you want to update as the first parameter, and the updated object as the second parameter.

Let’s say I want to find all objects whose author key equals “Shreef” and update the value to “Timothy”. I might try the following:

<?php
$collection->update(
    array("author" => "Shreef"),
    array("author" => "Timothy"));

While this runs without error, it has the unintended consequence of replacing the whole matching document: apart from _id, the only field it would have left is author! This isn’t what I wanted. To update only the value of the author key and leave the other key/values untouched, you need to use the $set modifier.

<?php
$collection->update(
    array("author" => "Shreef"),
    array('$set' => array("author" => "Timothy")));
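
To see the difference between the two calls above without touching a server, here is a plain-PHP imitation of what happens to a stored document. applyUpdate() is a made-up function for illustration only; it mimics just these two cases and is not how MongoDB itself is implemented.

```php
<?php
/**
 * Imitates two update styles on a single stored document (illustration only).
 */
function applyUpdate(array $document, array $update)
{
    if (isset($update['$set'])) {
        // $set: overwrite only the listed keys, keep everything else.
        return array_merge($document, $update['$set']);
    }
    // No modifier: the new object replaces all fields except _id.
    return array_merge(array("_id" => $document["_id"]), $update);
}

$stored = array("_id" => 1, "title" => "cat with a hat", "author" => "Shreef");

$replaced = applyUpdate($stored, array("author" => "Timothy"));
$modified = applyUpdate($stored, array('$set' => array("author" => "Timothy")));

print_r($replaced); // title is gone
print_r($modified); // title is kept, author is changed
```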

But this still isn’t exactly what I wanted since MongoDB will only update the first matching object it finds. To update all the matching document objects, you need to pass an array as a third parameter to update() with the key multiple set to true.

<?php
$collection->update(
    array("author" => "Shreef"),
    array('$set' => array("author" => "Timothy")),
    array("multiple" => true));

Finally I’m able to update all the matching documents the way I wanted. The $set modifier ensures only the value of the author key is updated, and the multiple key tells MongoDB to update every matching document object it finds.

Selecting Documents

To select all the stored documents from a collection that match some criteria, you use the find() method. The method takes the criteria of your query as the first parameter, and optionally as a second parameter it can take an array of the field names to return instead of the entire object.

<?php
// Matching is case-sensitive, so "Shreef" and "shreef" are different values
$cursor = $collection->find(array("author" => "Shreef"));
foreach ($cursor as $document) {
    print_r($document);
}

find() returns a MongoCursor object that you can use to iterate through the results.
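
The optional second parameter mentioned above can be sketched like this. findTitlesByAuthor() is a hypothetical helper written for this example; it asks only for the title field and collects the titles into a plain array.

```php
<?php
/**
 * Hypothetical helper: return the titles of all posts by one author.
 * $collection is expected to be a MongoCollection, as selected earlier.
 */
function findTitlesByAuthor($collection, $author)
{
    // Second argument: the fields to return instead of the whole document.
    $cursor = $collection->find(
        array("author" => $author),
        array("title" => true)
    );

    $titles = array();
    foreach ($cursor as $document) {
        // Documents are free-form, so a title may be missing.
        if (isset($document["title"])) {
            $titles[] = $document["title"];
        }
    }
    return $titles;
}
```

A call such as findTitlesByAuthor($collection, "Shreef") would then return an array of title strings, which is empty if nothing matches.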

If you only want to retrieve one document, you can use the findOne() method, which returns an array with the fields of the first matching object, or null if nothing matches.

<?php
$document = $collection->findOne(array("author" => "Shreef"));
print_r($document);

Summary

MongoDB is a very simple yet powerful database that you can use if you have large amounts of unstructured data. There are many, many more features than what I have mentioned in this article, and maybe I’ll write another article discussing some of them in the future. Until then, try to follow the examples mentioned in the article and feel free to leave comments with any questions you have.



  • Kise

    Hello
    thanks for the introduction. I have one question left: while NoSQL is helpful for storing data, how can you get dynamic results, such as the id equal to 2585?
    In an RDBMS you can just query the id and get the results. How can you achieve that in NoSQL? I saw that you mentioned each query has a unique id. Let’s say you want to build a forum script based on NoSQL;
    it seems rather hard to implement that.
    thank you

  • Shane

    Lovely to see this concept finally taking hold.
    We were put through the mill and back whilst doing my software engineering degree in 1993. One of the accounting lecturers had got it into his head that you could create an object oriented database engine and convinced one of the software engineering lecturers to base his OO course on developing the concept. We used smallTalk as the implementation platform and got some good results, although of course the concept was far too big for a single semester.
    I look forward to reading the rest of your articles in this area to see how things have moved along since then.

  • http://www.zingitc.com Chris Stapleton

    Nice intro Ahmed. Do you have some real life examples of where NoSQL dbs are better than a RDBMS?

  • http://abbyandwin.net sherwin

    nice article. i started to look into it further but the speed bump i’m running into is that i can’t get it installed or use it in a shared hosting environment :( guess i’ll have to fire up a VM and play with it in there.

  • http://raisul.wordpress.com Raisul kabir

    @Chris, there are many example. Like Facebook chat uses nosql.
    Thanks Shreef for the beautiful article. I feel one thing needs to be added that why nosql is so important now – that is the need for large data handling and spreading over cloud. I remember few years ago we were struggling with a large databases and too much hit. We wanted to distribute in multiple servers but since we had so much relationship, distributing in multiple servers were very difficult. If the app was in nosql, it would have helped a lot. As far as I understand, please correct me if I’m wrong.

  • http://www.farinspace.com Dimas

    Great intro article, I would love to read more about relational and one-to-many relationship concepts in practice.

  • http://spf13.com Steve Francia

    @shreef, thanks for the great post. Another good resource is the presentation I gave at ZendCon 2011 on PHP and MongoDB. http://spf13.com/post/mongodb-and-php-at-zendcon-2011

  • http://games.webblocks.nl Rudie

    Before running

    sudo pecl install mongo

    I had to install php5-dev:

    sudo apt-get install php5-dev

    because without it I didn’t have phpize. Thanks Google!

  • Nick Shaw

    Nice article, good introduction to mongoDB. I think it’s worth mentioning that you are limited to 2.5g of storage if running on a 32bit OS:
    http://blog.mongodb.org/post/137788967/32-bit-limitations

  • John

    Hi,

    If the unstructured data is like a string containing all values(not in order), will it be useful to use MongoDB. For e.g., str_1 = (audi, A4, 2000, Diesel, Red) and str_2 = (2002, Blue, Petrol, Toyota, vista) etc. Now, from this type of several strings are stored in the MongoDB database as documents(not as key/value format, just single strings), how will it be possible to perform analytics? Or will it be even worth to do so using MongoDB?
    Please reply on my email.

    Thanks,
    John

  • http://www.burclar.org Burçlar

    hey guys do you know how could I remove mongo id from results ? I need to ignore it.

    • http://shreef.com Ahmed Shreef

      the following line creates a new collection without IDs automatically added to new documents
      db.createCollection("noautoid", { autoIndexId: false })