By Harry Fuecks

CouchDb: document oriented persistence

If you’re looking for something “interesting” to mess around with, Damien Katz’s CouchDb project has reached the working-prototype stage: the server is implemented in Erlang (a hot topic in some places) and there is a demo client application (a simple forum) in PHP.

Firing up the CouchDb server on Windows is a breeze: just follow the README. PHP-wise, you need the new http extension, which is most easily obtained on Win32 by grabbing the most recent PHP 5 release (5.1.6) and the corresponding collection of PECL modules. Alternatively, the most recent XAMPP (apparently) packs the extension.

The interface between CouchDb and PHP is REST (XML over HTTP), so you can also point your browser directly at the CouchDb server (default: localhost:8888, the address the demo code below uses) and get around with a little help from the CouchDb wiki.

What is CouchDb, and why is it interesting given that relational databases already exist? To an extent it’s hard to define; the best starting point is probably Damien’s discussion of Document Oriented Development. There’s a quick overview here, but it’s still difficult to find a truly selling argument. How about some code instead? Here’s a snippet from the demo app (couchthread2.php), which handles a form post:


    if ($_SERVER['REQUEST_METHOD'] == 'POST') {    
        // someone is creating a new response    
        
        // Set a field named Type to "response". This is a simple
        // way to identify the "Type" of the document. (but we could
        // have as easily used Form, Class, Category etc as a field
        // name)
        $_POST['Type'] = "response";
        
        // Add a creation date, and use a format that will sort correctly as text
        $_POST['CreationDate'] = date(DATE_ATOM);
        
        // add the threadid from query arg
        $_POST['threadid'] = $_GET['threadid'];
    
        // just take all the posted fields and save them as a new document
        if (couch_create_doc('http://localhost:8888/couchtest/', $_POST)) {    
            header('Location: ' . $_SERVER['REQUEST_URI']); // reload the page
            exit;
        }
    }

Let’s just zoom in there on that last part…


        if (couch_create_doc('http://localhost:8888/couchtest/', $_POST)) {    
            header('Location: ' . $_SERVER['REQUEST_URI']); // reload the page
            exit;
        }

…just pass the $_POST (at least for this simple example). Getting interested yet? And since the interface is plain HTTP, how about a reverse proxy between PHP and the db(s) that makes load balancing transparent?
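To get a feel for what a helper like couch_create_doc has to do behind that one call, here is a minimal sketch of my own. It is not the demo's implementation: the function names are hypothetical, the `<doc>`/`<field>` element names are assumptions rather than CouchDb's real wire format, and it uses PHP's stream wrappers instead of the http extension so that it has no extra dependencies.

```php
<?php
// Hypothetical helper, NOT the demo's couch_create_doc: turn a flat array of
// form fields into an XML string. The <doc>/<field> element names are
// assumptions made for illustration, not CouchDb's actual format.
function sketch_fields_to_xml(array $fields)
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $root = $dom->createElement('doc');
    $dom->appendChild($root);

    foreach ($fields as $name => $value) {
        $field = $dom->createElement('field');
        $field->setAttribute('name', (string) $name);
        // Text nodes are escaped on output, so user input cannot break the XML
        $field->appendChild($dom->createTextNode((string) $value));
        $root->appendChild($field);
    }

    return $dom->saveXML();
}

// Hypothetical helper: POST the XML to a database URL and report success.
function sketch_create_doc($url, array $fields)
{
    $context = stream_context_create(array(
        'http' => array(
            'method'  => 'POST',
            'header'  => "Content-Type: text/xml\r\n",
            'content' => sketch_fields_to_xml($fields),
        ),
    ));
    // file_get_contents returns false when the request fails
    return @file_get_contents($url, false, $context) !== false;
}
```

Outside of a demo you would probably copy only the fields you expect out of $_POST before saving, since the snippet above stores whatever the browser sends.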

From what I’ve seen in Dokuwiki, where wiki pages are stored directly, as-is, on the filesystem, there’s a lot to be said for keeping the “raw resources” in a form that makes them easy to identify. Working out the last modification time (caching), replication / mirroring, administration and a whole host of other stuff gets much easier to manage, compared with a relational database where what constitutes a complete “document” may be spread across multiple tables. Of course the downside is that searching, sorting and relations get harder. Enter CouchDb where (if I’ve understood right) you can “compile” tables from the contents of your raw documents using its Fabric formula language. Assuming the processing done to create the tables is reproducible, replicating databases across systems would then “only” be a matter of copying the raw documents.
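To make the “compiled table” idea concrete, here is a rough sketch in plain PHP. It is only an illustration of the principle and has nothing to do with CouchDb's actual formula language; the function names are hypothetical, and the field names (Type, threadid, CreationDate) are borrowed from the demo snippet above. The table is rebuilt purely from the raw documents, so running it again on a copy of those documents gives the same rows.

```php
<?php
// Illustration only: plain PHP standing in for the idea of a computed table.
// This is not CouchDb's formula language; the function names are hypothetical.
function sketch_build_response_table(array $docs, $threadId)
{
    $rows = array();
    foreach ($docs as $doc) {
        // Keep only "response" documents that belong to the requested thread
        if (isset($doc['Type'], $doc['threadid'], $doc['CreationDate'])
            && $doc['Type'] === 'response'
            && $doc['threadid'] === $threadId) {
            $rows[] = $doc;
        }
    }
    // DATE_ATOM strings with the same timezone offset sort correctly as text
    usort($rows, 'sketch_compare_by_creation_date');
    return $rows;
}

// Comparison callback: order two documents by their CreationDate string
function sketch_compare_by_creation_date(array $a, array $b)
{
    return strcmp($a['CreationDate'], $b['CreationDate']);
}
```

Nothing here is stored anywhere: the rows are derived data, and only the documents themselves would need copying to another machine.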

The other side of this is what Erlang enables: it was designed for telephone switches, and being a functional programming language means no squirrels. A worthwhile read to help things click is Functional Programming For The Rest of Us:

A functional program is ready for concurrency without any further modifications. You never have to worry about deadlocks and race conditions because you don’t need to use locks! No piece of data in a functional program is modified twice by the same thread, let alone by two different threads. That means you can easily add threads without ever giving conventional problems that plague concurrency applications a second thought!

…and stuff like transactions (apparently) gets easier with functional programming: no awkward state hanging around after you roll back.
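The rollback point can be sketched even in PHP, if you squint. This is my own toy example, not anything from CouchDb or Erlang, and every name in it is hypothetical: the “transaction” is a function that never touches its input and either returns a new value or throws.

```php
<?php
// Hypothetical example: a "transaction" written as a pure function. It works
// on its own copy of the data and either returns a new array or throws.
function sketch_apply_reply(array $thread, $reply, $maxReplies)
{
    // PHP arrays are passed by value, so these changes are local to the function
    $thread['replies'][] = $reply;
    $thread['replyCount'] = count($thread['replies']);

    if ($thread['replyCount'] > $maxReplies) {
        // Failing half-way leaves the caller's array exactly as it was
        throw new RuntimeException('thread is full');
    }
    return $thread;
}

$before = array('replies' => array('first'), 'replyCount' => 1);

try {
    $after = sketch_apply_reply($before, 'second', 1);
} catch (RuntimeException $e) {
    // "Rollback" is simply carrying on with the untouched original value
    $after = $before;
}
```

There is nothing to undo after the failure, because the half-finished changes only ever existed inside the function call.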

Anyway – one to watch I think.
