Managing Data Storage with Blockchain and BigchainDB

Chris Ward
Share

There’s no denying that whilst Bitcoin’s future may be hazy right now, the underlying technology it relies upon — the Blockchain — has revolutionized many industries and projects, with more to come.

Ascribe is a fascinating startup that uses Bitcoin’s Blockchain to record a limited quantity of unique references to digital artworks. Thus making them traceable, accountable and (hopefully) more valuable, due to this finite amount of ‘copies’.

Ascribe hit technological problems with this approach, and those problems were primarily due to Bitcoin’s Blockchain itself. Writing everything to it is slow, costly (currently 80c each time) and has a maximum number of daily entries and total capacity for writes. It’s also counter to typical scalable database technologies, adding nodes doesn’t improve performance and has no real query language. This makes scaling a business that relies upon the Bitcoin Blockchain a challenge.

But the Blockchain concept is a strong one and the past years have seen an increasing rise in usage and legitimacy, with even major banks announcing development of technologies inspired by the concept.

Ascribe decided to combine the best of both worlds, taking a proven NoSQL database (RethinkDB) and adding a Blockchain layer on top to add control, asset tracking and an additional level of security.

This combination of technologies is especially interesting to NoSQL database users, as traditionally, few of them support ‘transactions’ that help guarantee a database change has taken place. By writing to an underlying NoSQL database via a Blockchain layer, BigchainDB adds transactional support.

Thanks to the Blockchain layer, BigChainDB also claims to be fully decentralized. Whilst many distributed NoSQL databases claim this, there is often a pseudo master/slave setup.

Installing BigChainDB and Dependencies

There are couple of ways to install BigChainDB. First I tried the Docker images, but ran into some connection issues, finding the Python packages most reliable.

  1. Install RethinkDB, for other Mac users, there is also a Homebrew package available.
  2. Install Python 3.4+.
  3. Install BigChainDB with Pip – sudo pip install bigchaindb
  4. Start RethinkDB with rethinkdb
  5. Start BigChainDB with bigchaindb start which will also configure things for you.
  6. Open the BigChainDB (actually the RethinkDB UI) admin UI at http://SERVER_IP:58080/

Simple Example – Message Allocation and Tracking

One of BigchainDB’s prime use cases (and why Ascribe created it), is for tracking assets, so let’s make a simple example in Python. First run the following commands in your terminal.

pip install bigchaindb
bigchaindb configure
bigchaindb show-config

Create a new file, app.py and add the following:

from bigchaindb import Bigchain
b = Bigchain()
print(b)

This imports the bigchaindb library, creates a new object and connects to it with the settings file just created.

Then run the Python application:

python app.py

You should see something like <bigchaindb.core.Bigchain object at 0x1085b0dd8>, this tells us that everything is well.

Add the following:

from bigchaindb import Bigchain
import time

b = Bigchain()

spuser_priv, spuser_pub = b.generate_keys()
print("User Created")

digital_asset_payload = {'msg': 'This is my special message just for you'}

tx = b.create_transaction(b.me, spuser_pub, None, 'CREATE', payload=digital_asset_payload)
print("Transaction Written")

tx_signed = b.sign_transaction(tx, b.me_private)
b.write_transaction(tx_signed)
print ("Transaction Written to BC, now waiting")

time.sleep(10)

tx_retrieved = b.get_transaction(tx_signed['id'])
print(tx_retrieved)

This creates a user and associated keys for access to the database — remember that extra level of security. Then a payload for writing to the database is created, assigning the required keys, and written.

It will take a few seconds for the new transaction to pass from the Blockchain layer to the database. The code waits for ten seconds and then retrieves and prints the record. You should see something like:

{
  "signature": '304502205",
  "id": "0f442bcf4a42",
  "transaction": {
      "timestamp": "1457104938.430521",
      "data": {
        "hash": "b32779e57",
        "payload": {
          "msg": "This is my special message just for you"
        }
      },
      "operation": "CREATE",
      "current_owner": "hFJKYk2",
      "new_owner": "26pdiQTTx", 
      "input": None
    }
  }
}

You now have one special message that you would like one person to have access to:

...
print("Now to transfer")

spuser2_priv, spuser2_pub = b.generate_keys()
print("Second User Created")

tx_transfer = b.create_transaction(spuser_pub, spuser2_pub, tx_retrieved['id'], 'TRANSFER')
print("Transfer Created")

tx_transfer_signed = b.sign_transaction(tx_transfer, spuser_priv)
b.write_transaction(tx_transfer_signed)
print ("Transaction Written to BC, now waiting")

time.sleep(15)

tx_transfer_retrieved = b.get_transaction(tx_transfer_signed['id'])
print("Transferred")
print(tx_transfer_retrieved)

This creates a second user and then takes the transaction ID of the special message and transfers it to the second user. The Blockchain layer of BigChainDB will prevent users and your code from executing the same action twice. If you tried running the code above again, a double spend exception will be thrown.

This examples shows a small set of the methods that BigChainDB adds to RethinkDB, find the full list here.

HTTP Endpoint

Currently, the only client library available for BigChainDB is Python, more may follow, but in the meantime, a limited HTTP endpoint is available for querying existing transactions:

http://localhost:5000/api/v1/transactions/tx_id

Or write a new transaction with:

http://localhost:5000/api/v1/transactions

Adding the following payload, where operation can be changed to suit the different types of transaction that can be written :

{
  "id": , ""
  "signature": "",
  "transaction": {
    "current_owner": "",
    "data": {
      "hash": "",
      "payload": null
    },
  "input": null,
  "new_owner": "",
  "operation": "",
    "timestamp": ""
  }
}

Part of a Decentralized Future

Ignoring its Blockchain heritage for a moment, BigChainDB offers to supply a lot of functionality missing from current NoSQL and distributed databases. That fact alone may be a reason to try it and may provide a valid business/use case.

For the Blockchain aficionados amongst you, it also completes the puzzle for a complete decentralized application stack. In theory there is now Ethereum for applications, IPFS as a file system and now BigChainDB for data storage. The pieces are in place for a very different way of developing, deploying and maintaining applications, leading to a fascinating future that I would love to hear your opinions on in the comments below.