What database to choose for a website similar by functionality to Airbnb?

Hi there,

I am in the process of building website which in many ways is very similar to AirBnb such as map dragging functionality which shows which apartments are available in shown Google maps area, also filtering, etc. and there will be thousands of entries in database.

I was thinking about mongoDB or couchDB since from what I can gather, they should be working faster than querying relational database such as MySQL? What is your opinion on this, I know there is probably no definite answer so a discussion is encouraged :slight_smile:

Could you please tell us a bit more about the data you’re storing and how you intend on manipulating it? Your thread is too generic to discuss any viable solution at the moment.

P.s. CouchBase is the successor of couchDB (it’s a combination of couchDB and Memcached), and so you’d probably want to favour that over couchDB.

Hi tpunt. I didn’t realize to specify the kind of data I will be storing. What I am going to store is pretty basic, it involves title of the listing, coordinates, path to thumbnail image, boolean value of whether listing is active or not and a few other items.

You will be able to search for listings and filter them using various criteria. Since there will be lots of listings, I will display only some of them first, then more can be loaded with click of a button. I need a place to store all these data sets so I can filter through them and display the results according to filters.

So my requirements are a fast way to query my data set and given the result, to show results to the user. But for now I just can’t figure out what kind of solution works best for storing all that data.

It might be worth keeping in mind that back-end is being built on Umbraco and C# but web servers didn’t work so fast for me in the past.

SQL is best for online database and query related to products and services is good to save it.

I wonder what is couchDB is good for then? Or mongoDB? Why ever use these databases?

Speed in my opinion is absolutely crucial because with a map feature like on AirBnb e.g. https://www.airbnb.com/s/Barcelona--Spain , I will be making queries frequently. Also, I don’t know how to confirm this but I heard that SQL might struggle if too many users use a website at once. However, not completely sure about the last statement.

Hello @Brownieable. Without going into too much detail of data models, I’ll try to give you a short reply regarding the questions raised in your thread.

You stated in your first post that you’d be storing thousands of records - that’s really not much in the way of data being stored and any relational database should be able to handle that easily. Document data stores - like MongoDB and CouchDB/CouchBase - won’t give you performance benefits just by using them. Their power comes from the fact that they can be used to shard (split up) your data over multiple servers (a cluster).

The reason why the document data model is better at this than the relational model is because document databases use a structure often referred to as an aggregate (a standalone unit of data). We can therefore shard these units of data over a cluster. The relational model prevents us from doing this because relations (tables) aren’t in units of data, and typically you’ll have foreign keys linking multiple tables together.

To get a better idea of what I’m referring to, take a look at the following picture depicting a mapping of an aggregate (with a key-value or document datastore in mind) to the relational model:

So as you can see, all the data related to a certain aspect of an application (in this case, a students academic information) is encapsulated in a single aggregate (middle) compared to the 5 relations (right). The diagram also shows a common problem often referred to as impedance mismatch, where the in-memory data structures of the application do not fit well with the underlying database model. In the above, we are having to use 5 different relations to persist data related to one aspect of an application, which isn’t good for productivity from a developer’s point of view. The aggregate makes this mismatch easier to deal with by just putting all of this data into a single unit (with the downside being a lot of replicated data).

Thus, if your dataset is large enough, or if you don’t think a single server can handle the amount of requests your application will receive, then a cluster may be the way to go (in which case, it would be easier to use a document-oriented database).

But I don’t want to delve too deep into when to choose a NoSQL database (like MongoDB) over a relational one (or even when to use both for a polyglot persistence setup) since that would require a book (or two). So if you have any more specific questions, than I’d be happy to answer them rather than replying with an essay to a general question.

4 Likes

Thank you for a great answer, @tpunt , it’s nice to hear from someone who is an expert in their field!

I think for the moment then a relational database should work just fine simply because as you said, the data set won’t be as big and amount of users is unlikely to be causing problems too. Besides, I am not in charge of making back-end so the decision of how it will be implemented is not up to me but it’s still nice to learn the differences between relational and noSQL databases.

One thing I am not clear about is whether the aggregate you are talking about in the answer is also in a relational database and instead of having 5 schemas ( tables ) you combine all of them into one? Is there a thumb of rule of knowing when there are too fields ( columns ) ?

Sorry that my questions might be a bit vague, I might as well start reading one of the books you recommended :slight_smile:

You could combine all the relations into one, though your database would have a horrific structure and it would be much slower than if it was properly normalised. So I’d definitely advise you to keep your relations in third normal form for minimal redundancy and easier maintenance.

1 Like