Beyond Rails Abstractions: A Dive into Database Internals

David Bush
David Bush
a dive into database internals

Abstractions are wonderful things. Take a look at Rails: we can achieve huge amounts of functionality with comparatively few lines of code written. I don’t necessarily need to know a great deal about how my database is implemented in order to be up and running quickly, for example.

The downside of this level of insulation from core functionality is that developers don’t learn all of the things they perhaps should. Only last week I was able to troubleshoot a slow query by removing the index because someone along the line didn’t understand cardinality or how an index worked. Could you implement a database yourself? Let’s dig into another piece of technology that we take for granted!

How do Indexes Work?

We expect relatively fast look up times when querying a database. If you were implementing a function to pull in every value between two numbers, you might be tempted to check each value to see if it met the criterion, but this would be prohibitively inefficient for large datasets, as it performs in linear time (O(n)). What you need is a way to store records in a specified order to facilitate quicker access times. A structure like a Binary Search Tree (BST) does exactly this, by ensuring that any given record is greater than all the records in its left subtree, but lower than all items in its right subtree. This brings our search time down to a much more respectable O(log n), and is the backbone of a database index.

Where a BST lets us down is that we seldom require a specific record. Rather, we’re more likely to want all records matching a particular criterion. A naive solution would be to check each record again, which brings us right back up to linear time, or O(n). What we need is a way to store all records in order, traverse that list and then stop when a condition is met (in this case, the upper bound of our query). A data structure well suited to this task is the B+ Tree, which brings access times down to O(m + log n).

While researching this article, I was struck by the inefficiency of this approach. Ostensibily, all that’s required is fast indexing of an ordered list — a task well suited to criminally under-used and little-known data structure, the Skip List. It turns out that, although B+ Trees typically requires more CPU work than equivalent operations in other data structures, they excel at requiring far fewer transfers from disk to memory, which is a natural bottleneck. This Google Blog post provides some context as to why this is. Additionally, B+ Trees are also worst-case efficient — a huge factor when datasets are larger.

Why Is Using Too Many Indexes a Bad Idea?

Let’s imagine removing a row in our database. This means we’d also need to remove the corresponding node in the B+ Tree. You can’t just remove the node, because we’re relying on the ordered nature of the data structure, which means we need to retain the order in addition to ensuring that the overall height of the tree is as low as it can be. To combat this, our database engine will leverage smart insertion and deletion methods, but these will take logarithmic (O(log n)) time. Crucially, any field you apply an index to has to rebalance the tree each time you perform an insert or delete operation.

A field with few possible options (such as boolean, trilean, etc.) is said to have low cardinality. Here is a great IBM article that warns against the pitfalls of indexing fields with low cardinality, but the gist of it is that you should avoid doing it. Anecdotally however, I’ve reduced queries by 80% before, simply by indexing a True/False column, so as always, profile profile profile!

Query Parser

Let’s take a standard query and follow its journey:

SELECT id, email, admin, credits FROM users WHERE credits > 2 + 3;

In a fashion not dissimilar to a compiler, a parser changes your text into a parse tree:

>>> SELECT email, admin, credits FROM users WHERE credits > 2 + 3                                                         
LOG:  parse tree:
  :commandType 1
  :querySource 0
  :canSetTag true
  :utilityStmt <>
  :resultRelation 0
  :hasAggs false
  :hasWindowFuncs false
  :hasSubLinks false
  :hasDistinctOn false
  :hasRecursive false
  :hasModifyingCTE false
  :hasForUpdate false
  :hasRowSecurity false
  :cteList <>
  :rtable (
    :alias <>
      :aliasname users
      :colnames ("id" "created_at" "updated_at" "email" "encrypted_password
      " "confirmation_token" "remember_token" "admin" "" "credits")
    :rtekind 0
    :relid 27043
    :relkind r
    :tablesample <>
    :lateral false
    :inh true
    :inFromCl true
    :requiredPerms 2
    :checkAsUser 0
    :selectedCols (b 12 16 18)
    :insertedCols (b)
    :updatedCols (b)
    :securityQuals <>
  Time: 0.001s

For those of you playing along at home, I’m using the excellent pgcli with Postgres and the following two settings:

SET client_min_messages=debug1;
SET debug_print_parse=on;

Back to the query parser. During this process, it checks that your input is legal. Keywords in the wrong order? Typo? It’s the query parser that kicks an error back to you (note the WHRE in the following example):

SELECT email, admin, credits FROM users WHRE credits > 2 + 3;
syntax error at or near "credits"

Next, the query parser checks the database metadata: does your table exist? Does the column exist? Is the operation you’re about to perform possible? Do you have authorization via the database permission system?

SELECT email, admin, credits FROM users WHERE credits > 'Ostrich';
invalid input syntax for integer: "Ostrich"

If everything’s looking good, then the query parser passes the tree along to the next step.

Query Rewriter

With our tree successfully built, it’s time to optimize it. The query rewriter takes our tree and applies a list of rules to it. For example, if you’ve put a DISTINCT on a column that already has a unique constraint, then the DISTINCT is discarded. Integer calculations are replaced with a constant (2 + 3 in our example is replaced with a 5 here). There’s a host of other rules here, but the endgame is to ensure that clock cycles incurred by query execution are kept to a minimum.

The end result is a query plan, and it is, theoretically, a heavily optimized and efficient SQL query. But how does the DB engine know which are the most efficient joins and so forth? For that, we need to take a quick detour …


A good database engine will collect statistics. These will be things like the number of records in a table, how many distinct values are in a column, minimum/maximum values, etc. This is for the query optimization process, and it’s the reason your queries are quick. For example, if you’re trying to join a table with a million rows to one with a hundred, it’s going to be drastically quicker to join on the one with fewer records. It actually goes into much more depth than this, computing histograms containing information like quantiles and standard deviation, but you get the idea.

There’s nothing less conducive to quick queries than erroneous statistics. If the query optimizer thinks there are 200 records when there are actually 200 million, and has computed histograms and statistics on this basis, this can be the difference between minutes and hours for query execution. For this reason, it’s important to trigger statistical collection periodically.

Query Executor

Finally, it’s time to execute our query! The query manager checks it has sufficient resources (memory and CPU), and if permitted, begins execution. First it will check an in-memory cache of data (memory is much quicker than disk, remember) and trigger the cache manager to prefetch the next chunk of data required, as indicated by the query plan. It will discard data that’s no longer required, and you can check if it’s doing a good job or not by looking at something called the buffer/cache-hit ratio. Around 99% is what you’re aiming for.

If this were a write/update/delete operation, your Transaction Manager would kick in here to protect your data integrity, but since we’re just doing a simple read, all that’s left is for the database’s client manager to return a beautifully paged response to the client.

Wrapping Up

There are many great resources for better understanding databases, and there’s a lot I haven’t covered here. But this should hopefully provide enough of an overview to help you begin to know what to look for when profiling. At the very least, you might think twice before blindly adding an index!

Frequently Asked Questions (FAQs) about Database Internals

What are the basic components of a database system?

A database system is composed of several key components. These include the database engine, which is responsible for managing the core operations such as data storage, query processing, and transaction management. The database schema defines the logical structure of the database, including tables, views, indexes, and constraints. The query optimizer is responsible for determining the most efficient way to execute a given query. Lastly, the transaction manager ensures that all database operations maintain consistency and integrity.

How does a database engine work?

The database engine is the core component of a database system. It is responsible for managing data storage, query processing, and transaction management. The engine uses a variety of algorithms and data structures to efficiently store and retrieve data. It also ensures that all transactions are processed in a consistent and reliable manner, even in the event of system failures.

What is the role of a query optimizer in a database system?

The query optimizer is a crucial component of a database system. It is responsible for determining the most efficient way to execute a given query. The optimizer considers various factors such as the size of the data, the structure of the database, and the available system resources. It then generates a query execution plan, which is a sequence of operations that will be performed to retrieve the requested data.

How does a database maintain consistency and integrity?

Databases maintain consistency and integrity through a process known as transaction management. A transaction is a sequence of operations that are performed as a single unit of work. The database ensures that all transactions are atomic, consistent, isolated, and durable (ACID). This means that each transaction is completed in its entirety or not at all, the database remains in a consistent state before and after each transaction, each transaction is isolated from others, and the results of a transaction are permanent.

What are the different types of database systems?

There are several types of database systems, each designed to meet specific needs. Relational databases use tables to store data and are ideal for structured data. NoSQL databases are designed for unstructured data and can handle large volumes of data. Distributed databases are spread across multiple physical locations and are used for large-scale applications. In-memory databases store data in memory for fast access, while graph databases are used for data with complex relationships.

How does a distributed database system work?

A distributed database system is a database that is spread across multiple physical locations. The data is partitioned across different nodes, each of which can process queries independently. This allows for high availability and scalability, as the system can continue to function even if one node fails. The system also uses replication to ensure data consistency across all nodes.

What is the difference between a database and a database management system (DBMS)?

A database is a collection of related data, while a database management system (DBMS) is the software that manages and interacts with the database. The DBMS provides a way for users to create, retrieve, update, and manage data in the database. It also ensures data integrity, security, and consistency.

What are the key features of a good database system?

A good database system should be reliable, scalable, and efficient. It should provide a high level of data integrity and security. The system should also be able to handle large volumes of data and support complex queries. Additionally, it should have a user-friendly interface and provide tools for data analysis and reporting.

How does a database handle concurrent transactions?

Databases handle concurrent transactions through a process known as concurrency control. This ensures that multiple transactions can be executed simultaneously without interfering with each other. The database uses locking and other mechanisms to prevent conflicts and ensure data consistency.

What is the role of indexing in a database system?

Indexing is a technique used in databases to speed up data retrieval. An index is a data structure that allows the database engine to find data without having to scan every row in a table. By creating an index on a column, the database can quickly locate the data for a given value. This significantly improves the performance of queries that involve that column.