7 Simple Speed Solutions for MongoDB

By Craig Buckler

MongoDB is a fast NoSQL database. Unfortunately, it’s not a cure for all your performance woes, and a single complex query can bring your code grinding to a halt. I recently suffered this fate, and it can be difficult to know where to look when your application suddenly becomes unstable. I hope these tips help you avoid the pain I went through!

1. Check Your MongoDB Log


By default, MongoDB logs all queries which take longer than 100 milliseconds. The log file’s location is defined in your configuration’s systemLog.path setting, and it’s normally /var/log/mongodb/mongod.log in Debian-based distributions such as Ubuntu.

The log file can be large, so you may want to clear it before profiling. From the mongo command-line console, enter:

use admin;
db.runCommand({ logRotate : 1 });

A new log file will be started and the old data will be available in a file named with the backup date and time. You can delete the backup or move it elsewhere for further analysis.

It can also be useful to watch the log while users are accessing your system. For example:

tail -f /var/log/mongodb/mongod.log

The defaults are reasonable, but you can configure the log verbosity or modify the profiling parameters to change the slow-query threshold from 100 milliseconds to another value. You could initially set it to one second to catch the worst offending queries, then halve it after every set of successful fixes.
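As a rough sketch of that halving approach, the snippet below generates a series of thresholds and applies one through db.setProfilingLevel(), which accepts a slow-operation threshold in milliseconds as its second argument. The function names, the 100ms floor, and the stand-in db object are my own assumptions for illustration. In the mongo shell you would pass the real db object.

```javascript
// Hypothetical helper: list the thresholds to work through, halving each time.
function thresholdSchedule(startMs, floorMs) {
  const steps = [];
  const floor = Math.max(1, floorMs); // guard against a loop that never ends
  for (let ms = startMs; ms >= floor; ms = Math.floor(ms / 2)) {
    steps.push(ms);
  }
  return steps;
}

// Hypothetical helper: level 0 leaves the profiler off, while the second
// argument sets the slow-operation threshold in milliseconds.
function applySlowThreshold(db, slowMs) {
  return db.setProfilingLevel(0, slowMs);
}

// Stand-in for the shell's db object so the sketch runs anywhere.
const fakeDb = {
  calls: [],
  setProfilingLevel(level, slowMs) {
    this.calls.push({ level, slowMs });
    return { ok: 1 };
  }
};

const schedule = thresholdSchedule(1000, 100);
console.log(schedule); // [ 1000, 500, 250, 125 ]
applySlowThreshold(fakeDb, schedule[0]);
```

Start with the first value, fix whatever the log reveals, then move to the next.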

Look out for lines containing ‘COMMAND’ with the execution time in milliseconds at the end. For example:

2016-02-12T11:05:08.161+0000 I COMMAND  
    [conn563] command project.$cmd 
    command: count { 
        count: "test", 
        query: { published: { $ne: false }, 
        country: "uk" } 
    } 
    planSummary: IXSCAN { country: 1 } 
    keyUpdates:0 
    writeConflicts:0 
    numYields:31 
    reslen:44 
    locks: { 
        Global: { 
            acquireCount: { r: 64 } 
        }, 
        MMAPV1Journal: { 
            acquireCount: { r: 32 } 
        }, 
        Database: { 
            acquireCount: { r: 32 } 
        }, 
        Collection: { 
            acquireCount: { R: 32 } 
        } 
    } 403ms

This will help you determine where potential bottlenecks lie.
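If you have many entries to wade through, a small script can pull out the useful parts. This is a minimal sketch which assumes the plain-text log format shown above. The parseSlowLogLine name is my own, and other MongoDB versions may format log entries differently.

```javascript
// Hypothetical helper: extract the plan summary and duration from one
// slow-operation entry in the plain-text format shown above.
function parseSlowLogLine(line) {
  // Collapse any line breaks and indentation into single spaces.
  const flat = line.replace(/\s+/g, ' ').trim();
  // The execution time is the final token, e.g. "403ms".
  const duration = flat.match(/(\d+)ms$/);
  // planSummary is followed by the stage name, e.g. IXSCAN or COLLSCAN.
  const plan = flat.match(/planSummary: (\w+)/);
  return {
    durationMs: duration ? Number(duration[1]) : null,
    planSummary: plan ? plan[1] : null
  };
}

// A shortened copy of the example entry above.
const sample =
  '2016-02-12T11:05:08.161+0000 I COMMAND [conn563] command project.$cmd ' +
  'command: count { count: "test", query: { published: { $ne: false }, country: "uk" } } ' +
  'planSummary: IXSCAN { country: 1 } keyUpdates:0 writeConflicts:0 numYields:31 reslen:44 ' +
  'locks: { Global: { acquireCount: { r: 64 } } } 403ms';

console.log(parseSlowLogLine(sample)); // { durationMs: 403, planSummary: 'IXSCAN' }
```

Sorting the parsed entries by durationMs is a quick way to find the worst offenders first.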

2. Analyze Your Queries

Like many databases, MongoDB provides an explain facility which reveals how a database operation worked. You can add explain('executionStats') to a query. For example:

db.user.find(
  { country: 'AU', city: 'Melbourne' }
).explain('executionStats');

or call it on the collection before the query:

db.user.explain('executionStats').find(
  { country: 'AU', city: 'Melbourne' }
);

This returns a large JSON result, but there are two primary values to examine:

  • executionStats.nReturned — the number of documents returned, and
  • executionStats.totalDocsExamined — the number of documents scanned to find the result.

If the number of documents examined greatly exceeds the number returned, the query may not be efficient. In the worst cases, MongoDB might have to scan every document in the collection. The query would therefore benefit from the use of an index.
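That comparison is easy to automate. The sketch below reads the two values from an explain result and flags queries which examine far more documents than they return. The summarizeExplain name and the ratio of 10 are arbitrary assumptions of mine, and the two sample objects are hand-written stand-ins rather than real explain output.

```javascript
// Hypothetical helper: compare documents examined with documents returned.
function summarizeExplain(explainResult, maxRatio = 10) {
  const stats = explainResult.executionStats;
  const returned = stats.nReturned;
  const examined = stats.totalDocsExamined;
  // Avoid dividing by zero when the query matched nothing.
  const ratio = returned === 0 ? examined : examined / returned;
  return { returned, examined, ratio, suspect: ratio > maxRatio };
}

// Hand-written stand-ins for explain('executionStats') results.
const collectionScan = { executionStats: { nReturned: 20, totalDocsExamined: 50000 } };
const indexedQuery = { executionStats: { nReturned: 20, totalDocsExamined: 20 } };

console.log(summarizeExplain(collectionScan).suspect); // true
console.log(summarizeExplain(indexedQuery).suspect);   // false
```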

For more information and examples, refer to Analyze Query Performance and db.collection.explain() in the MongoDB manual.

3. Add Appropriate Indexes

NoSQL databases require indexes, just like their relational cousins. An index is built from a set of one or more fields to make querying fast. For example, you could index the country field in a user collection. When a query searches for ‘AU’, MongoDB can find it in the index and reference all matching documents without having to scan the entire user collection.
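To see why this helps, here is a toy model in plain JavaScript. It is not how MongoDB stores indexes internally. It uses a simple Map over made-up data, and all the names are my own. It only shows that a lookup structure keyed on country avoids touching documents which can't match.

```javascript
// Made-up documents standing in for the user collection.
const users = [
  { name: 'Ann', country: 'AU' },
  { name: 'Bob', country: 'UK' },
  { name: 'Cat', country: 'AU' },
  { name: 'Dan', country: 'US' }
];

// Without an index: every document is examined.
function scanByCountry(docs, country) {
  let examined = 0;
  const matches = docs.filter(doc => {
    examined++;
    return doc.country === country;
  });
  return { matches, examined };
}

// Toy "index": country value -> positions of the matching documents.
function buildCountryIndex(docs) {
  const index = new Map();
  docs.forEach((doc, position) => {
    if (!index.has(doc.country)) index.set(doc.country, []);
    index.get(doc.country).push(position);
  });
  return index;
}

// With the index: only the matching documents are fetched.
function findByCountry(docs, index, country) {
  const positions = index.get(country) || [];
  return { matches: positions.map(p => docs[p]), examined: positions.length };
}

const countryIndex = buildCountryIndex(users);
console.log(scanByCountry(users, 'AU').examined);               // 4
console.log(findByCountry(users, countryIndex, 'AU').examined); // 2
```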

Indexes are created with createIndex. The most basic command indexes the country field in the user collection in ascending order:

db.user.createIndex({ country: 1 });

The majority of your indexes are likely to be single fields, but you can also create compound indexes on two or more fields. For example:

db.user.createIndex({ country: 1, city: 1 });

There are many indexing options, so refer to the MongoDB manual Index Introduction for more information.

4. Be Wary When Sorting

You almost certainly want to sort results, e.g. return all users in ascending country-code order:

db.user.find().sort({ country: 1 });

Sorting works effectively when you have an index defined. Either the single or compound index defined above would be suitable.

If you don’t have an index defined, MongoDB must sort the result itself, and this can be problematic when sorting a large set of returned documents. The database imposes a 32MB memory limit on sorting operations and, in my experience, 1,000 relatively small documents is enough to push it over the edge. MongoDB won’t necessarily return an error; you may just receive an empty set of records.

The sorting limit can strike in unexpected ways. Presume you have an index on the country code like before:

db.user.createIndex({ country: 1 });

A query now sorts on the country and city both in ascending order:

db.user.find().sort({ country: 1, city: 1 });

While the country index can be used, MongoDB must still sort by the secondary city field itself. This is slow, and could exceed the 32MB sorting memory limit. You should therefore create a compound index:

db.user.createIndex({ country: 1, city: 1 });

The sort operation is now fully indexed and will run quickly. You can also sort in reverse country and city order because MongoDB can start at the end of the index and work backward. For example:

db.user.find().sort({ country: -1, city: -1 });

However, problems arise if you attempt to sort in descending country order but ascending city order:

db.user.find().sort({ country: -1, city: 1 });

Our index cannot be used, so you must either disallow non-indexed secondary sorting criteria or create another suitable index:

db.user.createIndex({ country: -1, city: 1 });

Again, this could also be used for queries which reversed the order:

db.user.find().sort({ country: 1, city: -1 });
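The rule running through these examples can be captured in a few lines. This is a simplified sketch of my own, not MongoDB's query planner. It assumes a query with no filter and treats a sort as fully indexed only when its fields follow the index's leading fields with every direction matching, or every direction reversed.

```javascript
// Hypothetical helper: can this index provide this sort order on its own?
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;

  // The sort fields must match the leading index fields, in the same order.
  const sameFields = sortKeys.every(([field], i) => indexKeys[i][0] === field);
  if (!sameFields) return false;

  // Walk the index forward (all directions equal) or backward (all inverted).
  const forward = sortKeys.every(([, dir], i) => dir === indexKeys[i][1]);
  const backward = sortKeys.every(([, dir], i) => dir === -indexKeys[i][1]);
  return forward || backward;
}

const ascending = { country: 1, city: 1 };
console.log(indexSupportsSort(ascending, { country: 1, city: 1 }));   // true
console.log(indexSupportsSort(ascending, { country: -1, city: -1 })); // true
console.log(indexSupportsSort(ascending, { country: -1, city: 1 }));  // false
console.log(indexSupportsSort({ country: -1, city: 1 }, { country: 1, city: -1 })); // true
```

A check like this could be used to reject sort options your indexes can't serve before the query ever reaches the database.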

5. Create Two or More Connection Objects

When building an application, you can increase efficiency with a single persistent database connection object which is reused for all queries and updates.

MongoDB runs all commands in the order it receives them from each client connection. While your application may make asynchronous calls to the database, every command is synchronously queued and must complete before the next can be processed. If you have a complex query which takes ten seconds to run, no one else can interact with your application at the same time on the same connection.

Performance can be improved by defining more than one database connection object. For example:

  1. one to handle the majority of fast queries
  2. one to handle slower document inserts and updates
  3. one to handle complex report generation.

Each object is treated as a separate database client and will not delay the processing of others. The application should remain responsive.
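The effect is easiest to see with a toy model of the queuing described above. This is not driver code. It assumes every command is submitted at time zero and that each connection works through its own commands strictly in order. The names and durations are made up.

```javascript
// Toy model: each connection processes its commands one at a time, in order.
function finishTimes(commands) {
  const busyUntil = {}; // connection name -> time (ms) at which it becomes free
  const finished = {};  // command name -> completion time (ms)
  for (const { name, connection, durationMs } of commands) {
    const start = busyUntil[connection] || 0;
    busyUntil[connection] = start + durationMs;
    finished[name] = busyUntil[connection];
  }
  return finished;
}

const report = { name: 'report', durationMs: 10000 }; // the ten-second query
const lookup = { name: 'lookup', durationMs: 5 };     // a typical fast query

// One shared connection: the lookup waits behind the report.
const shared = finishTimes([
  { ...report, connection: 'main' },
  { ...lookup, connection: 'main' }
]);

// Separate connections: the lookup is not held up.
const separate = finishTimes([
  { ...report, connection: 'reports' },
  { ...lookup, connection: 'fast' }
]);

console.log(shared.lookup);   // 10005
console.log(separate.lookup); // 5
```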

6. Set Maximum Execution Times

MongoDB commands run for as long as they need. A slowly-executing query can hold up others, and your web application may eventually time out. This can cause strange instability problems in Node.js programs, which happily continue to wait for an asynchronous callback.

You can specify a time limit in milliseconds using maxTimeMS(). For example, permit 100 milliseconds (one tenth of a second) to query documents in the user collection where the city field starts with the letter ‘A’:

db.user.find({ city: /^A.+/i }).maxTimeMS(100);

You should set a reasonable maxTimeMS value for any command which is likely to take considerable time. Unfortunately, MongoDB doesn’t allow you to define a global timeout value, and it must be set for individual queries (although some libraries may apply a default automatically).
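One way to avoid forgetting the limit is to route queries through a small wrapper. The following is a minimal sketch. The withTimeLimit name and the 100ms default are my own assumptions, and the fake cursor exists only so the example runs without a database. Any cursor object with a maxTimeMS() method should work in its place.

```javascript
// Assumed application-wide default, in milliseconds.
const DEFAULT_MAX_TIME_MS = 100;

// Hypothetical helper: apply a time limit to any cursor which supports maxTimeMS().
function withTimeLimit(cursor, ms = DEFAULT_MAX_TIME_MS) {
  return cursor.maxTimeMS(ms);
}

// Stand-in cursor which records the limit it was given.
const fakeCursor = {
  limitMs: null,
  maxTimeMS(ms) {
    this.limitMs = ms;
    return this; // cursors return themselves so calls can be chained
  }
};

withTimeLimit(fakeCursor);
console.log(fakeCursor.limitMs); // 100
```

In the mongo shell, the equivalent call would be withTimeLimit(db.user.find({ city: /^A.+/i })), with a larger second argument for queries you expect to be slow.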

7. Rebuild Your Indexes

If you’re satisfied your structure is efficient yet queries are still running slowly, you could try rebuilding indexes on each collection. For example, rebuild the user collection indexes from the mongo command line:

db.user.reIndex();

If all else fails, you could consider a database repair to find and fix any problems. This should be considered a last resort when all other options have been exhausted. I’d recommend a full backup using mongodump or another appropriate method before proceeding.

May all your MongoDB queries remain fast and efficient! Please let me know if you have further performance tips.

Meet the author
Craig is a freelance UK web consultant who built his first page for IE2.0 in 1995. Since that time he's been advocating standards, accessibility, and best-practice HTML5 techniques. He's written more than 1,000 articles for SitePoint and you can find him @craigbuckler
