7 Simple Speed Solutions for MongoDB


MongoDB is a fast NoSQL database. Unfortunately, it’s not a cure for all your performance woes, and a single complex query can bring your code grinding to a halt. I recently suffered this fate, and it can be difficult to know where to look when your application suddenly becomes unstable. I hope these tips help you avoid the pain I went through!

Key Takeaways

Regularly checking the MongoDB log and analyzing queries can help identify potential performance issues, as MongoDB records all queries that take longer than 100 milliseconds by default.
Indexing, both single and compound, can significantly improve query performance. However, caution must be exercised when sorting results, especially when dealing with large sets of returned documents, as MongoDB imposes a 32MB memory limit on sorting operations.
Creating multiple connection objects, setting maximum execution times for queries, and occasionally rebuilding indexes can also enhance MongoDB’s performance. However, a database repair should be considered a last resort, only after all other options have been exhausted.

1. Check Your MongoDB Log

By default, MongoDB logs all queries which take longer than 100 milliseconds. The log file’s location is defined in your configuration’s systemLog.path setting, and it’s normally /var/log/mongodb/mongod.log in Debian-based distributions such as Ubuntu.

The log file can be large, so you may want to clear it before profiling. From the mongo command-line console, enter:



use admin;
db.runCommand({ logRotate : 1 });

A new log file will be started and the old data will be available in a file named with the backup date and time. You can delete the backup or move it elsewhere for further analysis.

It can also be useful to watch the log while users are accessing your system. For example:


tail -f /var/log/mongodb/mongod.log

The defaults are reasonable, but you can configure the log level verbosity or modify profiling parameters and change the query time to something other than 100 milliseconds. You could initially set it to one second to catch the worst offending queries, then halve it after every set of successful fixes.

Look out for lines containing ‘COMMAND’ with the execution time in milliseconds at the end. For example:

2016-02-12T11:05:08.161+0000 I COMMAND
    [conn563] command project.$cmd
    command: count {
        count: "test",
        query: { published: { $ne: false }, country: "uk" }
    }
    planSummary: IXSCAN { country: 1 }
    keyUpdates:0
    writeConflicts:0
    numYields:31
    reslen:44
    locks: {
        Global: { acquireCount: { r: 64 } },
        MMAPV1Journal: { acquireCount: { r: 32 } },
        Database: { acquireCount: { r: 32 } },
        Collection: { acquireCount: { R: 32 } }
    } 403ms

This will help you determine where potential bottlenecks lie.
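Scanning a large log by eye gets tedious, so a small script can pull out the slowest entries. The following is a minimal sketch, assuming the plain-text log format shown above (MongoDB 4.4 and later write structured JSON logs, which it does not handle). The function name slowCommands and the shortened sample lines are hypothetical and for illustration only.

```javascript
// Sketch: extract the trailing "<n>ms" duration from COMMAND log lines
// and keep those at or above a threshold, slowest first.
function slowCommands(logText, thresholdMs) {
  const results = [];
  for (const line of logText.split('\n')) {
    if (!line.includes(' COMMAND ')) continue; // skip NETWORK, STORAGE, etc.
    const match = line.match(/(\d+)ms\s*$/);   // duration is the last token
    if (!match) continue;
    const durationMs = Number(match[1]);
    if (durationMs >= thresholdMs) results.push({ durationMs, line });
  }
  return results.sort((a, b) => b.durationMs - a.durationMs);
}

// Hypothetical, shortened log lines (each entry is a single line in a real log).
const sampleLog = [
  '2016-02-12T11:05:08.161+0000 I COMMAND [conn563] command project.$cmd command: count { count: "test" } planSummary: IXSCAN { country: 1 } 403ms',
  '2016-02-12T11:05:09.002+0000 I NETWORK [conn564] end connection',
  '2016-02-12T11:05:10.450+0000 I COMMAND [conn565] command project.$cmd command: count { count: "test" } 1250ms',
].join('\n');

const slow = slowCommands(sampleLog, 400);
console.log(slow.map((entry) => entry.durationMs));
```

Raising or lowering the threshold argument mirrors the approach of starting at one second and halving it after each round of fixes.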

2. Analyze Your Queries

Like many databases, MongoDB provides an explain facility which reveals how a database operation worked. You can add explain('executionStats') to a query. For example:



db.user.find(
  { country: 'AU', city: 'Melbourne' }
).explain('executionStats');

or call explain() on the collection before the query:



db.user.explain('executionStats').find(
  { country: 'AU', city: 'Melbourne' }
);

This returns a large JSON result, but there are two primary values to examine:

executionStats.nReturned — the number of documents returned, and
executionStats.totalDocsExamined — the number of documents scanned to find the result.

If the number of documents examined greatly exceeds the number returned, the query may not be efficient. In the worst cases, MongoDB might have to scan every document in the collection. The query would therefore benefit from the use of an index.
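The comparison can be automated once you have the explain output as an object. This is a rough sketch only: the function names, the sample numbers, and the threshold of 10 examined documents per returned document are all assumptions chosen for illustration, not values recommended by MongoDB.

```javascript
// Sketch: compare documents examined with documents returned.
function scanEfficiency(explainResult) {
  const { nReturned, totalDocsExamined } = explainResult.executionStats;
  // Treat zero results as one so the ratio stays finite.
  const ratio = totalDocsExamined / Math.max(nReturned, 1);
  return { nReturned, totalDocsExamined, ratio };
}

// Hypothetical rule of thumb: flag queries that examine many more
// documents than they return.
function needsIndexReview(explainResult, maxRatio = 10) {
  return scanEfficiency(explainResult).ratio > maxRatio;
}

// Made-up explain results containing only the two fields discussed here.
const collectionScan = { executionStats: { nReturned: 25, totalDocsExamined: 50000 } };
const indexedQuery = { executionStats: { nReturned: 25, totalDocsExamined: 25 } };

console.log(scanEfficiency(collectionScan).ratio, needsIndexReview(collectionScan));
console.log(scanEfficiency(indexedQuery).ratio, needsIndexReview(indexedQuery));
```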

For more information and examples, refer to Analyze Query Performance and db.collection.explain() in the MongoDB manual.

3. Add Appropriate Indexes

NoSQL databases require indexes, just like their relational cousins. An index is built from a set of one or more fields to make querying fast. For example, you could index the country field in a user collection. When a query searches for ‘AU’, MongoDB can find it in the index and reference all matching documents without having to scan the entire user collection.
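To make the idea concrete, here is a toy model in plain JavaScript. It is not how MongoDB stores or searches its indexes; the users array and the function names are invented purely to show why a lookup structure examines fewer documents than a full scan.

```javascript
// Toy illustration of the idea behind an index.
const users = [
  { name: 'Ann', country: 'AU' },
  { name: 'Bob', country: 'UK' },
  { name: 'Cat', country: 'AU' },
  { name: 'Dan', country: 'US' },
];

// Without an index: every document is examined.
function scanByCountry(docs, country) {
  let examined = 0;
  const matches = docs.filter((doc) => {
    examined += 1;
    return doc.country === country;
  });
  return { matches, examined };
}

// With an index: a lookup table from field value to document positions.
function buildIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, position) => {
    if (!index.has(doc[field])) index.set(doc[field], []);
    index.get(doc[field]).push(position);
  });
  return index;
}

// Only the positions listed in the index are visited.
function findByIndex(docs, index, value) {
  const positions = index.get(value) || [];
  return { matches: positions.map((p) => docs[p]), examined: positions.length };
}

const countryIndex = buildIndex(users, 'country');
console.log(scanByCountry(users, 'AU').examined);              // documents touched by the scan
console.log(findByIndex(users, countryIndex, 'AU').examined);  // documents touched via the index
```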

Indexes are created with createIndex. The most basic command indexes the country field in the user collection in ascending order:



db.user.createIndex({ country: 1 });

The majority of your indexes are likely to be single fields, but you can also create compound indexes on two or more fields. For example:



db.user.createIndex({ country: 1, city: 1 });

There are many indexing options, so refer to the MongoDB manual Index Introduction for more information.

4. Be Wary When Sorting

You almost certainly want to sort results, e.g. return all users in ascending country-code order:



db.user.find().sort({ country: 1 });

Sorting works effectively when you have an index defined. Either the single or compound index defined above would be suitable.

If you don’t have an index defined, MongoDB must sort the result itself, and this can be problematic when analyzing a large set of returned documents. The database imposes a 32MB memory limit on sorting operations (raised to 100MB in MongoDB 4.4 and later) and, in my experience, 1,000 relatively small documents is enough to push it over the edge. MongoDB won’t necessarily return an error; you may just receive an empty set of records.

The sorting limit can strike in unexpected ways. Presume you have an index on the country code like before:



db.user.createIndex({ country: 1 });

A query now sorts on the country and city both in ascending order:



db.user.find().sort({ country: 1, city: 1 });

Because the country index does not include the city field, it cannot supply this order, and MongoDB must sort the documents itself. This is slow, and could exceed the 32MB sorting memory limit. You should therefore create a compound index:



db.user.createIndex({ country: 1, city: 1 });

The sort operation is now fully indexed and will run quickly. You can also sort in reverse country and city order because MongoDB can start at the end of the index and work backward. For example:



db.user.find().sort({ country: -1, city: -1 });

However, problems arise if you attempt to sort in descending country order but ascending city order:



db.user.find().sort({ country: -1, city: 1 });

Our index cannot be used, so you must either disallow non-indexed secondary sorting criteria or create another suitable index:



db.user.createIndex({ country: -1, city: 1 });

Again, this index could also be used for queries which reverse the order of both fields:



db.user.find().sort({ country: 1, city: -1 });
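The rules in this section can be summarized as a small check. This is a simplified sketch rather than MongoDB's actual query planner: it only covers a find() with no filter, and the function name indexSupportsSort is hypothetical. It treats a sort as index-backed when its fields form a prefix of the index and its directions either all match the index or are all reversed.

```javascript
// Sketch: can an index alone provide the order for a sort?
// Simplified model for a find() with no query filter.
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;

  let sameDirection = true;      // index walked forward
  let reversedDirection = true;  // index walked backward
  sortKeys.forEach(([field, direction], i) => {
    const [indexField, indexDirection] = indexKeys[i];
    if (field !== indexField) {
      sameDirection = false;
      reversedDirection = false;
      return;
    }
    if (direction !== indexDirection) sameDirection = false;
    if (direction !== -indexDirection) reversedDirection = false;
  });
  return sameDirection || reversedDirection;
}

const compound = { country: 1, city: 1 };
console.log(indexSupportsSort(compound, { country: 1, city: 1 }));
console.log(indexSupportsSort(compound, { country: -1, city: -1 }));
console.log(indexSupportsSort(compound, { country: -1, city: 1 }));
console.log(indexSupportsSort({ country: -1, city: 1 }, { country: 1, city: -1 }));
```

Running explain('executionStats') on the real query remains the reliable way to confirm whether a sort is index-backed.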

5. Create Two or More Connection Objects

When building an application, you can increase efficiency with a single persistent database connection object which is reused for all queries and updates.

MongoDB runs all commands in the order it receives them from each client connection. While your application may make asynchronous calls to the database, every command is synchronously queued and must complete before the next can be processed. If you have a complex query which takes ten seconds to run, no one else can interact with your application at the same time on the same connection.

Performance can be improved by defining more than one database connection object. For example:

one to handle the majority of fast queries
one to handle slower document inserts and updates
one to handle complex report generation.

Each object is treated as a separate database client and will not delay the processing of others. The application should remain responsive.
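The effect of the queueing described above can be shown with a toy calculation. This sketch does not talk to MongoDB and ignores driver connection pooling and server-side concurrency; the connection names, command names, and durations are invented, and every command is assumed to be issued at time zero.

```javascript
// Toy model: each connection runs its commands one after another,
// while different connections proceed independently.
function finishTimes(commands) {
  const busyUntil = new Map(); // connection name -> time (ms) its queue frees up
  return commands.map(({ connection, name, durationMs }) => {
    const start = busyUntil.get(connection) || 0;
    const end = start + durationMs;
    busyUntil.set(connection, end);
    return { name, connection, finishedAtMs: end };
  });
}

// Hypothetical workload: one slow report and one fast lookup.
const report = { name: 'monthly report', durationMs: 10000 };
const lookup = { name: 'user lookup', durationMs: 5 };

// Both commands share a single connection object.
const shared = finishTimes([
  { ...report, connection: 'main' },
  { ...lookup, connection: 'main' },
]);

// The report has its own connection object.
const split = finishTimes([
  { ...report, connection: 'reports' },
  { ...lookup, connection: 'fast' },
]);

console.log(shared[1].finishedAtMs); // lookup waits behind the report
console.log(split[1].finishedAtMs);  // lookup completes on its own queue
```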

6. Set Maximum Execution Times

MongoDB commands run as long as they need. A slowly-executing query can hold up others, and your web application may eventually time out. This can cause strange instability problems in Node.js programs, which happily continue to wait for an asynchronous callback.

You can specify a time limit in milliseconds using maxTimeMS(). For example, permit 100 milliseconds (one tenth of a second) to query documents in the user collection where the city field starts with the letter ‘A’:



db.user.find({ city: /^A.+/i }).maxTimeMS(100);

You should set a reasonable maxTimeMS value for any command which is likely to take considerable time. Unfortunately, MongoDB doesn’t allow you to define a global timeout value, and it must be set for individual queries (although some libraries may apply a default automatically).
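One way to avoid forgetting the limit is to route query options through a small helper in your own application code. This is a sketch under assumptions: the helper name withTimeLimit and the two-second default are hypothetical, and it presumes your driver accepts a maxTimeMS property in its query options (the official Node.js driver does for find()).

```javascript
// Sketch: apply an application-wide default time limit to query options.
const DEFAULT_MAX_TIME_MS = 2000; // hypothetical default; tune for your application

function withTimeLimit(options = {}, defaultMs = DEFAULT_MAX_TIME_MS) {
  const merged = { ...options };
  // An explicit maxTimeMS on the call wins over the default.
  if (merged.maxTimeMS === undefined) merged.maxTimeMS = defaultMs;
  return merged;
}

console.log(withTimeLimit());
console.log(withTimeLimit({ maxTimeMS: 100, limit: 10 }));
console.log(withTimeLimit({ sort: { country: 1 } }, 500));
```

With the Node.js driver, the result could then be passed as the second argument to find(), for example collection.find(filter, withTimeLimit()).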

7. Rebuild Your Indexes

If you’re satisfied your structure is efficient yet queries are still running slowly, you could try rebuilding indexes on each collection. For example, rebuild the user collection indexes from the mongo command line:



db.user.reIndex();

If all else fails, you could consider a database repair to find and fix any problems. This should be considered a last resort when all other options have been exhausted. I’d recommend a full backup using mongodump or another appropriate method before progressing.

May all your MongoDB queries remain fast and efficient! Please let me know if you have further performance tips.

Frequently Asked Questions (FAQs) about MongoDB Speed Solutions

What are some common reasons for slow performance in MongoDB?

Slow performance in MongoDB can be attributed to several factors. One of the most common reasons is improper indexing. Without the right indexes, MongoDB has to scan every document in a collection to select those that match the query statement. This can be incredibly time-consuming, especially with large databases. Other reasons include insufficient RAM, poor schema design, and network latency.

How can I optimize indexing in MongoDB for better performance?

Indexing is a powerful tool in MongoDB that can significantly improve query performance. To optimize indexing, you should first identify the fields that are most frequently used in your queries and create indexes for them. Also, consider compound indexes if you often use multiple fields in your queries. However, be mindful not to over-index as it can consume more system resources and slow down write operations.

How does sharding improve MongoDB’s performance?

Sharding is a method of distributing data across multiple machines. It allows MongoDB to support deployments with very large data sets and high throughput operations. By distributing the data, sharding can help to overcome hardware limitations and improve query performance. However, it’s important to choose a good shard key to ensure balanced distribution of data.

What is the role of RAM in MongoDB’s performance?

MongoDB stores data in RAM as much as possible to improve read and write speeds. When the size of your data exceeds the available RAM, MongoDB has to read from the disk, which is significantly slower. Therefore, having sufficient RAM is crucial for optimal performance.

How can I monitor the performance of my MongoDB database?

MongoDB provides several tools for monitoring performance, including MongoDB Atlas, which offers a real-time performance panel, automated alerts, and metrics for hardware and database performance. You can also use the database profiler to log performance data for all operations, or only for operations that exceed a specified time limit.

How does schema design affect MongoDB’s performance?

In MongoDB, the way you structure your data can significantly impact performance. A well-designed schema can optimize query performance and storage efficiency. For example, embedding related data in a single document can reduce the need for expensive join operations. However, excessive embedding can lead to large documents and increased memory usage.

What is the impact of network latency on MongoDB’s performance?

Network latency can significantly affect MongoDB’s performance, especially in distributed systems like sharded clusters or replica sets. High network latency can slow down data transfer between nodes and lead to slow query responses. Therefore, it’s important to monitor and minimize network latency for optimal performance.

How can I improve write performance in MongoDB?

Write performance in MongoDB can be improved by using bulk operations, which combine many write operations into a single database command. Also, consider using write concern “w:1” (acknowledgment from the primary only) rather than “w: majority” for less critical data to improve write speed. However, this may compromise data durability in case of a failure.

How does MongoDB handle large amounts of data?

MongoDB is designed to handle large amounts of data. It uses techniques like sharding to distribute data across multiple machines, allowing it to support very large data sets. Additionally, MongoDB’s flexible schema makes it easy to store and process diverse data types.

Can I use MongoDB for real-time applications?

Yes, MongoDB is well-suited for real-time applications. It offers features like change streams and real-time notifications that allow applications to react to changes in the database in real time. However, for optimal performance, it’s important to properly index your data and monitor your system’s performance regularly.