7 Simple Speed Solutions for MongoDB
MongoDB is a fast NoSQL database. Unfortunately, it’s not a cure for all your performance woes, and a single complex query can bring your code grinding to a halt. I recently suffered this fate, and it can be difficult to know where to look when your application suddenly becomes unstable. I hope these tips help you avoid the pain I went through!
1. Check Your MongoDB Log
By default, MongoDB logs all queries which take longer than 100 milliseconds. The log file’s location is defined in your configuration’s systemLog.path setting, and it’s normally /var/log/mongodb/mongod.log in Debian-based distributions such as Ubuntu.
The log file can be large, so you may want to clear it before profiling. From the mongo command-line console, enter:
use admin;
db.runCommand({ logRotate : 1 });
A new log file will be started and the old data will be available in a file named with the backup date and time. You can delete the backup or move it elsewhere for further analysis.
It can also be useful to watch the log while users are accessing your system. For example:
tail -f /var/log/mongodb/mongod.log
The defaults are reasonable, but you can configure the log verbosity or change the slow-query threshold (slowms, set by operationProfiling.slowOpThresholdMs in the configuration file) to something other than 100 milliseconds. You could initially set it to one second to catch the worst offending queries, then halve it after every set of successful fixes.
Look out for lines containing ‘COMMAND’ with the execution time in milliseconds at the end. For example:
2016-02-12T11:05:08.161+0000 I COMMAND
[conn563] command project.$cmd
command: count {
count: "test",
query: { published: { $ne: false },
country: "uk" }
}
planSummary: IXSCAN { country: 1 }
keyUpdates:0
writeConflicts:0
numYields:31
reslen:44
locks: {
Global: {
acquireCount: { r: 64 }
},
MMAPV1Journal: {
acquireCount: { r: 32 }
},
Database: {
acquireCount: { r: 32 }
},
Collection: {
acquireCount: { R: 32 }
}
} 403ms
This will help you determine where potential bottlenecks lie.
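If the log is long, a small script can pick out the worst offenders. The following is a minimal sketch, assuming the plain-text log format shown above, where each entry sits on one line and ends with its duration (MongoDB 4.4 and later write structured JSON logs, which would need a different parser). The names durationMs, slowCommands and thresholdMs are illustrative and not part of MongoDB, and the sample lines are made up for the example.

```javascript
// Extract the trailing "<n>ms" duration from a plain-text mongod log line.
// Returns null when the line does not end with a duration.
function durationMs(line) {
  const match = /(\d+)ms\s*$/.exec(line);
  return match ? Number(match[1]) : null;
}

// Keep COMMAND entries at or above the threshold, slowest first.
function slowCommands(lines, thresholdMs) {
  return lines
    .filter((line) => line.includes(' COMMAND '))
    .map((line) => ({ ms: durationMs(line), line }))
    .filter((entry) => entry.ms !== null && entry.ms >= thresholdMs)
    .sort((a, b) => b.ms - a.ms);
}

// Hypothetical sample lines, shortened from the format shown above.
const sample = [
  '2016-02-12T11:05:08.161+0000 I COMMAND [conn563] command project.$cmd command: count { count: "test" } planSummary: IXSCAN { country: 1 } 403ms',
  '2016-02-12T11:05:09.000+0000 I NETWORK [conn564] end connection',
  '2016-02-12T11:05:10.200+0000 I COMMAND [conn565] command project.$cmd command: count { count: "test" } 120ms',
];

// Only the 403ms entry passes a 200ms threshold.
console.log(slowCommands(sample, 200).map((entry) => entry.ms));
```

In practice you would read the lines from the log file itself and start with a high threshold, lowering it as the slowest queries are fixed.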
2. Analyze Your Queries
Like many databases, MongoDB provides an explain facility which reveals how a database operation worked. You can add explain('executionStats') to a query. For example:
db.user.find(
{ country: 'AU', city: 'Melbourne' }
).explain('executionStats');
or call it on the collection before the query:
db.user.explain('executionStats').find(
{ country: 'AU', city: 'Melbourne' }
);
This returns a large JSON result, but there are two primary values to examine:
- executionStats.nReturned: the number of documents returned
- executionStats.totalDocsExamined: the number of documents scanned to find the result.
If the number of documents examined greatly exceeds the number returned, the query may not be efficient. In the worst cases, MongoDB might have to scan every document in the collection. The query would therefore benefit from the use of an index.
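The comparison can be automated when you are checking many queries. This is a rough sketch rather than an official metric: the function name needsIndex and the default ratio of 10 are assumptions chosen for illustration, and the two explain results are hypothetical objects containing only the fields discussed above.

```javascript
// Flag a query whose explain() output shows far more documents examined
// than returned. maxRatio is an arbitrary illustrative cut-off.
function needsIndex(explainResult, maxRatio = 10) {
  const { nReturned, totalDocsExamined } = explainResult.executionStats;
  // Multiplying avoids dividing by zero when nothing was returned.
  return totalDocsExamined > nReturned * maxRatio;
}

// Hypothetical explain results, trimmed to the two relevant values.
const fullScan = { executionStats: { nReturned: 25, totalDocsExamined: 50000 } };
const indexed = { executionStats: { nReturned: 25, totalDocsExamined: 25 } };

console.log(needsIndex(fullScan)); // true
console.log(needsIndex(indexed));  // false
```

With a real query you would pass in the object returned by explain('executionStats').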
For more information and examples, refer to Analyze Query Performance and db.collection.explain() in the MongoDB manual.
3. Add Appropriate Indexes
NoSQL databases require indexes, just like their relational cousins. An index is built from a set of one or more fields to make querying fast. For example, you could index the country field in a user collection. When a query searches for ‘AU’, MongoDB can find it in the index and reference all matching documents without having to scan the entire user collection.
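The idea can be illustrated with a toy model in plain JavaScript. This is only a conceptual sketch: MongoDB’s real indexes are B-tree structures, not a Map, and the users array and buildIndex function are invented for the example.

```javascript
// A tiny in-memory "collection" (hypothetical data).
const users = [
  { name: 'Ann', country: 'AU' },
  { name: 'Bob', country: 'UK' },
  { name: 'Cat', country: 'AU' },
];

// Build a lookup from each field value to the documents that contain it.
function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  return index;
}

// Without an index, every document must be examined.
const scanned = users.filter((doc) => doc.country === 'AU');

// With an index, matching documents are found by a single lookup.
const countryIndex = buildIndex(users, 'country');
const lookedUp = countryIndex.get('AU') || [];

console.log(scanned.length, lookedUp.length); // 2 2
```

Both approaches return the same documents. The difference is how much work is done per query once the index exists.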
Indexes are created with createIndex. The most basic command indexes the country field in the user collection in ascending order:
db.user.createIndex({ country: 1 });
The majority of your indexes are likely to be single fields, but you can also create compound indexes on two or more fields. For example:
db.user.createIndex({ country: 1, city: 1 });
There are many indexing options, so refer to the MongoDB manual Index Introduction for more information.
4. Be Wary When Sorting
You almost certainly want to sort results, e.g. return all users in ascending country-code order:
db.user.find().sort({ country: 1 });
Sorting works effectively when you have an index defined. Either the single or compound index defined above would be suitable.
If you don’t have an index defined, MongoDB must sort the results itself in memory, and this can be problematic for a large set of returned documents. The database imposes a 32MB memory limit on sorting operations (raised to 100MB in MongoDB 4.4) and, in my experience, 1,000 relatively small documents is enough to push it over the edge. You won’t necessarily see an error, just an empty set of records.
The sorting limit can strike in unexpected ways. Suppose you have an index on the country code like before:
db.user.createIndex({ country: 1 });
A query now sorts on the country and city, both in ascending order:
db.user.find().sort({ country: 1, city: 1 });
The country index does not cover the whole sort, so MongoDB must still order the results by the secondary city field itself. This is slow, and could exceed the 32MB sorting memory limit. You should therefore create a compound index:
db.user.createIndex({ country: 1, city: 1 });
The sort operation is now fully indexed and will run quickly. You can also sort in reverse country and city order because MongoDB can start at the end of the index and work backward. For example:
db.user.find().sort({ country: -1, city: -1 });
However, problems arise if you attempt to sort in descending country order but ascending city order:
db.user.find().sort({ country: -1, city: 1 });
Our index cannot be used, so you must either disallow non-indexed secondary sorting criteria or create another suitable index:
db.user.createIndex({ country: -1, city: 1 });
Again, this index can also serve queries which reverse both directions:
db.user.find().sort({ country: 1, city: -1 });
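The pattern in these examples can be written as a small check. This is a simplified sketch of the rule as described in this section, not MongoDB’s query planner: it assumes the sort keys must be a leading prefix of the index keys, with directions either all matching or all reversed, and it ignores any query filter. The function name indexSupportsSort is made up for illustration.

```javascript
// Return true when the index key pattern can supply the requested sort order.
// Both arguments use MongoDB's { field: 1 | -1 } notation.
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;

  let forward = true;  // index can be walked front to back
  let backward = true; // index can be walked back to front
  sortKeys.forEach(([field, direction], i) => {
    const [indexField, indexDirection] = indexKeys[i];
    if (field !== indexField) {
      forward = false;
      backward = false;
    } else if (direction === indexDirection) {
      backward = false;
    } else {
      forward = false;
    }
  });
  return forward || backward;
}

const compound = { country: 1, city: 1 };
console.log(indexSupportsSort(compound, { country: 1, city: 1 }));   // true
console.log(indexSupportsSort(compound, { country: -1, city: -1 })); // true
console.log(indexSupportsSort(compound, { country: -1, city: 1 }));  // false
console.log(indexSupportsSort({ country: -1, city: 1 }, { country: 1, city: -1 })); // true
```

A check like this could be used in application code to reject sort options that no index supports, which is one way to disallow non-indexed sorting criteria.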
5. Create Two or More Connection Objects
When building an application, you can increase efficiency with a single persistent database connection object which is reused for all queries and updates.
MongoDB runs all commands in the order it receives them from each client connection. While your application may make asynchronous calls to the database, every command is synchronously queued and must complete before the next can be processed. If you have a complex query which takes ten seconds to run, no one else can interact with your application at the same time on the same connection.
Performance can be improved by defining more than one database connection object. For example:
- one to handle the majority of fast queries
- one to handle slower document inserts and updates
- one to handle complex report generation.
Each object is treated as a separate database client and will not delay the processing of others. The application should remain responsive.
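The effect of the queueing described above can be shown with a toy timing model. This sketch does not connect to MongoDB. It assumes every command is issued at time zero and that each connection processes its commands strictly in order. The function name finishTimes and the connection names are invented for the example.

```javascript
// Given commands tagged with a connection name and a duration, work out
// when each one finishes if every connection runs its queue in order.
function finishTimes(commands) {
  const busyUntil = new Map(); // connection name -> time it becomes free (ms)
  return commands.map(({ name, connection, ms }) => {
    const start = busyUntil.get(connection) || 0;
    const end = start + ms;
    busyUntil.set(connection, end);
    return { name, connection, finishedAt: end };
  });
}

// One shared connection: the quick lookup waits behind the report.
const oneConnection = finishTimes([
  { name: 'report', connection: 'main', ms: 10000 },
  { name: 'lookup', connection: 'main', ms: 5 },
]);

// Separate connections: the lookup is not delayed.
const twoConnections = finishTimes([
  { name: 'report', connection: 'reports', ms: 10000 },
  { name: 'lookup', connection: 'fast', ms: 5 },
]);

console.log(oneConnection[1].finishedAt);  // 10005
console.log(twoConnections[1].finishedAt); // 5
```

In a real application, each named connection in the model would correspond to its own database connection object created through your driver.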
6. Set Maximum Execution Times
MongoDB commands run as long as they need. A slowly-executing query can hold up others, and your web application may eventually time out. This can cause strange instability problems in Node.js programs, which happily continue to wait for an asynchronous callback.
You can specify a time limit in milliseconds using maxTimeMS(). For example, permit 100 milliseconds (one tenth of a second) to query documents in the user collection where the city field starts with the letter ‘A’:
db.user.find({ city: /^A.+/i }).maxTimeMS(100);
You should set a reasonable maxTimeMS value for any command which is likely to take considerable time. Unfortunately, MongoDB doesn’t allow you to define a global timeout value, and it must be set for individual queries (although some libraries may apply a default automatically).
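One way to avoid repeating the limit everywhere is a small wrapper that applies a default. The following is a sketch under stated assumptions: findWithLimit and DEFAULT_MAX_TIME_MS are hypothetical names, and the only MongoDB API it relies on is the cursor’s maxTimeMS() method. A stand-in collection object is included so the example runs without a database.

```javascript
// Application-wide default limit (illustrative value).
const DEFAULT_MAX_TIME_MS = 100;

// Run find() with a time limit, falling back to the default when none is given.
function findWithLimit(collection, filter, maxTimeMs = DEFAULT_MAX_TIME_MS) {
  return collection.find(filter).maxTimeMS(maxTimeMs);
}

// Stand-in for a real collection: records the limit instead of querying.
const fakeCollection = {
  find(filter) {
    const cursor = {
      filter,
      limitMs: null,
      maxTimeMS(ms) {
        cursor.limitMs = ms;
        return cursor;
      },
    };
    return cursor;
  },
};

console.log(findWithLimit(fakeCollection, { city: /^A.+/i }).limitMs);       // 100
console.log(findWithLimit(fakeCollection, { country: 'AU' }, 2000).limitMs); // 2000
```

With a real collection object, the returned cursor would be used as normal, and slow reports could pass a larger limit explicitly.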
7. Rebuild Your Indexes
If you’re satisfied your structure is efficient yet queries are still running slowly, you could try rebuilding indexes on each collection. For example, rebuild the user collection indexes from the mongo command line:
db.user.reIndex();
If all else fails, you could consider a database repair to find and fix any problems. This should be considered a last resort when all other options have been exhausted. I’d recommend a full backup using mongodump or another appropriate method before progressing.
May all your MongoDB queries remain fast and efficient! Please let me know if you have further performance tips.
