7 Simple Speed Solutions for MongoDB

Key Takeaways
- Regularly checking the MongoDB log and analyzing queries can help identify potential performance issues, as MongoDB records all queries that take longer than 100 milliseconds by default.
- Indexing, both single and compound, can significantly improve query performance. However, caution must be exercised when sorting results, especially when dealing with large sets of returned documents, as MongoDB imposes a 32MB memory limit on sorting operations.
- Creating multiple connection objects, setting maximum execution times for queries, and occasionally rebuilding indexes can also enhance MongoDB’s performance. However, a database repair should be considered a last resort, only after all other options have been exhausted.
1. Check Your MongoDB Log
By default, MongoDB records all queries which take longer than 100 milliseconds. The log file's location is defined in your configuration's systemLog.path setting, and it's normally /var/log/mongodb/mongod.log on Debian-based distributions such as Ubuntu.
The log file can be large, so you may want to clear it before profiling. From the mongo command-line console, enter:
use admin;
db.runCommand({ logRotate : 1 });
A new log file will be started and the old data will be available in a file named with the backup date and time. You can delete the backup or move it elsewhere for further analysis.
It can also be useful to watch the log while users are accessing your system. For example:
tail -f /var/log/mongodb/mongod.log
The defaults are reasonable, but you can configure the log verbosity or modify the profiling parameters to change the slow-query threshold to something other than 100 milliseconds. You could initially set it to one second to catch the worst offending queries, then halve it after every set of successful fixes.
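You can change the threshold from the mongo shell with db.setProfilingLevel(); the slowms value it sets is also used for logging even when the profiler itself is disabled. A minimal sketch, setting the threshold to one second:
// Level 0 keeps the profiler off; the second argument sets slowms,
// so operations over 1000ms are still written to the log.
db.setProfilingLevel(0, 1000);
The same threshold can be set permanently via the operationProfiling.slowOpThresholdMs setting in the mongod configuration file.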
Look out for lines containing ‘COMMAND’ with the execution time in milliseconds at the end. For example:
2016-02-12T11:05:08.161+0000 I COMMAND [conn563] command project.$cmd
command: count {
  count: "test",
  query: { published: { $ne: false }, country: "uk" }
}
planSummary: IXSCAN { country: 1 }
keyUpdates:0 writeConflicts:0 numYields:31 reslen:44
locks: {
  Global: { acquireCount: { r: 64 } },
  MMAPV1Journal: { acquireCount: { r: 32 } },
  Database: { acquireCount: { r: 32 } },
  Collection: { acquireCount: { R: 32 } }
} 403ms
This will help you determine where potential bottlenecks lie.
2. Analyze Your Queries
Like many databases, MongoDB provides an explain facility which reveals how a database operation worked. You can add explain('executionStats') to a query. For example:
db.user.find(
{ country: 'AU', city: 'Melbourne' }
).explain('executionStats');
or append it to the collection:
db.user.explain('executionStats').find(
{ country: 'AU', city: 'Melbourne' }
);
This returns a large JSON result, but there are two primary values to examine:
- executionStats.nReturned: the number of documents returned
- executionStats.totalDocsExamined: the number of documents scanned to find the result.
When totalDocsExamined is far higher than nReturned, the query is scanning documents it doesn't need, which usually indicates a missing index.
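If you want to compare the two values without wading through the full output, you can extract them directly in the shell. A minimal sketch using the query above:
// Pull the key statistics out of the explain result.
var stats = db.user.find(
  { country: 'AU', city: 'Melbourne' }
).explain('executionStats').executionStats;

print('returned: ' + stats.nReturned);
print('examined: ' + stats.totalDocsExamined);
// examined far above returned usually means a missing index.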
3. Add Appropriate Indexes
NoSQL databases require indexes, just like their relational cousins. An index is built from a set of one or more fields to make querying fast. For example, you could index the country field in a user collection. When a query searches for 'AU', MongoDB can find it in the index and reference all matching documents without having to scan the entire user collection.
Indexes are created with createIndex. The most basic command to index the country field in the user collection in ascending order:
db.user.createIndex({ country: 1 });
The majority of your indexes are likely to be single fields, but you can also create compound indexes on two or more fields. For example:
db.user.createIndex({ country: 1, city: 1 });
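You can confirm which indexes a collection already has at any time with getIndexes():
// List every index on the user collection, including the default _id index.
db.user.getIndexes();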
There are many indexing options, so refer to the MongoDB manual Index Introduction for more information.
4. Be Wary When Sorting
You almost certainly want to sort results, e.g. return all users in ascending country-code order:
db.user.find().sort({ country: 1 });
Sorting works effectively when you have an index defined. Either the single or compound index defined above would be suitable.
If you don’t have an index defined, MongoDB must sort the result itself, and this can be problematic when analyzing a large set of returned documents. The database imposes a 32MB memory limit on sorting operations and, in my experience, 1,000 relatively small documents is enough to push it over the edge. MongoDB won’t necessarily return an error — just an empty set of records.
The sorting limit can strike in unexpected ways. Presume you have an index on the country code as before:
db.user.createIndex({ country: 1 });
A query now sorts on both country and city in ascending order:
db.user.find().sort({ country: 1, city: 1 });
While the country index can be used, MongoDB must still sort by the secondary city field itself. This is slow, and could exceed the 32MB sorting memory limit. You should therefore create a compound index:
db.user.createIndex({ country: 1, city: 1 });
The sort operation is now fully indexed and will run quickly. You can also sort in reverse country and city order, because MongoDB can start at the end of the index and work backward. For example:
db.user.find().sort({ country: -1, city: -1 });
However, problems arise if you attempt to sort in descending country order but ascending city order:
db.user.find().sort({ country: -1, city: 1 });
Our index cannot satisfy this mix of directions, so you must either disallow non-indexed secondary sorting criteria or create another suitable index:
db.user.createIndex({ country: -1, city: 1 });
Again, this index can also serve queries with both directions reversed:
db.user.find().sort({ country: 1, city: -1 });
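One way to verify whether a given sort is covered is to check the query plan: an in-memory sort appears as a SORT stage, while a fully indexed sort shows only an IXSCAN. For example:
// A SORT stage in the winning plan means MongoDB sorted in memory.
db.user.find().sort({ country: -1, city: 1 }).explain('executionStats');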
5. Create Two or More Connection Objects
When building an application, you can increase efficiency with a single persistent database connection object which is reused for all queries and updates. However, MongoDB runs all commands in the order it receives them from each client connection. While your application may make asynchronous calls to the database, every command is synchronously queued and must complete before the next can be processed. If you have a complex query which takes ten seconds to run, no one else can interact with your application on the same connection at the same time. Performance can be improved by defining more than one database connection object, for example (see the sketch after this list):
- one to handle the majority of fast queries
- one to handle slower document inserts and updates
- one to handle complex report generation.
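Here's a minimal sketch using the official Node.js driver; the connection URL, the 'project' database name, and the split between clients are illustrative assumptions:
const { MongoClient } = require('mongodb');

// Two independent clients, each with its own connection pool, so a
// long-running report can't queue behind fast day-to-day queries.
const fastClient = new MongoClient('mongodb://localhost:27017');
const reportClient = new MongoClient('mongodb://localhost:27017');

async function init() {
  await fastClient.connect();
  await reportClient.connect();
  return {
    fastDb: fastClient.db('project'),    // quick queries and updates
    reportDb: reportClient.db('project') // slow report generation
  };
}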
6. Set Maximum Execution Times
MongoDB commands run as long as they need. A slowly-executing query can hold up others, and your web application may eventually time out. This can cause various strange instability problems in Node.js programs, which happily continue to wait for an asynchronous callback. You can specify a time limit in milliseconds using maxTimeMS(). For example, permit 100 milliseconds (one tenth of a second) to query documents in the user collection where the city field starts with the letter 'A':
db.user.find({ city: /^A.+/i }).maxTimeMS(100);
You should set a reasonable maxTimeMS value for any command which is likely to take considerable time. Unfortunately, MongoDB doesn't allow you to define a global timeout value, so it must be set for individual queries (although some libraries may apply a default automatically).
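When the limit is exceeded, the server aborts the operation and throws an error you can catch. A minimal shell sketch:
try {
  db.user.find({ city: /^A/i }).maxTimeMS(100).toArray();
} catch (e) {
  // The server stopped the query after 100ms.
  print('Query timed out: ' + e.message);
}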
7. Rebuild Your Indexes
If you're satisfied your structure is efficient yet queries are still running slowly, you could try rebuilding the indexes on each collection. For example, rebuild the user collection indexes from the mongo command line:
db.user.reIndex();
If all else fails, you could consider a database repair to find and fix any problems. This should be considered a last resort when all other options have been exhausted. I'd recommend a full backup using mongodump or another appropriate method before proceeding.
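For example, a typical mongodump invocation (the 'project' database name and backup path are assumptions):
mongodump --db project --out /backup/mongodb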
May all your MongoDB queries remain fast and efficient! Please let me know if you have further performance tips.
Frequently Asked Questions (FAQs) about MongoDB Speed Solutions
What are some common reasons for slow performance in MongoDB?
Slow performance in MongoDB can be attributed to several factors. One of the most common reasons is improper indexing. Without the right indexes, MongoDB has to scan every document in a collection to select those that match the query statement. This can be incredibly time-consuming, especially with large databases. Other reasons include insufficient RAM, poor schema design, and network latency.
How can I optimize indexing in MongoDB for better performance?
Indexing is a powerful tool in MongoDB that can significantly improve query performance. To optimize indexing, you should first identify the fields that are most frequently used in your queries and create indexes for them. Also, consider compound indexes if you often use multiple fields in your queries. However, be mindful not to over-index as it can consume more system resources and slow down write operations.
How does sharding improve MongoDB’s performance?
Sharding is a method of distributing data across multiple machines. It allows MongoDB to support deployments with very large data sets and high throughput operations. By distributing the data, sharding can help to overcome hardware limitations and improve query performance. However, it’s important to choose a good shard key to ensure balanced distribution of data.
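A minimal sketch from a mongos shell, assuming a running sharded cluster and a project.user namespace; the compound shard key here is only illustrative:
// Enable sharding for the database, then shard the collection on a
// compound key chosen to distribute reads and writes evenly.
sh.enableSharding('project');
sh.shardCollection('project.user', { country: 1, _id: 1 });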
What is the role of RAM in MongoDB’s performance?
MongoDB stores data in RAM as much as possible to improve read and write speeds. When the size of your data exceeds the available RAM, MongoDB has to read from the disk, which is significantly slower. Therefore, having sufficient RAM is crucial for optimal performance.
How can I monitor the performance of my MongoDB database?
MongoDB provides several tools for monitoring performance, including MongoDB Atlas, which offers a real-time performance panel, automated alerts, and metrics for hardware and database performance. You can also use the database profiler to log performance data for all operations, or only for those that exceed a specified time limit.
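As a sketch, you can enable the profiler for slow operations only and then inspect the system.profile collection it writes to:
// Profile operations slower than 200ms (level 1 = slow operations only).
db.setProfilingLevel(1, 200);

// Show the five slowest recorded operations.
db.system.profile.find().sort({ millis: -1 }).limit(5);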
How does schema design affect MongoDB’s performance?
In MongoDB, the way you structure your data can significantly impact performance. A well-designed schema can optimize query performance and storage efficiency. For example, embedding related data in a single document can reduce the need for expensive join operations. However, excessive embedding can lead to large documents and increased memory usage.
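For illustration, here's a hypothetical user document with embedded addresses, so a single read returns everything without a separate lookup:
// Hypothetical embedded design: addresses live inside the user document.
db.user.insertOne({
  name: 'Alice',
  country: 'AU',
  addresses: [
    { type: 'home', city: 'Melbourne' },
    { type: 'work', city: 'Sydney' }
  ]
});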
What is the impact of network latency on MongoDB’s performance?
Network latency can significantly affect MongoDB’s performance, especially in distributed systems like sharded clusters or replica sets. High network latency can slow down data transfer between nodes and lead to slow query responses. Therefore, it’s important to monitor and minimize network latency for optimal performance.
How can I improve write performance in MongoDB?
Write performance in MongoDB can be improved by using bulk operations, which combine many write operations into a single database command. Also, consider using write concern “w:1” for less critical data to improve write speed. However, this may compromise data durability in case of a failure.
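As an illustrative sketch, bulkWrite() batches several writes into a single command; ordered: false lets MongoDB continue past individual failures, and the write concern shown is the trade-off mentioned above:
db.user.bulkWrite([
  { insertOne: { document: { name: 'Alice', country: 'AU' } } },
  { insertOne: { document: { name: 'Bob', country: 'UK' } } },
  { updateOne: {
      filter: { name: 'Carol' },
      update: { $set: { city: 'Leeds' } },
      upsert: true
  } }
], { ordered: false, writeConcern: { w: 1 } });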
How does MongoDB handle large amounts of data?
MongoDB is designed to handle large amounts of data. It uses techniques like sharding to distribute data across multiple machines, allowing it to support very large data sets. Additionally, MongoDB’s flexible schema makes it easy to store and process diverse data types.
Can I use MongoDB for real-time applications?
Yes, MongoDB is well-suited for real-time applications. It offers features like change streams and real-time notifications that allow applications to react to changes in the database in real time. However, for optimal performance, it’s important to properly index your data and monitor your system’s performance regularly.
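A minimal change-stream sketch (MongoDB 3.6+ on a replica set); watch() returns a cursor that blocks until the next change arrives:
// Print each change to the user collection as it happens.
var cursor = db.user.watch();
while (cursor.hasNext()) {
  printjson(cursor.next());
}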
Craig is a freelance UK web consultant who built his first page for IE2.0 in 1995. Since that time he's been advocating standards, accessibility, and best-practice HTML5 techniques. He's created enterprise specifications, websites and online applications for companies and organisations including the UK Parliament, the European Parliament, the Department of Energy & Climate Change, Microsoft, and more. He's written more than 1,000 articles for SitePoint and you can find him @craigbuckler.