The Pitfalls of Creating Indexes on MongoDB
Indexing is essential for optimizing your database, but MongoDB's three main indexing methods come with caveats that might trip up those who aren't prepared for them.
Indexes are a critical part of any database operation. Defining the right indexes can make a huge difference to the performance of your database servers. However, creating indexes in MongoDB has several pitfalls that you need to be aware of in your day-to-day operations. At a high level, MongoDB supports three techniques for building indexes on your collections:
1. Foreground Index Build
When you build an index in the foreground, it blocks all other operations on the database for the duration of the build; on a large collection, that can be several hours. In effect, your database is down until the build finishes. Because this is the default mode for building indexes, it is not surprising that many developers shoot themselves in the foot by accidentally triggering a foreground build. There is really no good reason to trigger a foreground index build on a production server, unless you know that the collection holds only a small amount of data.
2. Background Index Build
As the name implies, the background indexing process builds the index in the background without affecting the availability of your database server. However, it is still a resource-intensive operation, and you should expect to see some performance degradation. Also, because it runs in the background, it can take a lot longer than a foreground build. In earlier versions of MongoDB (< 2.6), a background index build on the primary of a replica set would run as a foreground build on the secondaries. Thankfully, that is no longer the case: the build now runs in the background on all nodes. If a background index build is interrupted, it will resume as a foreground index build when the server restarts.
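To make the difference between the two modes concrete, here is a minimal Python sketch that only builds the document for MongoDB's `createIndexes` database command; nothing is sent to a server. The helper name `create_indexes_command`, the collection `orders`, and the field `customer_id` are made up for illustration, and the `background` option is assumed to behave as described in this article (newer MongoDB releases may treat it differently).

```python
def create_indexes_command(collection, keys, background=False):
    """Build a `createIndexes` command document (nothing is sent anywhere).

    `keys` is a list of (field, direction) pairs, e.g. [("customer_id", 1)].
    """
    # Name the index using the usual field_direction convention.
    name = "_".join(f"{field}_{direction}" for field, direction in keys)
    index_spec = {"key": dict(keys), "name": name}
    if background:
        # Ask for a background build instead of the blocking
        # foreground default described above.
        index_spec["background"] = True
    return {"createIndexes": collection, "indexes": [index_spec]}


# Hypothetical collection and field names, for illustration only.
foreground = create_indexes_command("orders", [("customer_id", 1)])
background = create_indexes_command("orders", [("customer_id", 1)], background=True)
print(background)
```

With a driver such as PyMongo, a document like this could be passed to `db.command(...)`, or the same request could be made with `collection.create_index([("customer_id", 1)], background=True)`.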
3. Rolling Index Build
The rolling index build process builds the index on only one node at a time. It goes something like this:
Rotate a secondary node out of the replica set (you can do this by changing ports or restarting in standalone mode).
Build the index on this node in the foreground. Once the index is built, rotate the node back into the replica set.
Once the node has caught up with the changes, move on to the next node. If the next node is the primary, you will need to run rs.stepDown() to turn it into a secondary.
Rinse and repeat.
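The steps above can be sketched as a small planning function. This is only an illustration of the ordering (secondaries first, primary last after a step-down), not an orchestration tool: it talks to no server, and the function name `rolling_build_plan` and the host names are hypothetical.

```python
def rolling_build_plan(members, primary):
    """Return the ordered steps for a rolling index build.

    `members` is a list of host names; `primary` is the current primary.
    Secondaries are handled first and the primary last, after a step-down.
    """
    if primary not in members:
        raise ValueError("primary must be one of the replica set members")

    secondaries = [m for m in members if m != primary]
    steps = []
    for host in secondaries + [primary]:
        if host == primary:
            # The primary must become a secondary before it is rotated out.
            steps.append((host, "run rs.stepDown()"))
        steps.append((host, "rotate out of the replica set"))
        steps.append((host, "build the index in the foreground"))
        steps.append((host, "rotate back into the replica set"))
        steps.append((host, "wait until the node has caught up"))
    return steps


# Hypothetical three-node replica set.
plan = rolling_build_plan(["db1", "db2", "db3"], primary="db2")
for host, action in plan:
    print(f"{host}: {action}")
```

Handling the primary last keeps the number of failovers to one for the whole build.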
More details of the index build process are in the MongoDB documentation.
Using rolling index builds, you can build an index without any significant performance impact on your application. However, there is a failover involved, so your application should be able to handle that (which it needs to do anyway).
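One common way for an application to tolerate a failover is to retry the failed operation with a short backoff while a new primary is elected. The sketch below is one possible approach, not something prescribed by MongoDB: `run_with_retry` and `flaky_write` are hypothetical names, and Python's built-in `ConnectionError` stands in for the driver's own error type (with PyMongo, you might pass `pymongo.errors.AutoReconnect` as `retryable`). Retrying like this is only safe for operations that can be repeated without side effects.

```python
import time


def run_with_retry(operation, retryable=(ConnectionError,), attempts=5, base_delay=0.5):
    """Call `operation`, retrying with exponential backoff on retryable errors."""
    for attempt in range(attempts):
        try:
            return operation()
        except retryable:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            # Give the replica set time to elect a new primary.
            time.sleep(base_delay * (2 ** attempt))


# Stand-in for a database write that fails twice during a failover.
calls = {"count": 0}


def flaky_write():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("primary stepped down")
    return "ok"


result = run_with_retry(flaky_write, base_delay=0.01)
print(result, calls["count"])
```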
Can you do a rolling index build if you don’t have a replica set? Unfortunately not: for standalone instances, the only practical option is a background index build.
The rolling index build is our favored approach to building indexes at ScaleGrid. We even provide a UI that lets you kick off the whole process. Our backend does all the orchestration necessary for the full index build, triggering the build server by server. You just need to point and click!
As always, if you have further questions, you can reach out to us at support@scalegrid.io.
Published at DZone with permission of Dharshan Rangegowda, DZone MVB. See the original article here.