Skip to content

MongoDB Indexing

Without indexes , every query in MongoDB scans every document That's a full collection scan and it turns your "fast NoSQL database" into a plodding data furnace the moment your collection hits 100,000 documents and your API response times cross from "instant" to "I have time to make coffee while waiting for the users page to load"

MongoDB indexes work much like Postgres indexes but with different types and behaviors. Single-field indexes speed up queries on one field, compound indexes handle multiple fields in a specific order, text indexes enable full-text search across string fields, and geospatial indexes let you query location data with coordinates, polygons, and distances

Single field indexes

The simplest and most common index - speeds up queries matching a single field and sorts on that field

// Create a single field index
db.users.createIndex({ email: 1 });
// 1 = ascending, -1 = descending
// Ascending/descending doesn't matter for single-field equality queries
// It matters for sort order: if you sort by email DESC, use -1

// This query uses the index:
db.users.find({ email: "omar@example.com" });

// This sort also uses the index:
db.users.find().sort({ email: 1 });

// Monitor index usage
db.users.find({ email: "test@example.com" }).explain("executionStats");
// Look for: "stage": "IXSCAN" (index scan) vs "stage": "COLLSCAN" (collection scan)

Compound indexes

Compound indexes cover queries that filter on multiple fields. The order of fields in the index definition is crucial - MongoDB can use the index for queries that match the prefix of the indexed fields but a query that skips the first field cannot use the rest

// Compound index for queries filtering on role + status + createdAt
db.orders.createIndex({ role: 1, status: 1, createdAt: -1 });

// This query uses the index fully:
db.orders.find({ role: "admin", status: "pending" }).sort({ createdAt: -1 });

// This query uses only the first field of the index:
db.orders.find({ role: "admin" });

// This query cannot use this index (skips 'role'):
db.orders.find({ status: "pending" });
// MongoDB will fall back to COLLSCAN without a matching index

// ESR rule for compound index field order:
// E: Equality fields first (exact match conditions)
// S: Sort fields next (ORDER BY / sort)
// R: Range fields last ($gt, $lt, $gte, $lte)

Compound index ESR pattern

// Query: find orders by status = "pending", sort by createdAt, filter by amount > 100
db.orders.find({
    status: "pending",           // Equality
    amount: { $gt: 100 }         // Range
}).sort({ createdAt: -1 });      // Sort

// Correct index following ESR:
// status (E) -> createdAt (S) -> amount (R)
db.orders.createIndex({ status: 1, createdAt: -1, amount: 1 });

Text indexes

Text indexes support full-text search across string fields with word stemming, stop words, and relevance scoring

// Create a text index on the 'title' and 'content' fields
db.articles.createIndex(
    { title: "text", content: "text" },
    { weights: { title: 10, content: 1 } }  // Title matches are 10x more relevant
);

// Text search query
db.articles.find(
    { $text: { $search: "mongodb security injection" } },
    { score: { $meta: "textScore" } }  // Get relevance score
).sort({ score: { $meta: "textScore" } })  // Sort by relevance
 .limit(10);

// Text search with phrase matching (exact phrase in quotes)
db.articles.find({
    $text: { $search: "\"nosql injection\" prevention" }
});

// Exclude words with minus prefix
db.articles.find({
    $text: { $search: "security -injection" }  // "security" but not "injection"
});

Geospatial indexes

MongoDB has first-class geospatial support with 2dsphere indexes for spherical geometry (lat/lng on Earth) and 2d indexes for flat coordinate planes

// Create a 2dsphere index
db.locations.createIndex({ coordinates: "2dsphere" });

// Insert location documents
db.locations.insertMany([
    {
        name: "Coffee Shop",
        coordinates: { type: "Point", coordinates: [-73.97, 40.77] },  // [lng, lat]
        category: "cafe"
    },
    {
        name: "Library",
        coordinates: { type: "Point", coordinates: [-73.98, 40.75] },
        category: "library"
    }
]);

// Find nearby locations (within 1000 meters)
db.locations.find({
    coordinates: {
        $near: {
            $geometry: { type: "Point", coordinates: [-73.97, 40.76] },
            $maxDistance: 1000,
            $minDistance: 0
        }
    }
});

// Find locations within a polygon
db.locations.find({
    coordinates: {
        $geoWithin: {
            $geometry: {
                type: "Polygon",
                coordinates: [[
                    [-73.99, 40.74],
                    [-73.96, 40.74],
                    [-73.96, 40.78],
                    [-73.99, 40.78],
                    [-73.99, 40.74]
                ]]
            }
        }
    }
});

Explain() - stop guessing about performance

Just like Postgres' EXPLAIN ANALYZE, MongoDB's .explain("executionStats") shows you exactly how a query is executed - which indexes it uses, how many documents it examines, how long each stage takes

// Basic explain - shows winning plan
db.users.find({ email: "test@example.com" }).explain();

// Execution stats - detailed performance data
db.users.find({ email: "test@example.com" }).explain("executionStats");
// Look for:
//   executionStats.executionTimeMillis - total time
//   executionStats.totalDocsExamined - documents scanned
//   executionStats.totalKeysExamined - index entries scanned
//   executionStats.nReturned - documents returned

// All plans execution - see why MongoDB chose its plan
db.users.find({ email: "test@example.com" }).explain("allPlansExecution");

// What to check:
// If totalDocsExamined >> nReturned, your index isn't selective enough
// If executionTimeMillis > 100 for simple queries, add an index
// If IXSCAN isn't in the stage tree, MongoDB is doing a collection scan

Index management and maintenance

// List all indexes on a collection
db.users.getIndexes();

// Create index in background (avoids blocking writes)
db.users.createIndex({ email: 1 }, { background: true });

// Create a unique index (prevents duplicate values)
db.users.createIndex({ email: 1 }, { unique: true });

// Create a sparse index (only indexes documents with the field)
db.users.createIndex({ referrerCode: 1 }, { sparse: true });
// Useful for fields that exist only in some documents

// Create a TTL index (auto-expire documents)
db.sessions.createIndex(
    { createdAt: 1 },
    { expireAfterSeconds: 86400 }  // Delete after 24 hours
);
// Documents where createdAt is older than 24h are automatically removed

// Drop an index
db.users.dropIndex("email_1");  // By name

// Drop multiple indexes
db.users.dropIndexes(["email_1", "username_1"]);

// Analyze index size
db.users.totalSize();       // Total collection size
db.users.stats().totalIndexSize;  // Index size in bytes

Covered queries

A covered query happens when all the fields needed by the query exist within the index itself - MongoDB can answer the query from the index alone without ever touching the actual documents. This is the absolute fastest way to query MongoDB

// Create a covering index
db.users.createIndex(
    { email: 1 },
    { partialFilterExpression: { isActive: true } }  // Only index active users
);

// This query is covered if the index includes 'email' and we only need that field:
db.users.find(
    { email: "test@example.com" },
    { email: 1, _id: 0 }  // Only need this field
);

Security - index-based data inference

Like Postgres, MongoDB's query timing can leak information through indexes. An attacker who can measure response times can determine whether values exist by detecting the difference between index lookup (fast) and collection scan (slow)

// Attacker observes timing differences:
// Fast response → document exists (index hit)
// Slow response → document doesn't exist (collection scan or no match)
const start = Date.now();
await db.collections('users').findOne({ email: attackerInput });
const elapsed = Date.now() - start;

This timing side-channel is how attackers enumerate valid email addresses for credential stuffing attacks. The mitigation: ensure consistent query execution time regardless of result existence, use constant-time comparison for sensitive lookups, and implement rate limiting on authentication endpoints

prerequisites

db_07_mongo_crud.md - indexing only makes sense when you understand the queries the indexes are supporting. If you don't know how $gt and $in and $regex work, you won't know which index types to create


next → db_09_mongo_security.md