NoSQL Databases

Not every workload needs ACID transactions and rigid schemas. NoSQL databases trade consistency guarantees for scale, flexibility, or speed - but those tradeoffs bite you if you pick the wrong one for the job.

MongoDB with Mongoose

Mongoose adds schema validation on top of MongoDB's flexible document model.

const mongoose = require('mongoose')

const userSchema = new mongoose.Schema({
  email: {
    type: String,
    required: true,
    unique: true,
    lowercase: true,
    match: /^\S+@\S+\.\S+$/
  },
  password: { type: String, required: true },
  role: {
    type: String,
    enum: ['user', 'admin', 'moderator'],
    default: 'user'
  },
  profile: {
    firstName: String,
    lastName: String,
    avatar: String
  },
  createdAt: { type: Date, default: Date.now }
})

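// Indexes for fast lookups
// unique: true on email already creates an index, so declaring { email: 1 } again
// would only trigger a duplicate-index warning
userSchema.index({ 'profile.firstName': 'text', 'profile.lastName': 'text' })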

const User = mongoose.model('User', userSchema)

// Create
const user = await User.create({
  email: 'test@example.com',
  password: hashedPassword,
  profile: { firstName: 'Mahmoud' }
})

// Read
const found = await User.findOne({ email: 'test@example.com' }).lean()

// Update
await User.findByIdAndUpdate(id, { 'profile.lastName': 'Ali' })

// Delete
await User.findByIdAndDelete(id)

.lean() returns plain JavaScript objects instead of Mongoose documents. Without lean, every query builds a full Mongoose document instance with getters, setters, and change tracking - that's expensive for read-heavy endpoints.

Queries and Aggregation

// Basic queries
const activeUsers = await User.find({
  role: 'user',
  createdAt: { $gte: new Date('2025-01-01') }
}).sort({ createdAt: -1 }).limit(10)

// Aggregation pipeline - powerful but complex
const stats = await User.aggregate([
  { $match: { role: 'user' } },
  { $group: {
    _id: { $dateToString: { format: '%Y-%m', date: '$createdAt' } },
    count: { $sum: 1 },
    avgAge: { $avg: '$age' }  // null unless documents carry a numeric age field (the schema above defines none)
  }},
  { $sort: { _id: -1 } }
])

// Text search
const results = await User.find(
  { $text: { $search: 'mahmoud ali' } },
  { score: { $meta: 'textScore' } }
).sort({ score: { $meta: 'textScore' } })

Aggregation pipelines look like JSON, but they're full programs. A poorly written aggregation whose $match stage cannot use an index can scan millions of documents and kill performance.
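To make the "full program" point concrete, here is a rough plain-JavaScript equivalent of the stats pipeline above, run over an in-memory array instead of a collection. The monthlyStats name and the sample documents are made up for illustration, and the sketch assumes documents may carry a numeric age field. MongoDB does this work inside the server, but without a usable index it still has to visit every document, much like the loop here.

```javascript
// Hypothetical in-memory equivalent of: $match -> $group -> $sort
function monthlyStats(docs) {
  const groups = new Map()

  for (const doc of docs) {
    // $match: { role: 'user' }
    if (doc.role !== 'user') continue

    // $dateToString with '%Y-%m' (UTC, which is also MongoDB's default)
    const month = doc.createdAt.toISOString().slice(0, 7)

    const g = groups.get(month) || { count: 0, ageSum: 0, ageCount: 0 }
    g.count += 1 // $sum: 1
    if (typeof doc.age === 'number') { // $avg skips missing / non-numeric values
      g.ageSum += doc.age
      g.ageCount += 1
    }
    groups.set(month, g)
  }

  return [...groups.entries()]
    .map(([month, g]) => ({
      _id: month,
      count: g.count,
      avgAge: g.ageCount ? g.ageSum / g.ageCount : null
    }))
    // $sort: { _id: -1 }
    .sort((a, b) => (a._id < b._id ? 1 : a._id > b._id ? -1 : 0))
}

// Made-up sample data
const sample = [
  { role: 'user', createdAt: new Date('2025-01-15T00:00:00Z'), age: 30 },
  { role: 'user', createdAt: new Date('2025-01-20T00:00:00Z'), age: 40 },
  { role: 'admin', createdAt: new Date('2025-01-21T00:00:00Z'), age: 50 },
  { role: 'user', createdAt: new Date('2025-02-01T00:00:00Z') }
]

console.log(monthlyStats(sample))
```

Each pipeline stage maps to a step in the loop, which is why stage order matters: filtering first keeps the grouping work small.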

Redis with ioredis

Redis is an in-memory data store - treat it as a cache or fast auxiliary store, not a primary database.

const Redis = require('ioredis')
const redis = new Redis()

// String
await redis.set('user:123', JSON.stringify(userData), 'EX', 3600)
const cached = await redis.get('user:123')

// Hash - store object fields separately
await redis.hset('session:abc', {
  userId: 123,
  role: 'admin',
  ip: '192.168.1.1'
})
const session = await redis.hgetall('session:abc')

// List - ordered collection
await redis.lpush('recent:views:123', 'post:456')
await redis.ltrim('recent:views:123', 0, 9)  // keep last 10

// Set - unique members
await redis.sadd('online:users', 'user:123', 'user:456')
const onlineCount = await redis.scard('online:users')

// Sorted Set - leaderboard / score-based
await redis.zadd('leaderboard', 100, 'user:123', 85, 'user:456')
const top10 = await redis.zrevrange('leaderboard', 0, 9, 'WITHSCORES')

Individual Redis commands are atomic - that's why Redis is a good fit for counters and rate limiting. But atomic doesn't mean durable - Redis without persistence (RDB snapshots or AOF) loses all data on restart.
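As an illustration of atomic counters, here is a minimal fixed-window rate limiter sketch built on INCR. The allowRequest name, the key format, and the limits are assumptions for illustration; client is expected to be an ioredis instance. The FakeRedis class is only a stand-in so the sketch runs without a server. Note that INCR and EXPIRE are two separate commands, so a crash between them could leave a key without a TTL; wrapping both in a Lua script or MULTI would close that gap.

```javascript
// Fixed-window limiter: at most `limit` requests per `windowSeconds` per user
async function allowRequest(client, userId, limit = 5, windowSeconds = 60) {
  const key = `ratelimit:${userId}` // hypothetical key format

  // INCR is atomic: concurrent requests each receive a distinct count
  const count = await client.incr(key)

  // The first hit in a window starts the clock
  if (count === 1) {
    await client.expire(key, windowSeconds)
  }

  return count <= limit
}

// Stand-in for ioredis so the sketch runs without a server (not a real client)
class FakeRedis {
  constructor() {
    this.counters = new Map()
  }
  async incr(key) {
    const next = (this.counters.get(key) || 0) + 1
    this.counters.set(key, next)
    return next
  }
  async expire() {
    return 1 // the fake ignores TTLs
  }
}

async function demo() {
  const client = new FakeRedis() // swap for `new Redis()` from ioredis
  const results = []
  for (let i = 0; i < 7; i++) {
    results.push(await allowRequest(client, 'user:123'))
  }
  return results
}

demo().then(results => console.log(results))
```

With the default limit of 5, the first five calls in a window are allowed and the rest are rejected until the key expires.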

Pub/Sub with Redis

// Publisher
const Redis = require('ioredis')
const publisher = new Redis()

async function notifyUser(userId, event) {
  await publisher.publish('user:notifications', JSON.stringify({
    userId,
    event,
    timestamp: Date.now()
  }))
}

// Subscriber (separate connection)
const subscriber = new Redis()
subscriber.subscribe('user:notifications')

subscriber.on('message', (channel, message) => {
  const data = JSON.parse(message)
  console.log('Notification for user:', data.userId)
  // Send WebSocket message , email , push notification
})

Redis pub/sub has no message persistence. If the subscriber disconnects, it misses every message sent while it was gone - use Redis Streams for reliable messaging.
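A rough sketch of the Streams alternative, using ioredis's xadd and xread commands. The stream key and the function names (addNotification, readSince, parseEntries) are hypothetical, and redis is assumed to be an ioredis instance passed in by the caller. The key difference from pub/sub is that entries stay in the stream, so a consumer that remembers the last ID it processed can pick up where it left off.

```javascript
const STREAM = 'user:notifications:stream' // hypothetical stream key

// Producer: XADD appends an entry that remains in the stream after delivery
async function addNotification(redis, userId, event) {
  return redis.xadd(STREAM, '*', 'userId', String(userId), 'event', event)
}

// XREAD replies are shaped like [[stream, [[id, [field, value, ...]], ...]]]
function parseEntries(reply) {
  if (!reply) return [] // null means nothing new
  const entries = []
  for (const [, items] of reply) {
    for (const [id, fields] of items) {
      const data = {}
      for (let i = 0; i < fields.length; i += 2) {
        data[fields[i]] = fields[i + 1]
      }
      entries.push({ ...data, id })
    }
  }
  return entries
}

// Consumer: read everything after the last processed ID ('0' = from the start)
async function readSince(redis, lastId = '0') {
  const reply = await redis.xread('COUNT', 100, 'STREAMS', STREAM, lastId)
  return parseEntries(reply)
}

// Shape check with a hand-written reply, so no server is needed
const sampleReply = [[STREAM, [
  ['1700000000000-0', ['userId', '123', 'event', 'login']],
  ['1700000000001-0', ['userId', '456', 'event', 'logout']]
]]]
console.log(parseEntries(sampleReply))
```

The consumer has to store the last ID somewhere durable for this to survive its own restarts. For several workers sharing one stream, consumer groups (XREADGROUP with XACK) are the usual next step.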

Caching with Redis

async function getUser(id) {
  const cacheKey = `user:${id}`

  // Cache-aside: check cache first
  const cached = await redis.get(cacheKey)
  if (cached) {
    return JSON.parse(cached)
  }

  // Miss - query database
  const user = await User.findById(id).lean()  // plain object rather than a Mongoose document, like a cache hit

  // Write to cache
  await redis.set(cacheKey, JSON.stringify(user), 'EX', 3600)

  return user
}

// Invalidate on update
async function updateUser(id, data) {
  await User.findByIdAndUpdate(id, data)
  await redis.del(`user:${id}`)  // invalidate cache
}

Cache invalidation is one of the hard problems in computer science. Just delete the cache key on updates - don't try to keep the cached copy in sync with the database.

Elasticsearch Basics

const { Client } = require('@elastic/elasticsearch')
const es = new Client({ node: 'http://localhost:9200' })

// Index a document (v8 client: the payload goes in `document`; `body` is deprecated)
await es.index({
  index: 'logs',
  document: {
    timestamp: new Date(),
    level: 'ERROR',
    message: 'Database connection failed',
    service: 'api-server'
  }
})

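// Search (v8 client: request fields are top-level and the response is the body itself)
// Aggregation results come back under `aggregations`, not `hits`
const { hits, aggregations } = await es.search({
  index: 'logs',
  query: {
    bool: {
      must: [
        { match: { level: 'ERROR' } },
        { range: { timestamp: { gte: 'now-1h' } } }
      ]
    }
  },
  aggs: {
    by_service: { terms: { field: 'service.keyword' } }
  }
})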

Elasticsearch is great for full-text search and log aggregation. Never use it as your primary database - it has no multi-document transactions, and newly indexed documents only become searchable after a refresh (about one second by default).

NoSQL Injection

MongoDB is vulnerable to injection when you build $where strings from user input or pass unvalidated query operators.

// VULNERABLE - $where injection
User.find({ $where: `this.email === '${email}'` })

// If email = "' || true || '"
// Query becomes: $where: "this.email === '' || true || ''"
// Returns ALL users

// VULNERABLE - query parameter injection
User.find({ email: userInput })

// If userInput = { $gt: '' }
// Matches every user whose email > '' (basically every user)

Mongoose schemas help because query values are cast to the schema type, but casting does not strip operators: { $gt: '' } is still a valid condition on a String path. Raw MongoDB queries built from user input are a script-kiddie's playground - validate everything.

// SAFE - validate the input type before it reaches the query
if (typeof req.body.email !== 'string') {
  throw new Error('email must be a string')
}
User.findOne({ email: req.body.email })

// SAFE - wrap the value in $eq so it can never act as an operator
User.findOne({ email: { $eq: req.body.email } })

// Mongoose 6+ can apply the $eq wrapping to every filter
mongoose.set('sanitizeFilter', true)

Prevent NoSQL injection by:

  • Never passing user input directly into $where, $gte, $regex, $gt
  • Validating that query parameters are the expected type
  • Using Mongoose schemas instead of the raw MongoDB driver
  • Sanitizing user input before building queries
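The type-validation and sanitizing rules above can be sketched as two small helpers. The names requireString, hasOperatorKeys, and buildEmailFilter are hypothetical, not part of Mongoose or the MongoDB driver; treat this as a starting point rather than a complete defense.

```javascript
// Hypothetical helper: accept only plain strings for values used in filters
function requireString(value, fieldName) {
  if (typeof value !== 'string') {
    throw new TypeError(`${fieldName} must be a string`)
  }
  return value
}

// Hypothetical helper: detect operator keys ($gt, $where, ...) anywhere in the input
function hasOperatorKeys(input) {
  if (input === null || typeof input !== 'object') return false
  return Object.entries(input).some(
    ([key, value]) => key.startsWith('$') || hasOperatorKeys(value)
  )
}

// Build the filter only from validated values, never from the raw body
function buildEmailFilter(body) {
  return { email: requireString(body.email, 'email') }
}

console.log(buildEmailFilter({ email: 'test@example.com' }))
console.log(hasOperatorKeys({ email: { $gt: '' } })) // true
```

Rejecting the request outright when hasOperatorKeys returns true is usually simpler than trying to strip the offending keys and continue.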

Prerequisites

  • db_02_sql.md - understand SQL before comparing NoSQL tradeoffs

next -> db_04_orms.md