Clustering and Scaling - One Thread Is Never Enough¶

Table of Contents¶

Node Single-Thread Limitation
cluster Module : What It Does , When to Use
Load Balancing : Round-Robin vs Custom
process.env.NODE_APP_INSTANCE
PM2 for Process Management in Production
OS Process Distribution : nginx + Multiple Node Processes
Security : Shared Port Binding Requires Root , CAP_NET_BIND_SERVICE

Node Single-Thread Limitation¶

Node runs JavaScript on a single thread. That thread handles incoming requests , runs your code , and drives the event loop. If one request takes 500ms of CPU time , every other request waits for it to finish. That's not a bug - it's the architecture: there is one event loop , and all of your JavaScript shares it
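To see the blocking effect , here is a minimal sketch. The helper names blockFor and measureTimerDelay are made up for this illustration. It schedules a 0ms timer , burns CPU for 500ms , and reports how late the timer actually fired:

```javascript
// Busy-wait for roughly `ms` milliseconds of CPU time (blocks the event loop)
function blockFor(ms) {
  const end = Date.now() + ms
  while (Date.now() < end) {
    // spin
  }
}

// Schedule a 0ms timer , then block; resolve with how late the timer fired
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now()
    setTimeout(() => resolve(Date.now() - scheduled), 0)
    blockFor(blockMs)
  })
}

measureTimerDelay(500).then((delay) => {
  console.log(`0ms timer fired after ${delay}ms`)
})
```

The timer cannot fire until the busy loop returns , which is exactly what happens to other requests while one handler hogs the CPU.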

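You get parallelism from child processes (the cluster module , for spreading requests across cores) or worker threads (for CPU-heavy tasks inside one process). With cluster , incoming connections are distributed across your worker processes , by the primary process by default or by the operating system on Windows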
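For the worker-thread side , a rough sketch using node:worker_threads. The fibInWorker helper and the inline worker source are assumptions made for this example; a real app would more likely keep the worker code in its own file:

```javascript
const { Worker } = require('node:worker_threads')

// Source executed inside the worker thread (inlined to keep the example self-contained)
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads')
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2))
  parentPort.postMessage(fib(workerData))
`

// Run the CPU-heavy calculation off the main thread
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: n })
    worker.once('message', resolve)
    worker.once('error', reject)
  })
}

fibInWorker(30).then((result) => {
  console.log(`fib(30) = ${result}`)
})
```

The main thread stays free to serve requests while the worker computes.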

cluster Module : What It Does , When to Use¶

The cluster module forks the main process into multiple worker processes. Each worker runs its own event loop , has its own memory space , and handles requests independently

const cluster = require('node:cluster')
const http = require('node:http')
const os = require('node:os')

const numCPUs = os.cpus().length

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} spawning ${numCPUs} workers`)

  // Fork workers - one per CPU core
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork()
  }

  // Handle worker crashes
  cluster.on('exit', (worker, code, signal) => {
    if (code !== 0) {
      console.log(`Worker ${worker.process.pid} died (code: ${code}). Restarting...`)
      cluster.fork()
    }
  })
} else {
  // Workers share the same port
  http.createServer((req, res) => {
    res.writeHead(200)
    res.end(`Handled by worker ${process.pid}\n`)
  }).listen(8000)

  console.log(`Worker ${process.pid} started`)
}

When to cluster: CPU-bound operations , high-traffic APIs , any time one CPU core can't handle the load

When NOT to cluster: If your app is I/O-bound (database queries , API calls) and not saturated. The single thread might be fine. Measure first , cluster second
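One way to measure first is event loop utilization from node:perf_hooks. This is a sketch , not a full monitoring setup. The 0.7 threshold and the helper names are assumptions for illustration , so pick a threshold from your own load tests:

```javascript
const { performance } = require('node:perf_hooks')

// Assumed threshold: treat a loop that is busy 70% of the time as saturated
const SATURATION_THRESHOLD = 0.7

function shouldConsiderClustering(utilization, threshold = SATURATION_THRESHOLD) {
  return utilization >= threshold
}

// Sample event loop utilization over a time window
function sampleUtilization(windowMs) {
  const start = performance.eventLoopUtilization()
  return new Promise((resolve) => {
    setTimeout(() => {
      // Passing the earlier reading returns the delta for the window
      resolve(performance.eventLoopUtilization(start).utilization)
    }, windowMs)
  })
}

sampleUtilization(1000).then((utilization) => {
  console.log(`event loop utilization: ${(utilization * 100).toFixed(1)}%`)
  console.log(`cluster candidate: ${shouldConsiderClustering(utilization)}`)
})
```

Run it inside the app under realistic traffic. An idle script like this one will report a value near zero.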

Load Balancing : Round-Robin vs Custom¶

By default , the primary process listens on the port and distributes connections to workers using round-robin on every platform except Windows , where distribution is left to the operating system

Round-robin - each new connection goes to the next worker in sequence. Simple , fair , but doesn't account for worker load
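The rotation itself is tiny. This sketch shows the idea only and is not Node's internal implementation; createRoundRobin is a name invented for the example:

```javascript
// Returns a function that yields workers in strict rotation
function createRoundRobin(workers) {
  let next = 0
  return function pick() {
    const worker = workers[next]
    next = (next + 1) % workers.length
    return worker
  }
}

const pick = createRoundRobin(['w1', 'w2', 'w3'])
console.log([pick(), pick(), pick(), pick()])
// prints [ 'w1', 'w2', 'w3', 'w1' ]
```

Nothing in pick() looks at how busy a worker is , which is the limitation mentioned above.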

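Scheduling policy - the cluster module does not expose a hook for your own distribution logic. It offers two built-in policies: cluster.SCHED_RR (round-robin) and cluster.SCHED_NONE (leave it to the operating system). Set the policy before the first cluster.fork() call:

const cluster = require('node:cluster')

// Force round-robin on Windows (where it's not default).
// Must be set before any worker is forked
cluster.schedulingPolicy = cluster.SCHED_RR

// Or set via env variable ("rr" or "none"):
// export NODE_CLUSTER_SCHED_POLICY=rr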

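The problem with sticky sessions: a single WebSocket connection always stays on the worker that accepted it , but anything that spans several connections does not. Socket.IO's HTTP long-polling handshake and in-memory sessions both break when consecutive requests from one client land on different workers. If you rely on either , you need session affinity , which the built-in policies don't provide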

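// Sticky session balancing (simplified)
const cluster = require('node:cluster')
const crypto = require('node:crypto')
const http = require('node:http')
const net = require('node:net')
const os = require('node:os')

// Map a client IP to a stable worker index
function pickIndex(ip, count) {
  const digest = crypto.createHash('md5').update(ip || '').digest()
  return digest.readUInt32BE(0) % count
}

if (cluster.isPrimary) {
  const workers = []

  for (let i = 0; i < os.cpus().length; i++) {
    workers.push(cluster.fork())
  }

  // Custom TCP balancer with sticky sessions.
  // pauseOnConnect keeps the socket paused so no data is read
  // before the chosen worker takes ownership of it
  const server = net.createServer({ pauseOnConnect: true }, (socket) => {
    // Hash the client IP so the same client always lands on the same worker
    const workerIndex = pickIndex(socket.remoteAddress, workers.length)
    workers[workerIndex].send('sticky:connection', socket)
  })

  server.listen(8000)
} else {
  const server = http.createServer((req, res) => {
    res.end(`Handled by worker ${process.pid}\n`)
  })

  // Internal port only - real traffic arrives from the primary over IPC
  server.listen(0, 'localhost')

  process.on('message', (msg, socket) => {
    if (msg !== 'sticky:connection' || !socket) return
    server.emit('connection', socket)  // hand the socket to the HTTP server
    socket.resume()                    // start reading now that it has an owner
  })
}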

Honestly though? Just use nginx for sticky sessions. It's more battle-tested than custom cluster routing

process.env.NODE_APP_INSTANCE¶

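Node's cluster module does not set NODE_APP_INSTANCE on its own. PM2 sets it automatically for each process in cluster mode (index starting from 0). With plain cluster.fork() , pass it yourself through the env argument: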

const cluster = require('node:cluster')

if (cluster.isPrimary) {
  for (let i = 0; i < 4; i++) {
    const worker = cluster.fork({ NODE_APP_INSTANCE: i })

    worker.on('message', (msg) => {
      console.log(`Message from worker ${i}:`, msg)
    })
  }
} else {
  const instanceId = process.env.NODE_APP_INSTANCE

  // Each worker handles different data
  const shard = Number.parseInt(instanceId, 10) + 1

  // Worker 0 handles users 1-25 , worker 1 handles 26-50 , etc
  const userRange = {
    start: shard * 25 - 24,
    end: shard * 25,
  }

  process.send({ pid: process.pid, instanceId, userRange })

  const app = require('express')()
  app.get('/users', (req, res) => {
    res.json({ shard: instanceId, range: userRange })
  })
  app.listen(8000)
}

Use NODE_APP_INSTANCE for sharding data , assigning unique per-worker resources , or coordinating timed jobs among workers
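For timed jobs , a common pattern is to let exactly one instance run them. A minimal sketch follows , assuming instance 0 is the designated leader and that an unset variable means the app is not clustered. isJobLeader and scheduleSingletonJob are hypothetical helper names:

```javascript
// True when this process should run singleton jobs.
// Assumption: instance 0 is the leader; an unset variable means "not clustered"
function isJobLeader(env = process.env) {
  const raw = env.NODE_APP_INSTANCE
  if (raw === undefined) return true
  return Number.parseInt(raw, 10) === 0
}

// Schedule a repeating job on the leader only; other instances get null back
function scheduleSingletonJob(runJob, intervalMs, env = process.env) {
  if (!isJobLeader(env)) return null
  const timer = setInterval(runJob, intervalMs)
  timer.unref()  // don't keep the process alive just for the job
  return timer
}

console.log(`runs scheduled jobs: ${isJobLeader()}`)
```

This only works if a restarted worker keeps its index , so it is worth checking how your process manager assigns NODE_APP_INSTANCE after a crash.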

PM2 for Process Management in Production¶

PM2 is a widely used process manager for Node in production. It handles clustering , restarts , logging , and monitoring

npm install -g pm2

# Start one instance per CPU core
pm2 start app.js -i max

# Start with specific instance count
pm2 start app.js -i 4

# Name the process
pm2 start app.js -i max --name "my-api"

PM2 ecosystem.config.js:

module.exports = {
  apps: [{
    name: 'api-server',
    script: './src/index.js',
    instances: 'max',
    exec_mode: 'cluster',
    env: {
      NODE_ENV: 'production',
      PORT: 3000,
    },
    env_development: {
      NODE_ENV: 'development',
    },
    max_memory_restart: '500M',
    log_date_format: 'YYYY-MM-DD HH:mm:ss Z',
    error_file: './logs/error.log',
    out_file: './logs/output.log',
    merge_logs: true,
    autorestart: true,
    watch: false,
    max_restarts: 10,
    restart_delay: 1000,
  }],
}

# Start with config
pm2 start ecosystem.config.js

# Save process list for auto-restart on reboot
pm2 save
pm2 startup

# Monitoring
pm2 monit
pm2 status
pm2 logs
pm2 show api-server

# Graceful reload (zero downtime)
pm2 reload all

# Scale up/down without restart
pm2 scale api-server 8

PM2 handles worker crashes and memory-based restarts , and rotates logs once you install the pm2-logrotate module. Don't write your own clustering script for production - use PM2 or a container orchestration platform like Kubernetes
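A zero-downtime reload only works if the app cooperates by finishing in-flight requests before it exits. The sketch below rests on two assumptions: that PM2 stops a process with SIGINT (its default , as far as I know) and that the optional process.send('ready') handshake is enabled with wait_ready: true in the ecosystem file. shutdownGracefully is a helper name made up for this example:

```javascript
const http = require('node:http')

// Stop accepting connections , let in-flight requests finish , then exit.
// `exit` is injectable so the function can be exercised without killing the process
function shutdownGracefully(server, options = {}) {
  const timeoutMs = options.timeoutMs ?? 5000
  const exit = options.exit ?? ((code) => process.exit(code))

  // Force a non-zero exit if close() hangs on a stuck connection
  const timer = setTimeout(() => exit(1), timeoutMs)
  timer.unref()

  server.close(() => {
    clearTimeout(timer)
    exit(0)
  })
}

const server = http.createServer((req, res) => res.end('ok\n'))

// PORT comes from the PM2 env block; 0 (a random free port) is only a fallback for the sketch
server.listen(process.env.PORT || 0, () => {
  // Only has an effect when started by a parent with an IPC channel (such as PM2)
  if (process.send) process.send('ready')
})

process.once('SIGINT', () => shutdownGracefully(server))
```

Keep the timeout below PM2's kill timeout , otherwise PM2 kills the process before the fallback exit runs.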

OS Process Distribution : nginx + Multiple Node Processes¶

In production you'll often run nginx in front of multiple Node processes (or containers). nginx handles TLS termination , static file serving , and load balancing while each Node process focuses on application logic

# /etc/nginx/sites-available/api
upstream node_backend {
    # Sticky sessions by client IP - this replaces the default round-robin.
    # Remove the line if you don't need session affinity
    ip_hash;

    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
    server 127.0.0.1:3003;
    server 127.0.0.1:3004;
}

server {
    listen 443 ssl http2;
    server_name api.example.com;

    ssl_certificate /etc/ssl/certs/example.crt;
    ssl_certificate_key /etc/ssl/private/example.key;

    # Proxy API requests to Node
    location /api/ {
        proxy_pass http://node_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # WebSocket support
    location /ws/ {
        proxy_pass http://node_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_read_timeout 86400;
    }

    # Serve static files directly (zero Node involvement)
    location /static/ {
        root /var/www/myapp;
        expires 30d;
        add_header Cache-Control "public, immutable";
    }

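    # Rate limiting - the "login" zone must be declared in the http context , e.g.
    # limit_req_zone $binary_remote_addr zone=login:10m rate=10r/m;
    # (zone size and rate are example values)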
    location /api/auth/ {
        limit_req zone=login burst=5 nodelay;
        proxy_pass http://node_backend;
    }
}

# Run each Node process separately
node src/index.js --port=3001 &
node src/index.js --port=3002 &
node src/index.js --port=3003 &
node src/index.js --port=3004 &

Or with PM2 for each process:

pm2 start src/index.js --name "node-1" -- --port=3001
pm2 start src/index.js --name "node-2" -- --port=3002
pm2 start src/index.js --name "node-3" -- --port=3003
pm2 start src/index.js --name "node-4" -- --port=3004
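Both launch variants above assume src/index.js understands a --port flag , which Node does not parse for you. One possible way to read it uses util.parseArgs (available in recent Node versions). resolvePort and the 3000 fallback are assumptions for this sketch:

```javascript
const { parseArgs } = require('node:util')

// Read --port from an argv-style array , falling back when it is missing or invalid
function resolvePort(args, fallback = 3000) {
  const { values } = parseArgs({
    args,
    options: { port: { type: 'string' } },
    strict: false,  // ignore flags this sketch doesn't know about
  })
  const port = Number.parseInt(values.port, 10)
  return Number.isInteger(port) && port > 0 && port < 65536 ? port : fallback
}

const port = resolvePort(process.argv.slice(2))
console.log(`would listen on port ${port}`)
```

In the real entry point , pass the result to app.listen(port) or server.listen(port).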

Security : Shared Port Binding Requires Root , CAP_NET_BIND_SERVICE¶

On Linux , only root (or a process holding CAP_NET_BIND_SERVICE) can bind to ports below 1024 , the privileged ports. To put a Node process on port 80 or 443 without root , grant that capability to the node binary

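# Give node the ability to bind to privileged ports.
# setcap needs the real binary , not a symlink - check with: readlink -f "$(which node)"
# Note: this applies to every script run with this node binary
sudo setcap cap_net_bind_service=+ep /usr/bin/node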

# Verify
getcap /usr/bin/node
# /usr/bin/node = cap_net_bind_service+ep

Alternative - run nginx on 80/443 and proxy to Node on high ports (3000+)

Safer because Node never needs elevated privileges: nginx binds the privileged ports and hands requests to a Node process that runs as a regular user. If someone exploits a vulnerability in your Express routes , they land in an unprivileged process rather than one with root access:

server {
    listen 80;
    location / {
        proxy_pass http://127.0.0.1:3000;
    }
}

Security hardening for clustered Node apps:

Never run clusters as root (drop privileges after binding)
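Use process.setgid() and then process.setuid() after binding ports below 1024 (group first , because after setuid you no longer have permission to change it)
Cap resource usage per worker: max_memory_restart in PM2 , --max-old-space-size for the V8 heap , and OS-level limits (ulimit or systemd) for open files
Restrict inter-process communication - keep secrets out of IPC messages and validate every message received from another process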
Implement health checks that kill unresponsive workers
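The privilege-dropping step from the list could look roughly like this. It is a sketch under assumptions: the user and group named nodeapp are placeholders , and dropPrivileges is a hypothetical helper. The `proc` parameter exists only so the logic can be tried without actually being root:

```javascript
// Drop root after the privileged port is bound.
// Returns true if privileges were dropped , false if there was nothing to drop
function dropPrivileges({ user, group }, proc = process) {
  // getuid/setuid are POSIX-only; skip on Windows or when not running as root
  if (typeof proc.getuid !== 'function' || proc.getuid() !== 0) return false

  proc.setgroups([group])  // clear supplementary groups inherited from root
  proc.setgid(group)       // group first - after setuid we can no longer change it
  proc.setuid(user)
  return true
}

// Usage (placeholder account names):
// server.listen(80, () => {
//   dropPrivileges({ user: 'nodeapp', group: 'nodeapp' })
// })
```

Call it from the listen callback so the port is already bound by the time root is gone.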

prerequisites¶

perf_01_profiling.md - profiling , flamegraphs , memory analysis

next -> perf_03_load_testing.md