Clustering and Scaling - One Thread Is Never Enough
Table of Contents
- Node Single-Thread Limitation
- cluster Module: What It Does, When to Use
- Load Balancing: Round-Robin vs Custom
- process.env.NODE_APP_INSTANCE
- PM2 for Process Management in Production
- OS Process Distribution: nginx + Multiple Node Processes
- Security: Shared Port Binding Requires Root, CAP_NET_BIND_SERVICE
Node Single-Thread Limitation
Node runs JavaScript on a single thread. That thread handles incoming requests, runs your code, and drives the event loop. If one request takes 500ms of CPU time, every other request waits. That's not a bug - it's the architecture: each process has exactly one event loop, and nothing else runs your JavaScript while it is busy.
You get parallelism from multiple processes (the cluster module) or from worker threads (for CPU-heavy tasks). With cluster, incoming connections are spread across the worker processes, either by the primary process or by the operating system, depending on the scheduling policy.
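As a rough illustration of the worker-thread route mentioned above, here is a minimal sketch that moves a CPU-heavy function off the main thread. The `fib` workload and the `fibInWorker` helper are invented for this example; only `Worker`, `workerData`, and `parentPort` come from `node:worker_threads`.

```javascript
const { Worker } = require('node:worker_threads')

// Deliberately slow, CPU-bound function (hypothetical workload)
function fib(n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2)
}

// Run fib(n) on a worker thread so the main event loop stays free
function fibInWorker(n) {
  // The worker gets its own copy of fib via the inlined source text
  const source = `
    const { parentPort, workerData } = require('node:worker_threads')
    ${fib.toString()}
    parentPort.postMessage(fib(workerData))
  `
  return new Promise((resolve, reject) => {
    const worker = new Worker(source, { eval: true, workerData: n })
    worker.once('message', resolve)
    worker.once('error', reject)
  })
}

fibInWorker(25).then((result) => console.log(`fib(25) = ${result}`))
```

While the worker computes, the main thread can keep serving requests. In a real app you would more likely keep the worker code in its own file and reuse a small pool of workers instead of starting one per call.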
cluster Module: What It Does, When to Use
The cluster module lets one primary process start multiple worker processes running the same script. Each worker runs its own event loop, has its own memory space, and handles requests independently.
const cluster = require('node:cluster')
const http = require('node:http')
const os = require('node:os')

const numCPUs = os.cpus().length

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} spawning ${numCPUs} workers`)
  // Fork workers - one per CPU core
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork()
  }
  // Handle worker crashes (code is null when a signal killed the worker,
  // so that case is restarted too)
  cluster.on('exit', (worker, code, signal) => {
    if (code !== 0) {
      console.log(`Worker ${worker.process.pid} died (code: ${code}). Restarting...`)
      cluster.fork()
    }
  })
} else {
  // Workers share the same port
  http.createServer((req, res) => {
    res.writeHead(200)
    res.end(`Handled by worker ${process.pid}\n`)
  }).listen(8000)
  console.log(`Worker ${process.pid} started`)
}
When to cluster: CPU-bound operations, high-traffic APIs, any time one CPU core can't handle the load.
When NOT to cluster: if your app is I/O-bound (database queries, API calls) and not saturated, the single thread might be fine. Measure first, cluster second.
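One way to "measure first" is to watch how busy the event loop actually is. The sketch below uses `monitorEventLoopDelay` and `performance.eventLoopUtilization` from `node:perf_hooks`; the `sample` helper, the 20ms resolution, and the 5 second interval are arbitrary choices for illustration, not recommended values.

```javascript
const { monitorEventLoopDelay, performance } = require('node:perf_hooks')

// Histogram of how late event loop timers fire (values are in nanoseconds)
const histogram = monitorEventLoopDelay({ resolution: 20 })
histogram.enable()

let previous = performance.eventLoopUtilization()

// Returns utilization (0..1) and p99 loop delay since the last call
function sample() {
  const current = performance.eventLoopUtilization()
  const delta = performance.eventLoopUtilization(current, previous)
  previous = current
  const p99Ms = histogram.percentile(99) / 1e6 // nanoseconds to milliseconds
  histogram.reset()
  return { utilization: delta.utilization, p99Ms }
}

// Log a sample every 5 seconds; unref() so this timer alone never keeps the process alive
const timer = setInterval(() => console.log(sample()), 5000)
timer.unref()
```

If utilization stays close to 1 and the delay keeps growing under normal traffic, a single process is probably saturated and clustering is worth trying. If both stay low, more processes are unlikely to help.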
Load Balancing: Round-Robin vs Custom
By default, the primary process listens on the port and distributes connections to workers using round-robin (on every platform except Windows).
Round-robin - each new connection goes to the next worker in sequence. Simple and fair, but it doesn't account for worker load.
Scheduling policy - the cluster module has two built-in policies: cluster.SCHED_RR (round-robin) and cluster.SCHED_NONE (let the operating system decide). Anything more custom means writing your own balancer in the primary. Set the policy before the first fork:
const cluster = require('node:cluster')
// Force round-robin on Windows (where it's not the default)
// Must be set before the first cluster.fork() call
cluster.schedulingPolicy = cluster.SCHED_RR
// Or set via env variable:
// export NODE_CLUSTER_SCHED_POLICY=rr
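To make the round-robin idea concrete, here is a tiny sketch of the selection logic. It is an illustration of the concept only, not Node's internal implementation, and `createRoundRobin` and the worker ids are made-up names.

```javascript
// Returns a function that cycles through worker ids in order
function createRoundRobin(workerIds) {
  let next = 0
  return function pick() {
    const id = workerIds[next]
    next = (next + 1) % workerIds.length
    return id
  }
}

const pick = createRoundRobin(['w1', 'w2', 'w3'])
const assignments = Array.from({ length: 5 }, () => pick())
console.log(assignments) // [ 'w1', 'w2', 'w3', 'w1', 'w2' ]
```

Note that `pick` never looks at how busy a worker is, which is exactly the limitation described above: a worker stuck on a slow request still gets its turn.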
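The problem with sticky sessions: a single WebSocket connection always stays on the worker that accepted it, but clients that make several related requests (for example Socket.IO's HTTP long-polling handshake, or sessions kept in a worker's memory) need every one of them to reach the same worker. Round-robin doesn't guarantee that, so you need a custom approach.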
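// Sticky session balancing (simplified)
const cluster = require('node:cluster')
const crypto = require('node:crypto')
const http = require('node:http')
const net = require('node:net')
const os = require('node:os')

if (cluster.isPrimary) {
  const workers = []
  for (let i = 0; i < os.cpus().length; i++) {
    workers.push(cluster.fork())
  }
  // Custom TCP balancer with sticky sessions.
  // pauseOnConnect leaves the socket unread so the worker sees the full request
  const server = net.createServer({ pauseOnConnect: true }, (socket) => {
    // Hash the client IP to a number (a hex string can't be used with %)
    const clientIP = socket.remoteAddress || ''
    const digest = crypto.createHash('md5').update(clientIP).digest()
    const workerIndex = digest.readUInt32BE(0) % workers.length
    workers[workerIndex].send('sticky:connection', socket)
  })
  server.listen(8000)
} else {
  // Worker: never binds the public port, only accepts sockets from the primary
  const server = http.createServer((req, res) => {
    res.end(`Handled by worker ${process.pid}\n`)
  })
  process.on('message', (msg, socket) => {
    if (msg !== 'sticky:connection' || !socket) return
    server.emit('connection', socket) // hand the socket to the HTTP server
    socket.resume()
  })
}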
Honestly though? Just use nginx for sticky sessions. It's more battle-tested than custom cluster routing.
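process.env.NODE_APP_INSTANCE
cluster.fork() does not set NODE_APP_INSTANCE on its own. PM2 sets it in cluster mode (a per-instance index starting from 0); with the plain cluster module you pass it yourself through the env argument of cluster.fork(), as below.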
const cluster = require('node:cluster')

if (cluster.isPrimary) {
  for (let i = 0; i < 4; i++) {
    // Pass the index explicitly; it becomes process.env.NODE_APP_INSTANCE in the worker
    const worker = cluster.fork({ NODE_APP_INSTANCE: i })
    worker.on('message', (msg) => {
      console.log(`Message from worker ${i}:`, msg)
    })
  }
} else {
  const instanceId = process.env.NODE_APP_INSTANCE
  // Each worker handles different data
  const shard = parseInt(instanceId, 10) + 1
  // Worker 0 handles users 1-25, worker 1 handles 26-50, etc
  const userRange = {
    start: shard * 25 - 24,
    end: shard * 25,
  }
  process.send({ pid: process.pid, instanceId, userRange })

  const app = require('express')()
  app.get('/users', (req, res) => {
    res.json({ shard: instanceId, range: userRange })
  })
  app.listen(8000)
}
Use NODE_APP_INSTANCE for sharding data, assigning unique per-worker resources, or coordinating timed jobs among workers.
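For the timed-jobs case, a common pattern is to let only instance 0 run the schedule. The sketch below assumes that convention; `isJobLeader` and `startCleanupJob` are hypothetical helper names, and a missing NODE_APP_INSTANCE is treated as a single-process deployment.

```javascript
// Decide whether this process should run scheduled jobs.
// A missing NODE_APP_INSTANCE is treated as "only one process exists".
function isJobLeader(env = process.env) {
  const instance = env.NODE_APP_INSTANCE
  return instance === undefined || Number.parseInt(instance, 10) === 0
}

// Start a repeating job only on the leader; other instances get null back
function startCleanupJob(intervalMs, job) {
  if (!isJobLeader()) return null
  const timer = setInterval(job, intervalMs)
  // unref() so the timer alone never keeps the process alive;
  // in a real app the HTTP server does that
  timer.unref()
  return timer
}

startCleanupJob(60_000, () => console.log('running cleanup'))
```

This is not leader election: while instance 0 is down, the job simply doesn't run. If that matters, a lock in a shared store such as a database is the more robust route.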
PM2 for Process Management in Production
PM2 is a widely used production process manager. It handles clustering, restarts, logging, and monitoring.
npm install -g pm2
# Start one instance per CPU core
pm2 start app.js -i max
# Start with specific instance count
pm2 start app.js -i 4
# Name the process
pm2 start app.js -i max --name "my-api"
PM2 ecosystem.config.js:
module.exports = {
  apps: [{
    name: 'api-server',
    script: './src/index.js',
    instances: 'max',
    exec_mode: 'cluster',
    env: {
      NODE_ENV: 'production',
      PORT: 3000,
    },
    env_development: {
      NODE_ENV: 'development',
    },
    max_memory_restart: '500M',
    log_date_format: 'YYYY-MM-DD HH:mm:ss Z',
    error_file: './logs/error.log',
    out_file: './logs/output.log',
    merge_logs: true,
    autorestart: true,
    watch: false,
    max_restarts: 10,
    restart_delay: 1000,
  }],
}
# Start with config
pm2 start ecosystem.config.js
# Save process list for auto-restart on reboot
pm2 save
pm2 startup
# Monitoring
pm2 monit
pm2 status
pm2 logs
pm2 show api-server
# Graceful reload (zero downtime)
pm2 reload all
# Scale up/down without restart
pm2 scale api-server 8
PM2 handles worker crashes and memory-based restarts, and log rotation through the separate pm2-logrotate module. Don't write your own clustering script for production - use PM2 or a container orchestration platform like Kubernetes.
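A graceful reload only stays zero-downtime if the app cooperates: PM2 sends SIGINT to the old process, which should stop accepting connections and finish in-flight requests before exiting. Here is a minimal sketch of the app side. `createApp` and `gracefulShutdown` are hypothetical names, the `process.send('ready')` call only matters if you add `wait_ready: true` to the ecosystem config (not shown in the config above), and the server is started only when PORT is set, as in that config.

```javascript
const http = require('node:http')

function createApp() {
  return http.createServer((req, res) => {
    res.end(`Handled by ${process.pid}\n`)
  })
}

// Stop accepting new connections, let in-flight requests finish, then exit.
// `exit` is injectable so the function can be exercised without killing the process.
function gracefulShutdown(server, exit = (code) => process.exit(code), timeoutMs = 1500) {
  // Safety net: give up if connections refuse to drain in time
  const timer = setTimeout(() => exit(1), timeoutMs)
  timer.unref()
  server.close(() => {
    clearTimeout(timer)
    exit(0)
  })
}

if (process.env.PORT) {
  const server = createApp()
  server.listen(Number(process.env.PORT), () => {
    // Tell PM2 the process is ready (only used with wait_ready: true)
    if (process.send) process.send('ready')
  })
  process.on('SIGINT', () => gracefulShutdown(server))
}
```

Keep `timeoutMs` shorter than PM2's `kill_timeout` setting, otherwise PM2 force-kills the process before the safety net fires.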
OS Process Distribution: nginx + Multiple Node Processes
In production you'll often run nginx in front of multiple Node processes (or containers). nginx handles TLS termination, static file serving, and load balancing while each Node process focuses on application logic.
# /etc/nginx/sites-available/api
upstream node_backend {
  # Without any directive nginx balances these round-robin
  server 127.0.0.1:3001;
  server 127.0.0.1:3002;
  server 127.0.0.1:3003;
  server 127.0.0.1:3004;
  # Sticky sessions: ip_hash replaces round-robin for the whole upstream,
  # so each client IP always reaches the same server. Remove it to get
  # plain round-robin back
  ip_hash;
}

server {
  listen 443 ssl http2;
  server_name api.example.com;
  ssl_certificate /etc/ssl/certs/example.crt;
  ssl_certificate_key /etc/ssl/private/example.key;

  # Proxy API requests to Node
  location /api/ {
    proxy_pass http://node_backend;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
  }

  # WebSocket support
  location /ws/ {
    proxy_pass http://node_backend;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_read_timeout 86400;
  }

  # Serve static files directly (zero Node involvement)
  location /static/ {
    root /var/www/myapp;
    expires 30d;
    add_header Cache-Control "public, immutable";
  }

  # Rate limiting. The "login" zone must be declared in the http block,
  # e.g. (example values): limit_req_zone $binary_remote_addr zone=login:10m rate=5r/s;
  location /api/auth/ {
    limit_req zone=login burst=5 nodelay;
    proxy_pass http://node_backend;
  }
}
# Run each Node process separately
node src/index.js --port=3001 &
node src/index.js --port=3002 &
node src/index.js --port=3003 &
node src/index.js --port=3004 &
Or with PM2 for each process:
pm2 start src/index.js --name "node-1" -- --port=3001
pm2 start src/index.js --name "node-2" -- --port=3002
pm2 start src/index.js --name "node-3" -- --port=3003
pm2 start src/index.js --name "node-4" -- --port=3004
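The commands above assume that src/index.js understands a `--port` flag. A minimal sketch of that parsing, using `parseArgs` from `node:util` (available in recent Node versions), could look like this. The `parsePort` helper and the fallback of 3000 are assumptions for illustration.

```javascript
const { parseArgs } = require('node:util')

// Read --port=NNNN from an argv array, falling back to a default
function parsePort(argv, fallback = 3000) {
  const { values } = parseArgs({
    args: argv,
    options: { port: { type: 'string' } },
    strict: false, // ignore flags this sketch doesn't know about
  })
  const port = Number.parseInt(values.port ?? fallback, 10)
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`Invalid port: ${values.port}`)
  }
  return port
}

const port = parsePort(process.argv.slice(2))
console.log(`Would listen on 127.0.0.1:${port}`)
```

Binding each process to 127.0.0.1 rather than all interfaces keeps the Node ports reachable only through nginx.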
Security: Shared Port Binding Requires Root, CAP_NET_BIND_SERVICE
On Linux, by default only root can bind to ports below 1024 (privileged ports). If you want your Node process on port 80 or 443 without root, grant it the CAP_NET_BIND_SERVICE capability.
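# Give node the ability to bind to privileged ports
# Note: this applies to every script run with this binary, and setcap
# needs the real file, not a symlink
sudo setcap cap_net_bind_service=+ep /usr/bin/node
# Verify
getcap /usr/bin/node
# /usr/bin/node = cap_net_bind_service+ep (exact format varies by libcap version)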
Alternative - run nginx on 80/443, proxy to Node on high ports (3000+):
Safer because Node never needs elevated privileges: it runs as an unprivileged user on a high port, and nginx, which has a smaller attack surface than your Node app, is the only thing exposed on 80/443. If someone exploits a vulnerability in your Express routes, they don't get root access.
server {
  listen 80;
  location / {
    proxy_pass http://127.0.0.1:3000;
  }
}
Security hardening for clustered Node apps:
- Never run clusters as root (drop privileges after binding)
- Use process.setgid() and process.setuid(), in that order, after binding ports below 1024
- Limit resource usage per worker, for example with PM2's max_memory_restart for memory and OS-level limits such as ulimit -n for open files
- Restrict inter-process communication - workers shouldn't share secrets
- Implement health checks that kill unresponsive workers
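The privilege-dropping step from the list above can be sketched as follows. `dropPrivileges` is a hypothetical helper and `nodeapp` is a placeholder user and group name. The process object is passed in as a parameter so the call order can be shown with a fake object instead of requiring root.

```javascript
// Drop root privileges once the privileged port is bound.
// `proc` defaults to the real process object.
function dropPrivileges(user, group, proc = process) {
  // getuid is missing on Windows; a non-zero uid means we are not root
  if (typeof proc.getuid !== 'function' || proc.getuid() !== 0) return false
  proc.setgid(group) // group first: after setuid we can no longer change it
  proc.setuid(user)
  return true
}

// Dry run against a fake "root" process to show the order of calls
const calls = []
const fakeRoot = {
  getuid: () => 0,
  setgid: (g) => calls.push(`setgid(${g})`),
  setuid: (u) => calls.push(`setuid(${u})`),
}
dropPrivileges('nodeapp', 'nodeapp', fakeRoot)
console.log(calls) // [ 'setgid(nodeapp)', 'setuid(nodeapp)' ]
```

In a real server you would call `dropPrivileges('nodeapp', 'nodeapp')` inside the `listen` callback, once the port is bound. A fuller version would also reset supplementary groups with `process.initgroups()` before the `setgid` call.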
prerequisites
perf_01_profiling.md - profiling , flamegraphs , memory analysis
next -> perf_03_load_testing.md