Load Testing - Break It Before Production Does

Load Testing Tools: autocannon, k6, wrk, ab

autocannon - Node-native, scriptable, great for HTTP benchmarking. Use it when you want to script complex scenarios in JavaScript and run them in the same ecosystem as your app

npm install -g autocannon

k6 - the professional choice. Scriptable in JavaScript, with excellent reporting, CI/CD integration, and support for large-scale distributed testing. Grafana Labs maintains it, and it is widely used for serious load testing

wrk - fast C-based HTTP benchmark. Minimal features but insane performance. Good for raw throughput testing without the overhead of scripting

ab (Apache Bench) - the granddaddy of load testing. Pre-installed on many systems, basic functionality. Use it for a quick smell test, not for serious benchmarking

# Quick smell test with ab
ab -n 1000 -c 100 http://localhost:3000/api/users

Metrics: RPS, Latency (p50/p95/p99), Error Rate

Requests Per Second (RPS) - how many requests your server handles per second. Higher is better but the number alone is misleading without latency context

Latency percentiles:

  • p50 (median) - the halfway point. Half of all requests take this long or longer
  • p95 - 95% of requests are faster than this. The number your boss cares about
  • p99 - 99% of requests are faster than this. The number that keeps you up at night

# autocannon output example (it prints the 97.5th percentile rather than p95)

| Stat    | 2.5% | 50%  | 97.5% | 99%   | Avg    | Stdev  | Max (ms) |
|---------|------|------|-------|-------|--------|--------|----------|
| Latency | 12ms | 28ms | 145ms | 210ms | 35.2ms | 28.1ms | 512ms    |

| Stat    | 1%   | 2.5% | 50% | 97.5% | Avg   | Stdev | Min |
|---------|------|------|-----|-------|-------|-------|-----|
| Req/Sec | 125  | 150  | 180 | 210   | 178.5 | 22.3  | 120 |

Requests: 10000 total, 10000 success, 0 errors
Duration: 56 seconds
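
To make the percentile columns concrete, here is a minimal sketch of deriving p50/p95/p99 from raw latency samples with the nearest-rank method. The `percentile` helper and the sample values are illustrative assumptions, not part of autocannon. Load-testing tools usually compute percentiles from a histogram, so their numbers can differ slightly.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples')
  const sorted = [...samples].sort((a, b) => a - b)
  const rank = Math.ceil((p / 100) * sorted.length)
  return sorted[Math.max(rank, 1) - 1]
}

// Hypothetical latencies in milliseconds, with one slow outlier.
const latenciesMs = [12, 15, 18, 20, 22, 25, 28, 30, 35, 400]

console.log('p50:', percentile(latenciesMs, 50)) // 22
console.log('p95:', percentile(latenciesMs, 95)) // 400
console.log('p99:', percentile(latenciesMs, 99)) // 400
```

The single 400ms outlier never shows up in p50 but dominates the tail, which is why the tail percentiles matter more than the median.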

Error rate - percentage of requests that returned non-2xx/3xx status codes or timed out. Any errors at all under normal load mean something is wrong
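
A minimal sketch of that definition, assuming you have collected the response status codes and a timeout count yourself. The `errorRate` helper and the sample run are hypothetical.

```javascript
// Error rate as defined above: responses outside 2xx/3xx, plus timeouts,
// divided by all attempted requests.
function errorRate(statusCodes, timeouts = 0) {
  const attempted = statusCodes.length + timeouts
  if (attempted === 0) return 0
  const failedResponses = statusCodes.filter((s) => s < 200 || s >= 400).length
  return (failedResponses + timeouts) / attempted
}

// Hypothetical run: 97 OK, 1 redirect, 1 rate-limited, 1 server error, 2 timeouts.
const codes = [...Array(97).fill(200), 302, 429, 500]
console.log(`${(errorRate(codes, 2) * 100).toFixed(1)}%`) // 3.9%
```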

The relationship between concurrency and latency:

  • Low concurrency (10-50 connections): tests baseline performance
  • Medium concurrency (100-500): reveals queuing and contention
  • High concurrency (1000+): finds bottlenecks in connection handling, thread pools, database pools
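
The link between concurrency and latency follows Little's Law: requests in flight = throughput × average latency. Below is a rough sketch for a closed-loop test, where each connection sends its next request only after the previous one returns. The function names and numbers are illustrative assumptions.

```javascript
// Little's Law rearranged: throughput ceiling for a given connection count
// and average latency.
function maxThroughput(connections, avgLatencyMs) {
  return (connections * 1000) / avgLatencyMs
}

// The other direction: once the server is capped at some RPS, adding
// connections only adds queueing time.
function expectedLatencyMs(connections, requestsPerSec) {
  return (connections * 1000) / requestsPerSec
}

console.log(maxThroughput(100, 25))       // 4000 requests/sec at best
console.log(expectedLatencyMs(500, 4800)) // about 104 ms average
```

If the server cannot go faster, every extra connection shows up as latency rather than throughput.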

Identifying Bottlenecks: CPU, I/O, Memory, Network

CPU bottleneck - the process is compute-bound. Flamegraphs show wide bars for the hot functions. The fix is algorithmic optimization or offloading to worker threads

# Watch CPU during load test
top -p $(pgrep -d',' -f "node")
# or
htop
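
If you prefer to log CPU from inside the app rather than watch `top`, `process.cpuUsage()` can be sampled on a timer. This is a sketch; `startCpuSampler` and the 1 second interval in the usage comment are assumptions, not something the tools above provide.

```javascript
// Returns a function that reports CPU use (percent of one core) since its last call.
function startCpuSampler() {
  let lastUsage = process.cpuUsage()      // { user, system } in microseconds
  let lastTime = process.hrtime.bigint()  // nanoseconds

  return function sample() {
    const usage = process.cpuUsage()
    const now = process.hrtime.bigint()
    const cpuUs = (usage.user - lastUsage.user) + (usage.system - lastUsage.system)
    const elapsedUs = Number(now - lastTime) / 1000
    lastUsage = usage
    lastTime = now
    return elapsedUs > 0 ? (cpuUs / elapsedUs) * 100 : 0
  }
}

// Usage inside the app under test:
// const sampleCpu = startCpuSampler()
// setInterval(() => console.log(`cpu: ${sampleCpu().toFixed(0)}%`), 1000).unref()
```

Values that sit near 100% during the test suggest the main thread is saturated. The figure can exceed 100% because GC and thread pool work run on other cores.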

I/O bottleneck - the process spends most of its time waiting (database queries, file reads, API calls). Latency climbs while CPU stays low. Event loop lag also rises when the I/O is synchronous, because nothing else can run until it returns

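const fs = require('fs')
// Blocking I/O inside a request handler stalls every other request:
const data = fs.readFileSync('/large/file.json') // BLOCKS THE EVENT LOOP
// Non-blocking alternative (inside an async function):
// const data = await fs.promises.readFile('/large/file.json')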
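
Event loop lag can be measured directly with `monitorEventLoopDelay` from Node's built-in `perf_hooks` module. A minimal sketch, where the `startLagMonitor` wrapper and the 20ms resolution are assumptions:

```javascript
const { monitorEventLoopDelay } = require('perf_hooks')

// Start sampling event loop delay. The histogram reports nanoseconds,
// so values are converted to milliseconds for readability.
function startLagMonitor(resolutionMs = 20) {
  const histogram = monitorEventLoopDelay({ resolution: resolutionMs })
  histogram.enable()

  const toMs = (ns) => ns / 1e6
  return {
    snapshot: () => ({
      meanMs: toMs(histogram.mean),
      p99Ms: toMs(histogram.percentile(99)),
      maxMs: toMs(histogram.max),
    }),
    stop: () => histogram.disable(),
  }
}

// Usage inside the app under test:
// const lag = startLagMonitor()
// setInterval(() => console.log(lag.snapshot()), 5000).unref()
```

A p99 that keeps climbing during the test while CPU stays low is a hint that something synchronous is blocking the loop.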

Memory bottleneck - heap grows during load, GC pressure increases, and the process eventually runs out of memory (OOM). Compare heap snapshots taken before and during the test to see what is accumulating

// Watch memory - add this to the app under test
// (a separate `node -e` process would only report its own heap)
setInterval(() => console.log(process.memoryUsage().heapUsed), 1000).unref()
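
Raw heap numbers are hard to read at a glance. One way to turn them into a trend, sketched with hypothetical helper names and a made-up sample shape:

```javascript
// Average heap growth in MB per second between the first and last sample.
// Each sample is { timeMs, heapUsed } with heapUsed in bytes.
function heapGrowthMbPerSec(samples) {
  if (samples.length < 2) return 0
  const first = samples[0]
  const last = samples[samples.length - 1]
  const seconds = (last.timeMs - first.timeMs) / 1000
  if (seconds <= 0) return 0
  return (last.heapUsed - first.heapUsed) / 1024 / 1024 / seconds
}

// Take one sample from the current process.
const takeSample = () => ({
  timeMs: Date.now(),
  heapUsed: process.memoryUsage().heapUsed,
})
```

Growth that stays positive across several GC cycles is worth a heap snapshot. A sawtooth that returns to the same baseline is usually normal.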

Network bottleneck - bandwidth saturation, connection limits, TLS handshake overhead. netstat or ss show connection states

# Check connection states
ss -s
# Total: 1234 (estab 890, closed 200, orphaned 0, synrecv 0, timewait 144)

Setting Up autocannon Test Scripts

const autocannon = require('autocannon')

async function runLoadTest() {
  const url = 'http://localhost:3000'
  const results = []

  // Test various endpoints
  const endpoints = [
    { method: 'GET', path: '/api/users' },
    { method: 'GET', path: '/api/users/1' },
    { method: 'POST', path: '/api/login', body: { email: 'test@test.com', password: 'pass123' } },
    { method: 'GET', path: '/api/orders?page=1&limit=50' },
  ]

  // Run the endpoints one at a time so the tests don't compete with each other
  for (const endpoint of endpoints) {
    const instance = autocannon({
      url: `${url}${endpoint.path}`,
      connections: 100,
      duration: 30,
      pipelining: 1,
      method: endpoint.method,
      body: endpoint.body ? JSON.stringify(endpoint.body) : undefined,
      headers: {
        'content-type': 'application/json',
        ...(endpoint.requiresAuth ? { authorization: 'Bearer test-token' } : {}),
      },
    })

    // Track progress
    autocannon.track(instance, { renderProgressBar: true })

    // Without a callback the instance can be awaited like a promise,
    // so wait for this run to finish before starting the next one
    const result = await instance

    // Collect results
    results.push({
      endpoint: endpoint.path,
      method: endpoint.method,
      latency: {
        avg: result.latency.average,
        p50: result.latency.p50,
        // autocannon exposes p97_5 rather than p95
        p97_5: result.latency.p97_5,
        p99: result.latency.p99,
        max: result.latency.max,
      },
      requests: {
        total: result.requests.total,
        average: result.requests.average,
      },
      errors: result.errors,
      timeouts: result.timeouts,
      non2xx: result.non2xx,
    })
  }

  return results
}

// Execute
runLoadTest()
  .then((results) => {
    console.log('\n=== LOAD TEST RESULTS ===\n')
    results.forEach((r) => {
      console.log(`\n${r.method} ${r.endpoint}`)
      console.log(`  Requests/sec: ${r.requests.average.toFixed(0)}`)
      console.log(`  Latency (p50/p97.5/p99): ${r.latency.p50}ms / ${r.latency.p97_5}ms / ${r.latency.p99}ms`)
      console.log(`  Max latency: ${r.latency.max}ms`)
      console.log(`  Errors: ${r.errors}, Timeouts: ${r.timeouts}, Non-2xx: ${r.non2xx}`)
    })
  })
  .catch((err) => {
    console.error(err)
    process.exitCode = 1
  })

Ramp up the load in stages to find the breaking point:

const autocannon = require('autocannon')

async function findBreakingPoint() {
  const url = 'http://localhost:3000'

  for (const connections of [10, 50, 100, 200, 500, 1000]) {
    console.log(`\n--- Testing with ${connections} concurrent connections ---`)

    const result = await autocannon({
      url,
      connections,
      duration: 15,
    })

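    console.log(`RPS: ${result.requests.average.toFixed(0)}`)
    // autocannon exposes p97_5 rather than p95
    console.log(`p50: ${result.latency.p50}ms | p97.5: ${result.latency.p97_5}ms | p99: ${result.latency.p99}ms`)
    console.log(`Errors: ${result.errors} | Timeouts: ${result.timeouts} | Non-2xx: ${result.non2xx}`)

    // Non-2xx responses are counted separately from connection errors
    if (result.errors > 0 || result.timeouts > 0 || result.non2xx > 0) {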
      console.log(`\n!!! Breaking point found at ${connections} connections !!!`)
      break
    }
  }
}

findBreakingPoint()

Interpreting Results

Good signs:

  • Latency stays flat as concurrency increases (linear scaling)
  • Error rate stays at zero
  • RPS scales proportionally with connections

Bad signs:

  • Latency jumps suddenly at a specific concurrency level (queue backing up)
  • Errors appear as load increases (connection pool exhaustion, database saturation)
  • RPS flatlines or drops after a certain point (bottleneck hit)
  • High p99 but low p50 (some requests are getting stuck - maybe GC pauses or lock contention)

# Good - scales linearly
Connections: 10    RPS: 450    p99: 25ms
Connections: 100   RPS: 4200   p99: 32ms
Connections: 500   RPS: 18500  p99: 48ms

# Bad - hits a wall
Connections: 10    RPS: 450    p99: 25ms
Connections: 100   RPS: 4100   p99: 35ms
Connections: 200   RPS: 4500   p99: 120ms  <-- wall at 200 connections
Connections: 500   RPS: 4800   p99: 890ms  <-- queue backing up
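
Spotting the wall can be automated. The sketch below compares how much RPS grew against how much connections grew at each ramp step. `findWall` and the 0.7 efficiency threshold are assumptions chosen for illustration, not an established rule.

```javascript
// Find the first ramp step where throughput stops keeping up with connections.
// efficiency = (RPS growth factor) / (connection growth factor); 1.0 is linear.
function findWall(steps, minEfficiency = 0.7) {
  for (let i = 1; i < steps.length; i++) {
    const prev = steps[i - 1]
    const curr = steps[i]
    const efficiency = (curr.rps / prev.rps) / (curr.connections / prev.connections)
    if (efficiency < minEfficiency) return curr.connections
  }
  return null
}

// The two runs shown above.
const good = [
  { connections: 10, rps: 450 },
  { connections: 100, rps: 4200 },
  { connections: 500, rps: 18500 },
]
const bad = [
  { connections: 10, rps: 450 },
  { connections: 100, rps: 4100 },
  { connections: 200, rps: 4500 },
  { connections: 500, rps: 4800 },
]

console.log(findWall(good)) // null
console.log(findWall(bad))  // 200
```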

What the wall usually means:

  • Database connection pool exhausted (check pool size)
  • Thread pool exhausted (crypto, compression, filesystem; libuv defaults to 4 threads)
  • Event loop blocked (slow synchronous code)
  • Network bandwidth saturated (less likely on localhost)

Security: DoS Patterns, Rate Limiting Implications

Load testing reveals security characteristics you can't see in unit tests

Rate limiting under load: If your rate limiter uses in-memory state, it breaks in clustered apps because each worker has its own state

// BAD - per-process rate limiting breaks under cluster
const rateLimit = require('express-rate-limit')
const limiter = rateLimit({
  windowMs: 60000,
  max: 100,
  // Each worker tracks independently - user can send up to 400 req/min with 4 workers
})

// GOOD - use shared state (Redis)
const rateLimit = require('express-rate-limit')
const RedisStore = require('rate-limit-redis')
const Redis = require('ioredis')

// Connects to localhost:6379 by default
const client = new Redis()

const limiter = rateLimit({
  store: new RedisStore({
    sendCommand: (...args) => client.call(...args),
  }),
  windowMs: 60000,
  max: 100,
})
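
The 400 req/min figure can be reproduced without Express or Redis. This is a toy simulation: `FixedWindowCounter` stands in for a limiter's store, ignores window expiry, and assumes requests are spread round-robin across workers.

```javascript
// Minimal fixed-window counter, standing in for a rate limiter's store.
class FixedWindowCounter {
  constructor(max) {
    this.max = max
    this.hits = new Map()
  }

  // Record one hit for the key and report whether it is within the limit.
  allow(key) {
    const count = (this.hits.get(key) || 0) + 1
    this.hits.set(key, count)
    return count <= this.max
  }
}

// Send `requests` from one client, distributed round-robin across the stores.
function countAllowed(requests, stores) {
  let allowed = 0
  for (let i = 0; i < requests; i++) {
    if (stores[i % stores.length].allow('client-1')) allowed++
  }
  return allowed
}

const perWorker = [1, 2, 3, 4].map(() => new FixedWindowCounter(100))
const shared = new FixedWindowCounter(100)

console.log(countAllowed(500, perWorker))                        // 400
console.log(countAllowed(500, [shared, shared, shared, shared])) // 100
```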

DoS pattern detection during load tests:

  • Endpoints without auth that consume significant resources (search, report generation)
  • Routes that trigger expensive database operations (JOINs across large tables)
  • Unbounded query parameters that could cause out-of-memory
  • File upload endpoints without size limits

// Identify DoS-vulnerable patterns during load testing

// VULNERABLE under load - SQL injection AND unbounded results!
app.get('/api/search', async (req, res) => {
  const { q } = req.query
  const results = await db.query(
    `SELECT * FROM products WHERE name ILIKE '%${q}%'`
  )
  res.json(results.rows) // assumes a node-postgres style result
})

// FIX - parameterized query with a hard cap on rows
app.get('/api/search', async (req, res) => {
  const { q = '' } = req.query
  const results = await db.query(
    `SELECT * FROM products WHERE name ILIKE $1 LIMIT 100`,
    [`%${q}%`]
  )
  res.json(results.rows) // assumes a node-postgres style result
})
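
If clients are allowed to pass their own `limit`, it needs the same treatment as any other unbounded query parameter. A small sketch, where `clampLimit` and its defaults are hypothetical:

```javascript
// Parse a user-supplied `limit` query parameter into a safe integer range.
function clampLimit(raw, { fallback = 20, max = 100 } = {}) {
  const parsed = Number.parseInt(raw, 10)
  if (!Number.isFinite(parsed) || parsed < 1) return fallback
  return Math.min(parsed, max)
}

console.log(clampLimit('50'))      // 50
console.log(clampLimit('1000000')) // 100
console.log(clampLimit(undefined)) // 20
```

The clamped value can then be passed as a second query parameter instead of hard-coding the LIMIT.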

What your load test should also test:

  • Slowloris-style attacks (slow headers, slow body)
  • Connection exhaustion (open connections and never close them)
  • Payload size attacks (send max-size bodies repeatedly)
  • Concurrent authentication attempts (account lockout behavior)

// Test rate limiting behavior under load
test('rate limiter blocks after threshold', async () => {
  const promises = []

  // Send 101 requests - the 101st should be blocked
  for (let i = 0; i < 101; i++) {
    promises.push(
      fetch('http://localhost:3000/api/login', {
        method: 'POST',
        body: JSON.stringify({ email: 'test@test.com', password: 'test' }),
        headers: { 'content-type': 'application/json' },
      })
    )
  }

  const responses = await Promise.all(promises)
  const statusCodes = responses.map(r => r.status)
  const tooManyRequestsCount = statusCodes.filter(s => s === 429).length

  // Assert that at least some requests were rate-limited
  expect(tooManyRequestsCount).toBeGreaterThan(0)
})
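
For the slow-header, slow-body, and connection-exhaustion cases listed above, Node's HTTP server has timeout settings worth tightening before you test. A sketch with placeholder values; `createHardenedServer` is a hypothetical helper, and the numbers should come from your own load-test results.

```javascript
const http = require('http')

// Hypothetical timeout values - tune them against your own measurements.
function createHardenedServer(handler) {
  const server = http.createServer(handler)
  server.headersTimeout = 10_000  // max time to receive the full request headers
  server.requestTimeout = 30_000  // max time to receive the entire request
  server.keepAliveTimeout = 5_000 // idle time before a keep-alive socket is closed
  server.maxConnections = 1_000   // refuse connections beyond this count
  return server
}

// Usage with an Express app:
// createHardenedServer(app).listen(3000)
```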

prerequisites

perf_02_cluster.md - clustering, PM2, nginx load balancing

next -> No next section - this is the end of the Testing & Debugging series