Core 07 - Zlib / Compression

Basic Idea

  • Network bandwidth costs money and time
  • Compression shrinks data before transport - smaller payload, faster downloads
  • Node's zlib module wraps the zlib and Brotli libraries so you can gzip, deflate, or brotli your data

Core Compression Functions

const zlib = require('zlib')

// compress a buffer in memory
const input = Buffer.from('hello world '.repeat(1000))
const compressed = zlib.gzipSync(input)
console.log('original:', input.length)   // 12000
console.log('gzipped:', compressed.length) // a tiny fraction of the input (very repetitive data)
console.log('ratio:', (compressed.length / input.length * 100).toFixed(1) + '%')

// decompress
const decompressed = zlib.gunzipSync(compressed)
console.log(decompressed.toString().slice(0, 20)) // 'hello world hello wo'

  • Sync versions block the event loop - fine for small configs at startup
  • In production, use the async versions (zlib.gzip() with a callback) or promise wrappers
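One way to get a promise wrapper is `util.promisify`. The sketch below is a minimal example, not the only approach; `roundTrip` is a hypothetical helper name used for illustration.

```javascript
const zlib = require('zlib')
const { promisify } = require('util')

// promise-returning versions of the callback APIs
const gzip = promisify(zlib.gzip)
const gunzip = promisify(zlib.gunzip)

// hypothetical helper: compress, decompress, and report the compressed size
async function roundTrip(text) {
  const compressed = await gzip(Buffer.from(text))
  const restored = await gunzip(compressed)
  return { compressedBytes: compressed.length, text: restored.toString() }
}

roundTrip('hello world '.repeat(1000))
  .then((result) => console.log('compressed bytes:', result.compressedBytes))
  .catch(console.error)
```

The work runs on the libuv thread pool, so the event loop stays free while the buffer is compressed.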

Stream Compression

const zlib = require('zlib')
const fs = require('fs')
const { pipeline } = require('stream/promises')

// gzip compress a file via streams
async function gzipFile(input, output) {
  await pipeline(
    fs.createReadStream(input),
    zlib.createGzip(),
    fs.createWriteStream(output)
  )
  console.log('compressed:', output)
}

// decompress
async function gunzipFile(input, output) {
  await pipeline(
    fs.createReadStream(input),
    zlib.createGunzip(),
    fs.createWriteStream(output)
  )
  console.log('decompressed:', output)
}

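// top-level await is not available in CommonJS, so wrap the calls
async function main() {
  await gzipFile('access.log', 'access.log.gz')
  await gunzipFile('access.log.gz', 'access_restored.log')
}

main().catch(console.error)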

  • Stream compression is where zlib shines - no need to load the entire file into memory
  • You can gzip a 10GB log file while holding only small chunks in memory (read streams default to 64KB chunks)
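To see the constant-memory behavior without a real log file, the sketch below pushes generated lines through `createGzip()` and only counts bytes. The generator, the fake log format, and the `measureStreamedGzip` name are assumptions made for illustration.

```javascript
const zlib = require('zlib')
const { Readable, Writable } = require('stream')
const { pipeline } = require('stream/promises')

// hypothetical helper: gzip `count` generated lines and report byte counts
async function measureStreamedGzip(count) {
  let bytesIn = 0
  let bytesOut = 0
  let largestChunk = 0

  // produce fake log lines one at a time, never as one big buffer
  function* lines() {
    for (let i = 0; i < count; i++) {
      const line = Buffer.from(`GET /item/${i % 50} 200\n`)
      bytesIn += line.length
      yield line
    }
  }

  // sink that counts compressed bytes and then drops them
  const sink = new Writable({
    write(chunk, encoding, callback) {
      bytesOut += chunk.length
      largestChunk = Math.max(largestChunk, chunk.length)
      callback()
    }
  })

  await pipeline(Readable.from(lines()), zlib.createGzip(), sink)
  return { bytesIn, bytesOut, largestChunk }
}

measureStreamedGzip(20000).then(console.log).catch(console.error)
```

`largestChunk` stays small no matter how large `count` gets, which is the point of streaming: memory use depends on chunk size, not on total input size.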

Brotli - The Better Gzip

const zlib = require('zlib')

// brotli - better ratios, slower compression
const input = Buffer.from('hello world '.repeat(1000))

// gzip baseline
const gzipped = zlib.gzipSync(input)
console.log('gzip ratio:', (gzipped.length / input.length * 100).toFixed(1) + '%')

// brotli - often 10-20% smaller than gzip on text; results vary with input and quality
const brotlied = zlib.brotliCompressSync(input)
console.log('brotli ratio:', (brotlied.length / input.length * 100).toFixed(1) + '%')

  • Brotli is a compression algorithm developed by Google - better ratios than gzip
  • Slower to compress at high quality, but decompression is fast
  • Browsers that support it send Accept-Encoding: br and servers answer with Content-Encoding: br - use it for static assets
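Choosing between brotli and gzip usually comes down to what the client lists in Accept-Encoding. The following is a simplified sketch: `pickEncoding` and `encodeBody` are hypothetical helpers, and the header parsing ignores q-value ordering apart from treating `q=0` as a refusal.

```javascript
const zlib = require('zlib')

// hypothetical helper: choose a content coding from an Accept-Encoding value
function pickEncoding(acceptEncoding = '') {
  const accepted = new Set()
  for (const part of acceptEncoding.split(',')) {
    const [name, ...params] = part.trim().toLowerCase().split(';').map((s) => s.trim())
    // "q=0" means the client explicitly refuses this coding
    const refused = params.some((p) => /^q=0(\.0*)?$/.test(p))
    if (name && !refused) accepted.add(name)
  }
  if (accepted.has('br')) return 'br'
  if (accepted.has('gzip')) return 'gzip'
  return 'identity'
}

// hypothetical helper: compress a body with the chosen coding
// (sync calls keep the sketch short; prefer streams or async calls in a server)
function encodeBody(body, encoding) {
  if (encoding === 'br') return zlib.brotliCompressSync(body)
  if (encoding === 'gzip') return zlib.gzipSync(body)
  return body
}

const body = Buffer.from('<h1>hello</h1>'.repeat(200))
const encoding = pickEncoding('gzip, deflate, br')
console.log(encoding, encodeBody(body, encoding).length, 'bytes')
```

The value returned by `pickEncoding` is what would go into the Content-Encoding response header, omitted when it is `identity`.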

Compression Levels

const zlib = require('zlib')
const input = Buffer.from('hello world '.repeat(1000))

// gzip levels 0-9 (0 = no compression, default is equivalent to 6)
const fast = zlib.gzipSync(input, { level: 1 })
console.log('level 1:', fast.length, 'bytes')

const max = zlib.gzipSync(input, { level: 9 })
console.log('level 9:', max.length, 'bytes')

// brotli levels 0-11 (default 11)
const brotliFast = zlib.brotliCompressSync(input, {
  params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 1 }
})
const brotliMax = zlib.brotliCompressSync(input, {
  params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 11 }
})

  • Level 1 is fast but bigger - for dynamic responses (API, HTML)
  • Level 9 (gzip) / 11 (brotli) is smaller but slow - for static assets you compress once
  • Choose based on your use case: CPU time vs bandwidth savings
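The trade-off is easiest to judge by measuring it on data that resembles your own. A rough sketch follows; `compareGzipLevels` and the generated JSON sample are assumptions for illustration, and the timings will vary by machine and input.

```javascript
const zlib = require('zlib')

// hypothetical helper: gzip the same input at several levels, recording size and time
function compareGzipLevels(input, levels = [1, 6, 9]) {
  return levels.map((level) => {
    const start = process.hrtime.bigint()
    const output = zlib.gzipSync(input, { level })
    const ms = Number(process.hrtime.bigint() - start) / 1e6
    return { level, bytes: output.length, ms }
  })
}

// made-up sample: a JSON array similar to an API response
const sample = Buffer.from(JSON.stringify(
  Array.from({ length: 5000 }, (_, i) => ({ id: i, name: `user-${i}`, active: i % 3 === 0 }))
))

console.log('input bytes:', sample.length)
console.table(compareGzipLevels(sample))
```

A single run is noisy, so repeat the measurement a few times before drawing conclusions about CPU cost.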

Deflate vs Gzip

const zlib = require('zlib')
const input = Buffer.from('hello world '.repeat(1000))

// deflate - zlib format: deflate stream + 2-byte header + Adler-32 checksum (RFC 1950)
const deflated = zlib.deflateSync(input)
const inflated = zlib.inflateSync(deflated)

// gzip - deflate stream + gzip header + CRC32 checksum (RFC 1952)
const gzipped = zlib.gzipSync(input)
const gunzipped = zlib.gunzipSync(gzipped)

// deflateRaw - bare deflate stream, no header or checksum (RFC 1951)
const rawDeflated = zlib.deflateRawSync(input)
const rawInflated = zlib.inflateRawSync(rawDeflated)

  • Gzip is deflate with extra wrapping (CRC32 checksum, headers)
  • In Node, deflate means the zlib format (small header, Adler-32 checksum); deflateRaw is the bare compressed stream
  • Use gzip for interoperability - HTTP Content-Encoding: gzip is supported everywhere
  • Content-Encoding: deflate is ambiguous in practice - the spec means the zlib format, but some servers send raw deflate
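The three formats can be told apart by their first bytes and by their overhead. The sketch below compresses one payload three ways and prints the size and leading bytes of each; `summarize` is a hypothetical helper, and the exact sizes depend on the input.

```javascript
const zlib = require('zlib')

const payload = Buffer.from('hello world '.repeat(1000))

const raw = zlib.deflateRawSync(payload)      // RFC 1951: no header, no checksum
const zlibWrapped = zlib.deflateSync(payload) // RFC 1950: 2-byte header + Adler-32
const gzipped = zlib.gzipSync(payload)        // RFC 1952: gzip header + CRC32 + length

// hypothetical helper: one-line description of a compressed buffer
function summarize(label, buf) {
  return `${label}: ${buf.length} bytes, starts with 0x${buf.subarray(0, 2).toString('hex')}`
}

console.log(summarize('raw deflate', raw))
console.log(summarize('zlib', zlibWrapped)) // first byte is 0x78 with the default window size
console.log(summarize('gzip', gzipped))     // gzip magic number is 0x1f 0x8b

// unzipSync auto-detects gzip vs zlib headers; raw deflate needs inflateRawSync
console.log(zlib.unzipSync(gzipped).equals(payload))
console.log(zlib.unzipSync(zlibWrapped).equals(payload))
console.log(zlib.inflateRawSync(raw).equals(payload))
```

Because a raw deflate stream has no header to detect, a client that receives one under Content-Encoding: deflate has to guess, which is where the interoperability trouble comes from.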

Gunzip with Size Limits

const zlib = require('zlib')
const fs = require('fs')
const { Transform } = require('stream')
const { pipeline } = require('stream/promises')

// pass-through stream that fails once more than maxSize bytes have gone through
class LimitTransform extends Transform {
  constructor(maxSize) {
    super()
    this.total = 0
    this.maxSize = maxSize
  }

  _transform(chunk, encoding, callback) {
    this.total += chunk.length
    if (this.total > this.maxSize) {
      return callback(new Error('decompressed size exceeds limit'))
    }
    callback(null, chunk)
  }
}

// pipeline() destroys every stream in the chain when the limit error fires
async function safeDecompress(inputPath, outputPath, maxSize = 100 * 1024 * 1024) {
  await pipeline(
    fs.createReadStream(inputPath),
    zlib.createGunzip(),
    new LimitTransform(maxSize),
    fs.createWriteStream(outputPath)
  )
}

  • Without size limits, a few megabytes of gzip can expand to gigabytes (deflate tops out near 1000:1)
  • That's a decompression bomb - a simple but devastating DoS
  • Always limit decompressed size in production

Security: Zip Bombs and DoS

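// classic zip bomb - 42.zip (about 42KB of nested archives) expands to 4.5PB
// brotli/gzip bombs rely on the same idea: tiny input, huge output

// VULNERABLE
const data = await downloadFile(url) // downloadFile stands in for your own fetch logic
const decompressed = zlib.gunzipSync(data) // output size is effectively unbounded by default
// process OOM or system swap death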

// DEFENSE
const zlib = require('zlib')
const MAX_SIZE = 100 * 1024 * 1024

const decompressor = zlib.createGunzip()
let total = 0

decompressor.on('data', (chunk) => {
  total += chunk.length
  if (total > MAX_SIZE) {
    decompressor.destroy(new Error('decompression bomb detected'))
  }
})

decompressor.on('error', (err) => {
  console.error('decompression failed:', err.message)
})

// pipe into decompressor only after validation

  • Decompression bombs bypass size-based firewalls - the compressed payload looks tiny
  • Always stream decompress with size limits
  • Don't trust Content-Length headers - they describe the compressed size, not the decompressed size
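When the input is already in memory, the one-shot functions can also be capped with the `maxOutputLength` option, which reasonably recent Node.js releases support. A minimal sketch, assuming a 1 MiB cap chosen only for illustration; `gunzipWithCap` is a hypothetical wrapper.

```javascript
const zlib = require('zlib')

const MAX_OUTPUT = 1024 * 1024 // 1 MiB cap, arbitrary for this example

// hypothetical wrapper: decompress, but give up once output would exceed the cap
function gunzipWithCap(data, maxOutputLength = MAX_OUTPUT) {
  try {
    return { ok: true, data: zlib.gunzipSync(data, { maxOutputLength }) }
  } catch (err) {
    // also reached for corrupt input, not only for oversized output
    return { ok: false, reason: err.code || err.message }
  }
}

// 8 MiB of zeros compresses to a few KB, a harmless stand-in for a bomb
const bomb = zlib.gzipSync(Buffer.alloc(8 * 1024 * 1024))
console.log('compressed size:', bomb.length)
console.log('bomb result:', gunzipWithCap(bomb))
console.log('small input:', gunzipWithCap(zlib.gzipSync(Buffer.from('hi'))).data.toString())
```

This suits small payloads such as request bodies; for files or network streams, the streaming limit remains the better fit because the compressed input never has to sit in memory either.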

Summary

  • Gzip for broad compatibility, Brotli for better ratios
  • createGzip() / createGunzip() for streaming
  • Higher levels = smaller but slower - level 6 (gzip) is the default for a reason
  • Always limit decompressed output - zip bombs are trivial to create
  • Deflate is ambiguous - prefer gzip for HTTP Content-Encoding

Prerequisites

next -> core_08_net.md