Core 07 - Zlib / Compression¶
Basic Idea¶
Network bandwidth costs money and time. Compression shrinks data before transport: smaller payloads, faster downloads. Node's zlib module wraps the zlib and Brotli libraries so you can gzip, deflate, or Brotli-compress your data.
Core Compression Functions¶
const zlib = require('zlib')
// compress a buffer in memory
const input = Buffer.from('hello world '.repeat(1000))
const compressed = zlib.gzipSync(input)
console.log('original:', input.length) // 12000
console.log('gzipped:', compressed.length) // ~50
console.log('ratio:', (compressed.length / input.length * 100).toFixed(1) + '%')
// decompress
const decompressed = zlib.gunzipSync(compressed)
console.log(decompressed.toString().slice(0, 20)) // 'hello world hello wo'
The Sync versions block the event loop, which is fine for small configs at startup. In production, use the async versions (zlib.gzip() with a callback) or promise wrappers around them.
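A minimal sketch of the non-blocking variant, assuming `util.promisify` is used to wrap the callback API. The helper name `roundTrip` is made up for this example and is not part of zlib.

```javascript
const zlib = require('zlib')
const { promisify } = require('util')

// Promise wrappers around the callback-based async API
const gzip = promisify(zlib.gzip)
const gunzip = promisify(zlib.gunzip)

// roundTrip is a hypothetical helper used only for illustration
async function roundTrip(text) {
  // the compression work runs off the main thread, so the event loop stays free
  const compressed = await gzip(Buffer.from(text))
  const restored = await gunzip(compressed)
  return { compressedBytes: compressed.length, text: restored.toString() }
}

roundTrip('hello world '.repeat(1000))
  .then((result) => {
    console.log('compressed bytes:', result.compressedBytes)
    console.log('restored length:', result.text.length)
  })
  .catch(console.error)
```

The call shape is the same as the Sync version, but other requests keep being served while the data is compressed.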
Stream Compression¶
const zlib = require('zlib')
const fs = require('fs')
const { pipeline } = require('stream/promises')
// gzip compress a file via streams
async function gzipFile(input, output) {
  await pipeline(
    fs.createReadStream(input),
    zlib.createGzip(),
    fs.createWriteStream(output)
  )
  console.log('compressed:', output)
}
// decompress
async function gunzipFile(input, output) {
  await pipeline(
    fs.createReadStream(input),
    zlib.createGunzip(),
    fs.createWriteStream(output)
  )
  console.log('decompressed:', output)
}
// top-level await is not available in CommonJS, so run the calls inside an async function
async function main() {
  await gzipFile('access.log', 'access.log.gz')
  await gunzipFile('access.log.gz', 'access_restored.log')
}
main().catch(console.error)
Stream compression is where zlib shines: there is no need to load the entire file into memory. You can gzip a 10GB log file while holding only a few small buffers at a time (fs read streams default to 64 KiB chunks).
Brotli - The Better Gzip¶
const zlib = require('zlib')
// brotli - better ratios, slower compression
const input = Buffer.from('hello world '.repeat(1000))
// gzip baseline
const gzipped = zlib.gzipSync(input)
console.log('gzip ratio:', (gzipped.length / input.length * 100).toFixed(1) + '%')
// brotli - usually 10-20% smaller than gzip
const brotlied = zlib.brotliCompressSync(input)
console.log('brotli ratio:', (brotlied.length / input.length * 100).toFixed(1) + '%')
Brotli is Google's compression algorithm and gives better ratios than gzip. At its high quality levels it is slower to compress, but decompression is fast. Browsers support Brotli via Content-Encoding: br, so use it for static assets.
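The sketch below shows one way a server could decide between `br` and `gzip` based on the request's Accept-Encoding header. The helpers `pickEncoding` and `encodeBody` are hypothetical names, and the parsing is deliberately simplified: q-values (including `q=0`) are ignored.

```javascript
const zlib = require('zlib')

// Hypothetical helper: choose a Content-Encoding from an Accept-Encoding value.
// Simplification: q-values (including q=0) are ignored.
function pickEncoding(acceptEncoding = '') {
  const offered = acceptEncoding
    .split(',')
    .map((part) => part.split(';')[0].trim().toLowerCase())
  if (offered.includes('br')) return 'br'
  if (offered.includes('gzip')) return 'gzip'
  return 'identity'
}

// Hypothetical helper: compress a body for the chosen encoding
function encodeBody(body, encoding) {
  if (encoding === 'br') return zlib.brotliCompressSync(body)
  if (encoding === 'gzip') return zlib.gzipSync(body)
  return body // identity: send uncompressed
}

const body = Buffer.from('hello world '.repeat(1000))
const encoding = pickEncoding('gzip, deflate, br')
const encoded = encodeBody(body, encoding)
console.log('Content-Encoding:', encoding, 'bytes:', encoded.length)
```

For static assets the compression would normally happen once at build time rather than per request; the Sync calls here only keep the example short.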
Compression Levels¶
const zlib = require('zlib')
const input = Buffer.from('hello world '.repeat(1000))
// gzip levels 0-9 (0 = no compression, default 6)
const fast = zlib.gzipSync(input, { level: 1 })
console.log('level 1:', fast.length, 'bytes')
const max = zlib.gzipSync(input, { level: 9 })
console.log('level 9:', max.length, 'bytes')
// brotli levels 0-11 (default 11)
const brotliFast = zlib.brotliCompressSync(input, {
  params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 1 }
})
const brotliMax = zlib.brotliCompressSync(input, {
  params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 11 }
})
Level 1 is fast but produces larger output: a good fit for dynamic responses (API, HTML). Level 9 (gzip) or 11 (Brotli) produces smaller output but is slow: a good fit for static assets you compress once. Choose based on your use case: CPU time vs bandwidth savings.
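To see the trade-off on your own data, a rough measurement like the following may help. The sample input is synthetic (made-up log lines) and `measure` is a hypothetical helper; timings depend on the machine, so treat the numbers as indicative only.

```javascript
const zlib = require('zlib')

// Synthetic sample data: log-like lines with some variation (an assumption for this demo)
const lines = []
for (let i = 0; i < 5000; i++) {
  lines.push(`GET /api/items/${i % 977} 200 ${(i * 31) % 5000}ms`)
}
const input = Buffer.from(lines.join('\n'))

// Hypothetical helper: run one compression call, print size and elapsed time
function measure(label, compress) {
  const start = process.hrtime.bigint()
  const output = compress()
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6
  console.log(`${label}: ${output.length} bytes, ${elapsedMs.toFixed(1)} ms`)
  return output.length
}

const quality = zlib.constants.BROTLI_PARAM_QUALITY
console.log('input:', input.length, 'bytes')
const sizes = {
  gzip1: measure('gzip level 1', () => zlib.gzipSync(input, { level: 1 })),
  gzip9: measure('gzip level 9', () => zlib.gzipSync(input, { level: 9 })),
  brotli1: measure('brotli quality 1', () =>
    zlib.brotliCompressSync(input, { params: { [quality]: 1 } })),
  brotli11: measure('brotli quality 11', () =>
    zlib.brotliCompressSync(input, { params: { [quality]: 11 } }))
}
```

Running this against a representative sample of real payloads is a more reliable guide than any general rule about levels.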
Deflate vs Gzip¶
const zlib = require('zlib')
const input = Buffer.from('hello world '.repeat(1000))
// deflate - zlib format (RFC 1950): deflate stream + 2-byte header + Adler-32 checksum
const deflated = zlib.deflateSync(input)
const inflated = zlib.inflateSync(deflated)
// gzip - deflate stream + gzip header + CRC32 checksum (RFC 1952)
const gzipped = zlib.gzipSync(input)
const gunzipped = zlib.gunzipSync(gzipped)
// deflateRaw - bare deflate stream, no header or checksum (RFC 1951)
const rawDeflated = zlib.deflateRawSync(input)
const rawInflated = zlib.inflateRawSync(rawDeflated)
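Gzip is a deflate stream with extra wrapping (a header and a CRC32 checksum). In Node, deflate means the zlib format (a deflate stream with a 2-byte header and an Adler-32 checksum), and deflateRaw is the bare compressed stream. Use gzip for interoperability: HTTP Content-Encoding: gzip is supported everywhere. Content-Encoding: deflate is ambiguous in practice, because some servers send raw deflate when the spec calls for the zlib format.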
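The three containers can be told apart by their leading bytes, which is a quick way to check what a server actually sent. This is a sketch: `sniffFormat` is a hypothetical helper, and a raw deflate stream has no signature, so it can only be reported as "unknown".

```javascript
const zlib = require('zlib')

const input = Buffer.from('hello world '.repeat(1000))
const gzipped = zlib.gzipSync(input)
const deflated = zlib.deflateSync(input)
const raw = zlib.deflateRawSync(input)

// Hypothetical helper: guess the container format from the first two bytes
function sniffFormat(buf) {
  if (buf.length < 2) return 'unknown'
  // gzip magic number (RFC 1952)
  if (buf[0] === 0x1f && buf[1] === 0x8b) return 'gzip'
  // zlib header (RFC 1950): method nibble is 8 and the 16-bit header is a multiple of 31
  if ((buf[0] & 0x0f) === 8 && ((buf[0] << 8) | buf[1]) % 31 === 0) return 'zlib'
  return 'unknown' // possibly raw deflate, which carries no header
}

console.log('gzipSync       ->', sniffFormat(gzipped))
console.log('deflateSync    ->', sniffFormat(deflated))
console.log('deflateRawSync ->', sniffFormat(raw))

// zlib.unzipSync auto-detects gzip vs zlib input; it does not handle raw deflate
console.log(zlib.unzipSync(gzipped).equals(input), zlib.unzipSync(deflated).equals(input))
```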
Gunzip with Size Limits¶
const zlib = require('zlib')
const fs = require('fs')
const { Transform } = require('stream')
const { pipeline } = require('stream/promises')
async function safeDecompress(inputPath, outputPath, maxSize = 100 * 1024 * 1024) {
  await pipeline(
    fs.createReadStream(inputPath),
    zlib.createGunzip(),
    new LimitTransform(maxSize),
    fs.createWriteStream(outputPath)
  )
}
// pass-through stream that fails once more than maxSize bytes have flowed through it
class LimitTransform extends Transform {
  constructor(maxSize) {
    super()
    this.total = 0
    this.maxSize = maxSize
  }
  _transform(chunk, encoding, callback) {
    this.total += chunk.length
    if (this.total > this.maxSize) {
      return callback(new Error('decompressed size exceeds limit'))
    }
    this.push(chunk)
    callback()
  }
}
Without size limits, a compressed file of about 10MB can expand to about 10GB, since deflate can reach roughly 1000:1 on highly repetitive input. That's a decompression bomb: a simple but devastating DoS. Always limit decompressed size in production.
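For data that is already in memory, the convenience methods accept a `maxOutputLength` option that caps the output. The sketch below assumes a Node version that supports this option; the 10 MiB "bomb" and the 1 MiB cap are arbitrary demo values.

```javascript
const zlib = require('zlib')

// Demo payload: 10 MiB of zeros shrinks to a tiny fraction of its size
const original = Buffer.alloc(10 * 1024 * 1024)
const bomb = zlib.gzipSync(original)
console.log('compressed:', bomb.length, 'bytes, expands to:', original.length, 'bytes')

// With a 1 MiB cap, decompression throws instead of allocating the full output
try {
  zlib.gunzipSync(bomb, { maxOutputLength: 1024 * 1024 })
  console.log('decompressed without hitting the cap')
} catch (err) {
  console.error('rejected:', err.message)
}

// A cap above the real size lets legitimate data through
const ok = zlib.gunzipSync(bomb, { maxOutputLength: 16 * 1024 * 1024 })
console.log('within cap:', ok.length, 'bytes')
```

The same option is accepted by the async convenience methods, so a cap can be applied without blocking the event loop.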
Security: Zip Bombs and DoS¶
// classic zip bomb - 42.zip (about 42KB) expands to about 4.5PB through nested archives
// brotli/gzip bombs work the same way
// VULNERABLE (runs inside an async function; downloadFile and url are placeholders)
const data = await downloadFile(url)
const decompressed = zlib.gunzipSync(data) // tries to buffer the entire output in memory
// process OOM or system swap death
// DEFENSE
const zlib = require('zlib')
const MAX_SIZE = 100 * 1024 * 1024
const decompressor = zlib.createGunzip()
let total = 0
decompressor.on('data', (chunk) => {
  total += chunk.length
  if (total > MAX_SIZE) {
    // stop immediately; the 'error' handler below receives this error
    decompressor.destroy(new Error('decompression bomb detected'))
  }
})
decompressor.on('error', (err) => {
  console.error('decompression failed:', err.message)
})
// pipe into decompressor only after validation
Decompression bombs bypass size-based firewalls because the compressed payload looks tiny. Always stream-decompress with a size limit. Don't trust Content-Length headers: they describe the compressed size, not the decompressed size.
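The streaming defense can be wrapped into a reusable promise-returning function. This is a sketch under assumptions: `gunzipWithLimit` is a hypothetical name, the compressed input is assumed to be a Buffer that is small enough to hold in memory, and the limits in the usage lines are arbitrary.

```javascript
const zlib = require('zlib')

// Hypothetical helper: gunzip a compressed Buffer, rejecting once output exceeds maxBytes
function gunzipWithLimit(data, maxBytes) {
  return new Promise((resolve, reject) => {
    const gunzip = zlib.createGunzip()
    const chunks = []
    let total = 0

    gunzip.on('data', (chunk) => {
      total += chunk.length
      if (total > maxBytes) {
        // abort early; the error surfaces through the 'error' event
        gunzip.destroy(new Error('decompression bomb detected'))
        return
      }
      chunks.push(chunk)
    })
    gunzip.on('error', reject)
    gunzip.on('end', () => resolve(Buffer.concat(chunks)))

    // feed the compressed bytes and signal end of input
    gunzip.end(data)
  })
}

const small = zlib.gzipSync(Buffer.from('ok'))
const large = zlib.gzipSync(Buffer.alloc(5 * 1024 * 1024))

gunzipWithLimit(small, 1024)
  .then((buf) => console.log('small payload:', buf.toString()))
  .catch(console.error)
gunzipWithLimit(large, 1024 * 1024)
  .then(() => console.log('unexpected: large payload accepted'))
  .catch((err) => console.error('large payload:', err.message))
```

Because the limit is checked on each output chunk, memory use stays near the cap no matter how large the payload claims to be.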
Summary¶
- Gzip for broad compatibility, Brotli for better ratios
- createGzip()/createGunzip() for streaming
- Higher levels = smaller but slower - level 6 (gzip) is the default for a reason
- Always limit decompressed output - zip bombs are trivial to create
- Deflate is ambiguous - prefer gzip for HTTP content encoding
Prerequisites¶
next -> core_08_net.md