Core 05 - Buffers

Basic Idea

JavaScript strings are UTF-16, which makes them a poor fit for binary data. Images, encryption, network packets, file formats: none of them are nice Unicode strings. Buffer gives you raw memory that works with binary data.

What is a Buffer?

// Buffer is a Uint8Array subclass
const buf = Buffer.alloc(8)
console.log(buf)
// <Buffer 00 00 00 00 00 00 00 00>
console.log(buf instanceof Uint8Array) // true

Every Buffer is a fixed-size chunk of memory allocated outside the V8 heap. That's why they're fast for I/O: no GC overhead for the raw bytes. The memory lives in the C++ layer, not on the garbage-collected JavaScript heap.
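
Two consequences of this design can be seen in a short sketch: a Buffer cannot grow in place, and it can wrap an existing ArrayBuffer without copying. The variable names here are illustrative only.

```javascript
// A Buffer's length is fixed at creation; "growing" means allocating a new one.
const head = Buffer.from([0x01, 0x02])
const tail = Buffer.from([0x03])
const joined = Buffer.concat([head, tail]) // new 3-byte buffer, inputs untouched
console.log(head.length, joined.length)    // 2 3

// Buffer.from(arrayBuffer) wraps the same memory instead of copying it.
const ab = new ArrayBuffer(4)
const view = Buffer.from(ab)
new Uint8Array(ab)[0] = 0xff
console.log(view[0]) // 255
```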

Creating Buffers

// safe - zero-filled, slower
const safe = Buffer.alloc(256)
// <Buffer 00 00 00 ...>

// from existing data
const fromString = Buffer.from('hello', 'utf-8')
// <Buffer 68 65 6c 6c 6f>

const fromArray = Buffer.from([0x48, 0x65, 0x6c, 0x6c, 0x6f])
// <Buffer 48 65 6c 6c 6f>

const fromBuffer = Buffer.from(fromString)
// copies the data - independent buffer

// fast but dangerous - uninitialized memory
const unsafe = Buffer.allocUnsafe(1024)
// contents are unpredictable - may be zeros, may be leftover data!

Buffer.alloc() is safe: it fills with zeros. Buffer.allocUnsafe() is fast but may contain old memory contents such as passwords or crypto keys. Never use allocUnsafe() and then expose the buffer without overwriting it first.
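
As a small sketch of the two allocation styles: Buffer.alloc() takes an optional fill value, and an allocUnsafe() buffer becomes safe to hand out once every byte has been overwritten. The variable names are made up for illustration.

```javascript
// Buffer.alloc accepts an optional fill value (zero when omitted).
const ones = Buffer.alloc(4, 0xff)
console.log(ones) // <Buffer ff ff ff ff>

// allocUnsafe is fine when every byte is overwritten before anyone reads it.
const scratch = Buffer.allocUnsafe(4).fill(0xab)
console.log(scratch) // <Buffer ab ab ab ab>
```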

Reading and Writing Buffers

const buf = Buffer.alloc(16)

// write at various offsets
buf.write('hello', 0, 'utf-8')   // offset 0
buf.write('world', 6, 'utf-8')   // offset 6

// read entire buffer
console.log(buf.toString())       // 'hello\x00world\x00\x00\x00\x00\x00' (byte 5 and the tail are still zero)
console.log(buf.toString('hex'))  // 68656c6c6f00776f726c640000000000

// read specific ranges
console.log(buf.toString('utf-8', 6, 11)) // 'world'

// single bytes
buf[0] = 0x48 // 'H'
console.log(buf[0]) // 72 (decimal)

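Buffers support array indexing for byte-level access. toString(encoding, start, end) decodes only the requested byte range, without creating an intermediate buffer. buf.write() past the end of the buffer does not throw: it writes what fits and drops the rest, and an out-of-range index assignment is ignored. The typed methods such as writeUInt32BE() are stricter and throw a RangeError for an out-of-range offset.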
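
The different out-of-bounds behaviours can be checked with a tiny buffer. This is a minimal sketch; the names small, written and threw are illustrative.

```javascript
const small = Buffer.alloc(4)

// write() returns how many bytes were actually written; the rest is dropped.
const written = small.write('hello')
console.log(written)          // 4
console.log(small.toString()) // 'hell'

// Assigning past the end by index is silently ignored.
small[10] = 0x41
console.log(small.length)     // 4
console.log(small[10])        // undefined

// The typed write methods are stricter and throw a RangeError.
let threw = false
try {
  small.writeUInt8(0x41, 10)
} catch (err) {
  threw = err instanceof RangeError
}
console.log(threw) // true
```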

Buffer Encoding

const data = 'Hello 0x1RIS'

// utf-8 - default, variable width, handles everything
const utf8Buf = Buffer.from(data, 'utf-8')

// base64 - 33% larger, safe for transport
const b64Buf = Buffer.from(data).toString('base64')
console.log(b64Buf) // 'SGVsbG8gMHgxUklT'

const decoded = Buffer.from(b64Buf, 'base64').toString()
console.log(decoded) // 'Hello 0x1RIS'

// hex - each byte becomes 2 hex chars
const hexBuf = Buffer.from('0x1RIS').toString('hex')
console.log(hexBuf) // 307831524953

// ascii - 7-bit, decoding silently clears the high bit of each byte
// latin1 - ISO-8859-1, 8-bit, every byte stays in range 0-255

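Base64 and hex are lossless for arbitrary bytes, at the cost of size (about 33% and 100% larger). UTF-8 round-trips any valid text, but decoding arbitrary binary as UTF-8 is lossy because invalid sequences are replaced with U+FFFD. Choose the encoding based on your transport mechanism: base64 for JSON, hex for debugging, utf-8 for text.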
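
The difference between a lossless and a lossy round trip can be sketched with a few bytes that are not valid UTF-8. The byte values and variable names are chosen purely for illustration.

```javascript
// Bytes that do not form valid UTF-8.
const raw = Buffer.from([0xff, 0xfe, 0x00, 0x80])

// base64 and hex reproduce the bytes exactly.
const viaBase64 = Buffer.from(raw.toString('base64'), 'base64')
const viaHex = Buffer.from(raw.toString('hex'), 'hex')
console.log(viaBase64.equals(raw), viaHex.equals(raw)) // true true

// Decoding as UTF-8 replaces invalid bytes with U+FFFD, so the round trip is lossy.
const viaUtf8 = Buffer.from(raw.toString('utf-8'), 'utf-8')
console.log(viaUtf8.equals(raw)) // false

// String length and byte length differ for non-ASCII text.
const text = 'h\u00e9llo'
console.log(text.length, Buffer.byteLength(text, 'utf-8')) // 5 6
```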

Slice vs Copy

const original = Buffer.from('Hello World')

// slice - NO copy, shares memory (deprecated in current Node.js; subarray() behaves the same)
const sliced = original.slice(0, 5)
sliced[0] = 0x4a // 'J' - modifies original too!
console.log(original.toString()) // 'Jello World'

// copy - independent buffer
const copied = Buffer.alloc(5)
original.copy(copied, 0, 0, 5)
copied[0] = 0x48 // original stays the same

slice() is fast but dangerous: mutations affect the parent. If you need a safe subset, use Buffer.from(sliced) or .copy() to create independent memory.
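
The same contrast, sketched with subarray() (the non-deprecated way to get a shared view) next to Buffer.from() for an independent copy. Variable names are illustrative.

```javascript
const parent = Buffer.from('Hello World')

const shared = parent.subarray(0, 5)    // view over the same bytes
const independent = Buffer.from(shared) // new allocation with copied bytes

parent[0] = 0x4a // 'J'
console.log(shared.toString())      // 'Jello'
console.log(independent.toString()) // 'Hello'
```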

Binary Data Manipulation

// writing numeric types
const buf = Buffer.alloc(8)
buf.writeUInt32BE(0xDEADBEEF, 0)   // big-endian uint32 at offset 0
buf.writeUInt32LE(0xCAFEBABE, 4)   // little-endian uint32 at offset 4

// reading them back
console.log(buf.readUInt32BE(0).toString(16)) // deadbeef
console.log(buf.readUInt32LE(4).toString(16)) // cafebabe

// int16, int8, floats, doubles all exist
buf.writeInt16LE(-32768, 0)   // 2 bytes at offset 0
buf.writeDoubleBE(3.14159, 0) // 8 bytes - fills the whole buffer (offset 2 would throw a RangeError here)

This is how you parse binary protocols (TCP packets, file formats, IMAP responses). Endianness matters: network protocols are big-endian, x86 CPUs are little-endian. Wrong endianness = garbage data: no error, just wrong numbers.
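
As a sketch of protocol parsing, the following encodes and decodes a made-up 8-byte header. The layout (version, type, length, sequence) and the function names encodeHeader and decodeHeader are assumptions for illustration, not a real protocol.

```javascript
// Hypothetical 8-byte header: version (u8), type (u8), length (u16 BE), sequence (u32 BE)
function encodeHeader({ version, type, length, sequence }) {
  const out = Buffer.alloc(8)
  out.writeUInt8(version, 0)
  out.writeUInt8(type, 1)
  out.writeUInt16BE(length, 2)
  out.writeUInt32BE(sequence, 4)
  return out
}

function decodeHeader(buf) {
  if (buf.length < 8) throw new RangeError('header needs 8 bytes')
  return {
    version: buf.readUInt8(0),
    type: buf.readUInt8(1),
    length: buf.readUInt16BE(2),
    sequence: buf.readUInt32BE(4),
  }
}

const header = encodeHeader({ version: 1, type: 2, length: 512, sequence: 7 })
console.log(header)               // <Buffer 01 02 02 00 00 00 00 07>
console.log(decodeHeader(header)) // { version: 1, type: 2, length: 512, sequence: 7 }

// Reading the length field with the wrong endianness gives a different number.
console.log(header.readUInt16LE(2)) // 2
```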

Security: allocUnsafe() Leaks

// DANGER - this leaks memory contents
function processData(input) {
  const buf = Buffer.allocUnsafe(4096)
  // buf contains whatever was in that memory before
  // passwords , crypto keys , session tokens from other requests
  input.copy(buf)
  return buf
}

// SAFE - zero-fill
function processDataSafe(input) {
  const buf = Buffer.alloc(4096)
  input.copy(buf)
  return buf
}

// or zero only the part you did not overwrite
function processDataFast(input) {
  const buf = Buffer.allocUnsafe(4096)
  const written = input.copy(buf) // copy() returns the number of bytes copied
  buf.fill(0, written)            // clear the uninitialized tail
  return buf
}

allocUnsafe() creates a buffer without clearing the allocated memory. If that memory previously held sensitive data, for example from an earlier request handled by the same process, the new buffer inherits those bytes - congratulations, you just leaked someone else's session.

Never use allocUnsafe() in code paths that process sensitive data. Never return an allocUnsafe() buffer to a client without zeroing it first.
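
One pattern where allocUnsafe() does not leak is when the buffer is sized to exactly what will be written, so no uninitialized byte survives. A minimal sketch, with cloneBytes as a hypothetical helper name:

```javascript
// Hypothetical helper: every byte of the result is overwritten by copy(),
// so no uninitialized memory can escape.
function cloneBytes(input) {
  const out = Buffer.allocUnsafe(input.length)
  input.copy(out) // copies input.length bytes, filling out completely
  return out
}

const cloned = cloneBytes(Buffer.from('abc'))
console.log(cloned) // <Buffer 61 62 63>
```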

Summary

Buffer.alloc() for safety, Buffer.from() for existing data
allocUnsafe() is fast but you must overwrite or zero it - or it leaks memory contents
slice() shares memory - Buffer.from(slice) for independence
Use correct read/write methods for binary parsing (endianness matters)
Buffer memory lives outside the V8 heap - no GC overhead but fixed size

Prerequisites

core_04_events.md

next -> core_06_streams.md