Core 05 - Buffers
Basic Idea
JavaScript strings are UTF-16, which makes them a poor fit for binary data. Images, encryption, network packets, file formats: none of them are nice Unicode strings. Buffer gives you raw memory that works with binary data.
What is a Buffer?
// Buffer is a Uint8Array subclass
const buf = Buffer.alloc(8)
console.log(buf)
// <Buffer 00 00 00 00 00 00 00 00>
console.log(buf instanceof Uint8Array) // true
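Every Buffer is a fixed-size chunk of memory allocated outside the V8 heap. That's why they're fast for I/O: the garbage collector never has to scan or move the raw bytes. The bytes live in memory managed by the C++ layer; only the small Buffer object that points at them is tracked by the garbage collector, and the memory is released once that object is collected.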
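To make the "Uint8Array subclass" point concrete, here is a minimal sketch showing that a Buffer is a view over an ArrayBuffer and that a plain Uint8Array can look at the same bytes. The variable names are illustrative only.

```javascript
const bytes = Buffer.alloc(4)

// Every Buffer exposes the ArrayBuffer that backs it
console.log(bytes.buffer instanceof ArrayBuffer) // true

// A plain typed-array view over the same memory region
const view = new Uint8Array(bytes.buffer, bytes.byteOffset, bytes.length)
view[0] = 0xff

// The write through the view is visible through the Buffer
console.log(bytes[0]) // 255
```

Passing byteOffset and length matters because a Buffer does not always start at offset 0 of its ArrayBuffer.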
Creating Buffers
// safe - zero-filled, slower
const safe = Buffer.alloc(256)
// <Buffer 00 00 00 ...>
// from existing data
const fromString = Buffer.from('hello', 'utf-8')
// <Buffer 68 65 6c 6c 6f>
const fromArray = Buffer.from([0x48, 0x65, 0x6c, 0x6c, 0x6f])
// <Buffer 48 65 6c 6c 6f>
const fromBuffer = Buffer.from(fromString)
// copies the data - independent buffer
// fast but dangerous - uninitialized memory
const unsafe = Buffer.allocUnsafe(1024)
// contents are arbitrary - may be zeros, may be leftover data!
Buffer.alloc() is safe: it fills the memory with zeros. Buffer.allocUnsafe() is faster because it skips that step, so the buffer may contain leftover data from the process's memory, such as passwords or crypto keys. Never use allocUnsafe() and then expose the buffer without overwriting every byte first.
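One case where allocUnsafe() is reasonable is when every byte is overwritten before the buffer leaves the function. The sketch below assumes a hypothetical helper named joinBuffers; it is an illustration of the pattern, not a recommended utility.

```javascript
// Hypothetical helper: join two buffers into one.
// allocUnsafe() is acceptable here only because the two copies
// together overwrite all a.length + b.length bytes of `out`.
function joinBuffers(a, b) {
  const out = Buffer.allocUnsafe(a.length + b.length)
  a.copy(out, 0)        // fills bytes [0, a.length)
  b.copy(out, a.length) // fills bytes [a.length, out.length)
  return out
}

console.log(joinBuffers(Buffer.from('ab'), Buffer.from('cd')).toString()) // 'abcd'
```

In real code Buffer.concat() already does this job; the point is only that the uninitialized bytes never become visible.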
Reading and Writing Buffers
const buf = Buffer.alloc(16)
// write at various offsets
buf.write('hello', 0, 'utf-8') // offset 0
buf.write('world', 6, 'utf-8') // offset 6
// read entire buffer
console.log(buf.toString()) // 'hello\x00world\x00\x00\x00\x00\x00' - unwritten bytes are NUL
console.log(buf.toString('hex')) // 68656c6c6f00776f726c640000000000
// read specific ranges
console.log(buf.toString('utf-8', 6, 11)) // 'world'
// single bytes
buf[0] = 0x48 // 'H'
console.log(buf[0]) // 72 (decimal)
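Buffers support array indexing for byte-level access. toString(encoding, start, end) decodes just the requested byte range, with no intermediate Buffer copy. Writes that don't fit are not always reported: buf.write() silently truncates a string that runs past the end and returns the number of bytes it actually wrote, and an indexed assignment past the end is ignored. The numeric write methods (writeUInt32BE() and friends) throw a RangeError instead.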
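The three out-of-bounds behaviours are easy to confuse, so here is a small sketch that exercises each one on a 4-byte buffer. The variable names are illustrative.

```javascript
const small = Buffer.alloc(4)

// buf.write() truncates and reports how many bytes it wrote
const written = small.write('hello')
console.log(written)          // 4
console.log(small.toString()) // 'hell'

// Indexed assignment past the end is silently ignored
small[10] = 0xff
console.log(small.length)     // 4

// Numeric writers refuse to write out of bounds
let threw = false
try {
  small.writeUInt32BE(1, 2) // needs bytes 2..5, but only 2..3 exist
} catch (err) {
  threw = err instanceof RangeError
}
console.log(threw)            // true
```

Checking the return value of buf.write() against Buffer.byteLength(string) is one way to detect truncation.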
Buffer Encoding
const data = 'Hello 0x1RIS'
// utf-8 - default, variable width, handles everything
const utf8Buf = Buffer.from(data, 'utf-8')
// base64 - 33% larger, safe for transport
const b64Buf = Buffer.from(data).toString('base64')
console.log(b64Buf) // 'SGVsbG8gMHgxUklT'
const decoded = Buffer.from(b64Buf, 'base64').toString()
console.log(decoded) // 'Hello 0x1RIS'
// hex - each byte becomes 2 hex chars
const hexBuf = Buffer.from('0x1RIS').toString('hex')
console.log(hexBuf) // 307831524953
// ascii - 7-bit, decoding clears the high bit of every byte silently
// latin1 - ISO-8859-1, 8-bit, every byte stays in range 0-255
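Base64 and hex encoding are lossless for any bytes. UTF-8 round-trips valid text, but it is not safe for arbitrary binary: byte sequences that are not valid UTF-8 are replaced with U+FFFD when decoded, so the original bytes are lost. Choose the encoding based on your transport mechanism: base64 for JSON, hex for debugging, utf-8 for text.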
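A quick way to see the difference is to round-trip bytes that are not valid UTF-8 through each encoding. This is a minimal sketch; the byte values are arbitrary.

```javascript
// Four bytes that do not form valid UTF-8 text
const raw = Buffer.from([0xff, 0xfe, 0x00, 0x80])

// Decode to a string, then encode back with the same encoding
const viaUtf8 = Buffer.from(raw.toString('utf-8'), 'utf-8')
const viaBase64 = Buffer.from(raw.toString('base64'), 'base64')
const viaHex = Buffer.from(raw.toString('hex'), 'hex')

console.log(raw.equals(viaUtf8))   // false - invalid bytes were replaced
console.log(raw.equals(viaBase64)) // true
console.log(raw.equals(viaHex))    // true
```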
Slice vs Copy
const original = Buffer.from('Hello World')
// slice - NO copy, shares memory
const sliced = original.slice(0, 5)
sliced[0] = 0x4a // 'J' - modifies original too!
console.log(original.toString()) // 'Jello World'
// copy - independent buffer
const copied = Buffer.alloc(5)
original.copy(copied, 0, 0, 5)
copied[0] = 0x48 // original stays the same
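slice() is fast but dangerous: it returns a view over the same memory, so mutations affect the parent. Current Node.js releases deprecate buf.slice() in favor of buf.subarray(), which behaves the same way. If you need a safe subset, use Buffer.from(sliced) or .copy() to create independent memory.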
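The same distinction, sketched with subarray() for the shared view and Buffer.from() for the independent copy. Variable names are illustrative.

```javascript
const parent = Buffer.from('Hello World')

const shared = parent.subarray(0, 5)   // view: same memory as parent
const independent = Buffer.from(shared) // copy: its own memory

shared[0] = 0x4a // 'J'

console.log(parent.toString())      // 'Jello World' - changed through the view
console.log(independent.toString()) // 'Hello' - copied before the write, unaffected
```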
Binary Data Manipulation
// writing numeric types
const buf = Buffer.alloc(8)
buf.writeUInt32BE(0xDEADBEEF, 0) // big-endian uint32 at offset 0
buf.writeUInt32LE(0xCAFEBABE, 4) // little-endian uint32 at offset 4
// reading them back
console.log(buf.readUInt32BE(0).toString(16)) // deadbeef
console.log(buf.readUInt32LE(4).toString(16)) // cafebabe
// int16, int8, floats, doubles all exist
const nums = Buffer.alloc(10) // 2 bytes for the int16 + 8 bytes for the double
nums.writeInt16LE(-32768, 0)
nums.writeDoubleBE(3.14159, 2)
This is how you parse binary protocols and formats (TCP packets, file formats). Endianness matters: most network protocols are big-endian, while x86 CPUs are little-endian. Reading with the wrong endianness produces garbage: no error, just wrong numbers.
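As an illustration of protocol parsing, the sketch below encodes and decodes a made-up 8-byte header. The layout, field names, and the helpers encodeHeader and parseHeader are all hypothetical and do not correspond to any real protocol.

```javascript
// Hypothetical 8-byte header layout (not a real protocol):
//   offset 0: magic          uint16, big-endian
//   offset 2: version        uint8
//   offset 3: flags          uint8
//   offset 4: payloadLength  uint32, big-endian
const HEADER_SIZE = 8

function encodeHeader({ magic, version, flags, payloadLength }) {
  const header = Buffer.alloc(HEADER_SIZE)
  header.writeUInt16BE(magic, 0)
  header.writeUInt8(version, 2)
  header.writeUInt8(flags, 3)
  header.writeUInt32BE(payloadLength, 4)
  return header
}

function parseHeader(buf) {
  // Fail loudly on short input instead of reading past the end
  if (buf.length < HEADER_SIZE) {
    throw new RangeError(`need ${HEADER_SIZE} bytes, got ${buf.length}`)
  }
  return {
    magic: buf.readUInt16BE(0),
    version: buf.readUInt8(2),
    flags: buf.readUInt8(3),
    payloadLength: buf.readUInt32BE(4),
  }
}

const header = encodeHeader({ magic: 0xcafe, version: 1, flags: 0, payloadLength: 1 })
console.log(parseHeader(header))
// { magic: 51966, version: 1, flags: 0, payloadLength: 1 }

// Same four bytes, wrong endianness: a valid-looking but wrong number
console.log(header.readUInt32LE(4)) // 16777216
```

Keeping the offsets in one place, next to the layout comment, makes it harder for the encoder and the parser to drift apart.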
Security: allocUnsafe() Leaks
// DANGER - this leaks memory contents
function processData(input) {
  const buf = Buffer.allocUnsafe(4096)
  // buf contains whatever was in that memory before
  // passwords, crypto keys, session tokens from other requests
  input.copy(buf)
  return buf
}
// SAFE - zero-fill
function processDataSafe(input) {
  const buf = Buffer.alloc(4096)
  input.copy(buf)
  return buf
}
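// or overwrite every byte that input did not fill
function processDataFast(input) {
  const buf = Buffer.allocUnsafe(4096)
  const written = input.copy(buf) // copy() returns the number of bytes copied
  buf.fill(0, written) // zero the uninitialized tail before exposing buf
  return buf
}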
allocUnsafe() creates a buffer without clearing the allocated memory. If that memory previously held sensitive data, for example from an earlier request handled by the same process, the new buffer inherits those bytes. Congratulations, you just leaked someone else's session.
Never use allocUnsafe() in code paths that process sensitive data. Never return an allocUnsafe() buffer to a client unless every byte has been overwritten or zeroed first.
Summary
- Buffer.alloc() for safety, Buffer.from() for existing data
- allocUnsafe() is fast but you must zero it, or it leaks memory contents
- slice() shares memory - Buffer.from(slice) for independence
- Use correct read/write methods for binary parsing (endianness matters)
- Buffer memory lives outside V8 heap - no GC overhead but fixed size
Prerequisites
next -> core_06_streams.md