Binary Formats
Executable file formats define how the operating system loads and runs code.
Understanding the PE (Windows), ELF (Linux), and Mach-O (macOS) formats is fundamental to reverse engineering because much of what a binary can do is encoded in its headers and structure.
PE Format (Windows)
DOS Header (MZ)
|
DOS Stub (MS-DOS stub program)
|
PE Signature (PE\0\0)
|
File Header (Machine, #Sections, Timestamp, Symbols)
|
Optional Header (Entry Point, Image Base, Subsystem)
|
Section Headers (.text, .data, .rdata, .idata, .rsrc, .reloc)
|
Sections (actual data)
Portable Executable is the format for Windows EXE, DLL, SYS, and SCR files:
- DOS Header - 64 bytes starting with MZ magic (0x5A4D)
- PE Signature - PE\0\0 at offset e_lfanew from DOS header
- File Header - Machine type, number of sections, characteristics
- Optional Header - Entry point RVA, image base, subsystem, data directories
- Section Headers - Name, virtual size, virtual address, raw size, characteristics
- Data Directories - Import table, export table, resources, relocations, TLS, debug
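The first three items in the list above can be walked with nothing more than the standard struct module. The following is a minimal sketch, not a full PE parser: it stops at the File Header and does not read the Optional Header or section table. The names parse_pe_headers and make_sample_pe are hypothetical, and the sample image is synthetic (header bytes only) so the example runs without a real executable.

```python
import struct


def parse_pe_headers(data: bytes) -> dict:
    """Read the DOS header, PE signature, and File Header from raw bytes."""
    if len(data) < 64 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ magic")
    # e_lfanew is a 4-byte little-endian file offset stored at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    if len(data) < e_lfanew + 24:
        raise ValueError("truncated File Header")
    # The 20-byte File Header immediately follows the 4-byte signature.
    (machine, num_sections, timestamp, _symtab_ptr, _num_symbols,
     opt_header_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, e_lfanew + 4)
    return {
        "e_lfanew": e_lfanew,
        "machine": machine,
        "num_sections": num_sections,
        "timestamp": timestamp,
        "opt_header_size": opt_header_size,
        "characteristics": characteristics,
    }


def make_sample_pe() -> bytes:
    """Build a synthetic, header-only image (hypothetical helper for the demo)."""
    dos_header = bytearray(64)
    dos_header[:2] = b"MZ"
    struct.pack_into("<I", dos_header, 0x3C, 0x80)  # e_lfanew = 0x80
    dos_stub = bytes(0x80 - 64)                      # zero-filled stand-in for the stub
    # Machine 0x8664 (x64), 3 sections, arbitrary timestamp, characteristics 0x0022.
    file_header = struct.pack("<HHIIIHH", 0x8664, 3, 0x5F000000, 0, 0, 0xF0, 0x0022)
    return bytes(dos_header) + dos_stub + b"PE\0\0" + file_header


info = parse_pe_headers(make_sample_pe())
print(hex(info["machine"]), info["num_sections"])
```

To try it on a real file, pass the bytes read from disk (for example, open(path, "rb").read()) instead of the synthetic sample.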
ELF Format (Linux)
ELF Header
|
Program Headers (segments - for loader)
|
Sections (data)
|
Section Headers (optional - for linkers)
Executable and Linkable Format for Linux and Unix-like systems:
- ELF Header - Magic \x7fELF, class (32/64-bit), endianness, OS/ABI, entry point
- Program Headers - Describe segments for the loader (PT_LOAD, PT_DYNAMIC, PT_INTERP)
- Section Headers - Describe sections for linkers and debuggers (.text, .data, .bss, .rodata, .plt, .got)
- PLT/GOT - Procedure Linkage Table and Global Offset Table (dynamic linking mechanism)
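The ELF Header fields listed above sit at fixed offsets, so they can be decoded with a single struct format once the class and endianness bytes are known. This is a rough sketch that reads only the ELF Header and does not follow the program or section header tables. The names parse_elf_header and make_sample_elf are hypothetical, and the sample bytes are synthetic.

```python
import struct


def parse_elf_header(data: bytes) -> dict:
    """Decode the ELF Header: identification bytes plus the fixed fields after them."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file: missing \\x7fELF magic")
    ei_class, ei_data, ei_osabi = data[4], data[5], data[7]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("invalid class or endianness byte")
    endian = "<" if ei_data == 1 else ">"      # 1 = little-endian, 2 = big-endian
    addr = "I" if ei_class == 1 else "Q"       # 1 = 32-bit, 2 = 64-bit addresses
    # e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
    # e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
    fmt = endian + "HHI" + addr * 3 + "IHHHHHH"
    if len(data) < 16 + struct.calcsize(fmt):
        raise ValueError("truncated ELF Header")
    (e_type, e_machine, _version, e_entry, e_phoff, e_shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = struct.unpack_from(fmt, data, 16)
    return {
        "bits": 32 if ei_class == 1 else 64,
        "endianness": "little" if ei_data == 1 else "big",
        "osabi": ei_osabi,
        "type": e_type,
        "machine": e_machine,
        "entry": e_entry,
        "phoff": e_phoff,
        "phnum": e_phnum,
        "shoff": e_shoff,
        "shnum": e_shnum,
    }


def make_sample_elf() -> bytes:
    """Build a synthetic 64-bit little-endian header (hypothetical demo helper)."""
    e_ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    # e_type 3 (shared object / PIE), e_machine 62 (x86-64), entry 0x1040,
    # program headers at offset 64, two program headers, no section headers.
    rest = struct.pack("<HHIQQQIHHHHHH",
                       3, 62, 1, 0x1040, 64, 0, 0, 64, 56, 2, 64, 0, 0)
    return e_ident + rest


header = parse_elf_header(make_sample_elf())
print(header["bits"], header["endianness"], hex(header["entry"]))
```

A zero section-header count, as in the sample, is legal: section headers are optional for loading, which is why stripped binaries still run.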
Mach-O Format (macOS/iOS)
Header (Magic, CPU, Filetype, Commands)
|
Load Commands
|
Segments and Sections
|
Linkedit data
Mach-O organizes the file around a list of load commands rather than separate program and section header tables:
- Header - Magic 0xFEEDFACE (32-bit) or 0xFEEDFACF (64-bit)
- Load Commands - Describe layout to loader (LC_SEGMENT, LC_MAIN, LC_LOAD_DYLIB)
- Segments - __TEXT (code), __DATA (data), __LINKEDIT (symbols)
- Universal Binary - Multiple architectures in one file (FAT binary)
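Telling a thin Mach-O from a Universal Binary comes down to the first four bytes. The sketch below reads them as a big-endian integer and, for thin files, decodes the fixed header fields that follow. It assumes the standard magic values (including the byte-swapped forms that appear when the file's byte order differs from the reader's) and the fat magic 0xCAFEBABE; the function name classify_macho is hypothetical. Note that Java class files begin with the same 0xCAFEBABE bytes, so a real tool would need an extra check before trusting the fat result.

```python
import struct

# First four bytes read as a big-endian integer -> (word size, byte order of the file).
THIN_MAGICS = {
    0xFEEDFACE: (32, "big"),
    0xCEFAEDFE: (32, "little"),
    0xFEEDFACF: (64, "big"),
    0xCFFAEDFE: (64, "little"),
}
FAT_MAGIC = 0xCAFEBABE


def classify_macho(data: bytes) -> dict:
    """Classify raw bytes as a thin Mach-O or a fat (Universal) binary."""
    if len(data) < 8:
        raise ValueError("too short to hold a Mach-O or fat header")
    (magic,) = struct.unpack_from(">I", data, 0)
    if magic == FAT_MAGIC:
        # Fat headers are stored big-endian; the next field counts architecture slices.
        (nfat_arch,) = struct.unpack_from(">I", data, 4)
        return {"kind": "fat", "nfat_arch": nfat_arch}
    if magic not in THIN_MAGICS:
        raise ValueError("unrecognized magic")
    bits, order = THIN_MAGICS[magic]
    endian = "<" if order == "little" else ">"
    if len(data) < 28:
        raise ValueError("truncated Mach-O header")
    # magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags
    (_magic, cputype, _cpusubtype, filetype, ncmds, sizeofcmds,
     _flags) = struct.unpack_from(endian + "IiiIIII", data, 0)
    return {
        "kind": "thin",
        "bits": bits,
        "endianness": order,
        "cputype": cputype,
        "filetype": filetype,
        "ncmds": ncmds,
        "sizeofcmds": sizeofcmds,
    }


# Synthetic headers for demonstration: a 64-bit little-endian executable
# (cputype 0x01000007 is x86-64, filetype 2 is an executable) and a two-slice fat file.
thin_sample = struct.pack("<IiiIIII", 0xFEEDFACF, 0x01000007, 3, 2, 16, 1024, 0) + bytes(4)
fat_sample = struct.pack(">II", FAT_MAGIC, 2)
print(classify_macho(thin_sample)["kind"], classify_macho(fat_sample)["kind"])
```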
Key Analysis Points for Each Format
PE:
- Imports reveal API usage (network, persistence, evasion)
- Sections: abnormal names or permissions can indicate packing
- Resources: embedded executables or configuration
- Timestamps: compile time (can be faked)
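The section check from the PE points can be sketched as a simple heuristic over (name, characteristics) pairs taken from the section headers. The flag values are the standard IMAGE_SCN_MEM_EXECUTE and IMAGE_SCN_MEM_WRITE bits; the list of "common" names and the function name flag_suspicious_sections are assumptions made for illustration, and a hit is only a hint that deserves a closer look, not proof of packing.

```python
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000

# Names commonly emitted by mainstream toolchains (an illustrative, incomplete set).
COMMON_NAMES = {".text", ".data", ".rdata", ".idata", ".edata",
                ".rsrc", ".reloc", ".pdata", ".bss", ".tls"}


def flag_suspicious_sections(sections):
    """Return (name, reasons) for each section with an odd name or W+X permissions."""
    findings = []
    write_exec = IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_WRITE
    for name, characteristics in sections:
        reasons = []
        if name not in COMMON_NAMES:
            reasons.append("unusual name")
        # A section that is both writable and executable is typical of unpacking stubs.
        if (characteristics & write_exec) == write_exec:
            reasons.append("writable and executable")
        if reasons:
            findings.append((name, reasons))
    return findings


# Synthetic section table: a normal code section, a packer-style section, resources.
sample = [(".text", 0x60000020), ("UPX0", 0xE0000080), (".rsrc", 0x40000040)]
for name, reasons in flag_suspicious_sections(sample):
    print(name, reasons)
```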
ELF:
- DYNAMIC section shows library dependencies (DT_NEEDED entries)
- RUNPATH/RPATH can enable shared-library hijacking (the ELF analogue of DLL hijacking)
- Interpreter (ld-linux, named in PT_INTERP) can be changed
- Symbol visibility reveals intended API
Mach-O:
- Code signing (LC_CODE_SIGNATURE) - can be stripped
- Encryption (LC_ENCRYPTION_INFO) - App Store binaries encrypted
- Objective-C class list (valuable for iOS RE)