Software Security Notes 1 (Intro, Compilation)
Fundamentals of binary analysis:
- binary format
- disassembly
- code injection
Exploitation techniques:
- stack-based exploitation
- heap exploitation
Vulnerability discovery:
- binary instrumentation
- fuzzing
- symbolic execution
Binary analysis:
program -> machine code
Security gap: what the high-level program appears to do vs. what the machine code really does
Focus on Linux binaries (ELF) and x86 assembly.
Static analysis:
Analyses the binary without running it.
Platform independent.
Less precise. Generally undecidable.
Dynamic analysis:
Runs the binary and analyses it as it executes.
More precise. Access to the entire system state.
May miss some code: it sees only particular runs, not all possible executions.
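The coverage gap of dynamic analysis can be sketched with a toy analogy. This is Python rather than a real binary, and check, static_lines and executed_lines are hypothetical names made up for this note. The static view lists every line that has bytecode; the dynamic view records only the lines one concrete run reaches.

```python
import dis
import sys

def check(password):
    if password == "s3cret":
        return "admin shell"  # only reached for one specific input
    return "denied"

def static_lines(func):
    """Static view: every body line that has bytecode, found without running func."""
    code = func.__code__
    first = code.co_firstlineno
    return {ln - first for _, ln in dis.findlinestarts(code)
            if ln is not None and ln > first}

def executed_lines(func, *args):
    """Dynamic view: body lines actually executed during one concrete run."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return seen

if __name__ == "__main__":
    # Line offsets are relative to the "def check" line.
    missed = static_lines(check) - executed_lines(check, "guess")
    print("lines never seen in this run:", sorted(missed))
```

A run with the input "guess" never reaches the admin branch, so a purely dynamic view of that run does not know the branch exists. The static view lists it, but cannot tell which input triggers it.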
Challenges in binary analysis:
No symbolic information
No type information
No high-level abstraction
Mixed code and data
Location-dependent data and code
Loading and Executing:
Representation in-memory may differ from on-disk representation
Sets up a new process with its own virtual address space.
Maps an interpreter into virtual memory.
Transfers control to the interpreter.
On Linux, the interpreter is a shared library called ld-linux.so
Interpreter:
Loads binary into virtual address space.
Maps the required dynamic libraries into the virtual address space
Performs relocations if required
- usually with lazy binding
Transfers control to the entry point
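Both the interpreter path and the entry point mentioned above are recorded in the ELF file itself. A minimal sketch of reading them, assuming a 64-bit little-endian ELF and the standard header layouts (parse_elf64 is a hypothetical helper name, not a library function):

```python
import struct

ELF_MAGIC = b"\x7fELF"
PT_INTERP = 3  # program header type that holds the interpreter path

def parse_elf64(data):
    """Return the entry point and interpreter path of a 64-bit LE ELF image."""
    if data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:  # EI_CLASS == ELFCLASS64, EI_DATA == LSB
        raise ValueError("this sketch only handles 64-bit little-endian ELF")

    # ELF header fields that follow the 16-byte e_ident array.
    (e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
     e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = struct.unpack_from("<HHIQQQIHHHHHH", data, 16)

    interp = None
    for i in range(e_phnum):
        # Each program header describes one segment of the file.
        (p_type, p_flags, p_offset, p_vaddr, p_paddr,
         p_filesz, p_memsz, p_align) = struct.unpack_from(
            "<IIQQQQQQ", data, e_phoff + i * e_phentsize)
        if p_type == PT_INTERP:
            # The segment content is the NUL-terminated interpreter path.
            raw = data[p_offset:p_offset + p_filesz]
            interp = raw.rstrip(b"\x00").decode()
            break

    return {"entry": e_entry, "interp": interp}
```

Calling parse_elf64(open("/bin/ls", "rb").read()) on an x86-64 Linux system should report an ld-linux path as the interpreter; the exact path depends on the distribution. A statically linked binary has no PT_INTERP segment, so interp stays None.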
Compilation:
source code -> machine code
- preprocessing
- compilation
- assembly
- linking
GCC runs all four steps by default, but it can be told to stop after any of them.
Compiler:
Input: preprocessed source files
Output: assembly files
source code -> assembly
Performs optimization
-S flag stops after the compilation step
Default is AT&T syntax; use -masm=intel to change to Intel syntax
e.g. gcc -S -masm=intel example.c
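The way GCC can be stopped after each step can be sketched as a small helper that builds the command lines. The flags -E, -S, -c and -masm=intel are real GCC options; stage_commands, the output file names and the choice of Python are assumptions made for illustration. The script only prints the commands and does not run gcc.

```python
import shlex

# Flag that makes gcc stop after each stage, and the usual output suffix.
STAGES = [
    ("preprocessing", "-E", ".i"),
    ("compilation", "-S", ".s"),
    ("assembly", "-c", ".o"),
]

def stage_commands(source, intel_syntax=False):
    """Return {stage: argv} for stopping gcc after each stage of the pipeline."""
    base = source.rsplit(".", 1)[0]
    commands = {}
    for stage, flag, suffix in STAGES:
        argv = ["gcc", flag]
        if stage == "compilation" and intel_syntax:
            argv.append("-masm=intel")  # x86 only; the default is AT&T syntax
        argv += [source, "-o", base + suffix]
        commands[stage] = argv
    # No stop flag: gcc runs all four stages and links an executable.
    commands["linking"] = ["gcc", source, "-o", base]
    return commands

if __name__ == "__main__":
    for stage, argv in stage_commands("example.c", intel_syntax=True).items():
        print(f"{stage:14} {shlex.join(argv)}")
```

Each printed command can be run by hand to inspect the intermediate file it produces: example.i (preprocessed source), example.s (assembly), example.o (object file).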