Fundamentals of binary analysis:

  • binary format
  • disassembly
  • code injection

Exploitations techniques:

  • stack-based exploitation
  • heap exploitation

Vulnerability discovery:

  • binary instrumentaion
  • fuzzing
  • symbolic execution




Binary analysis:

program -> machine code
security gap
what high level program does VS what it really does

Focus on Linux binary (ELF) and x86 assembly.

Static analysis:

Analysing without running it.
platform independent
Less precise. Generally udecidable.

Dynamic analysis:

Runs the binary as it executes
More precise. Access to entire system states
May miss some code. See particular runs. Not all possible.

Challenges in binary analysis:

No symbolic information
No type information
No high-level abstraction
Mixed code and data
Location dependent data and code

Loading and Executing:

Representation in-memory may differ from on-disk representation
Setting up a new process, virtual address space.
Maps an interpreter into virtual memory
Transfer control to interpreter
In Linux, interpreter is a shared library called Id-linux.so

Interpretere:

Loads binary into virtual address space.
Maps required dynamic libraried into virtual address space
Relocation if required

  • usually lazy-binding
    Transfer control to the entry point




Compilation:

normal code -> machine code

  • preprocessing
  • compilation
  • assembly
  • linking

GCC does all by default, but any of the steps can be stopped.

Compiler:

Input preprocessed source files
Output Assembly files
normal code -> assembly
Optimization
-S flag to stop after compiler
Default is AT&T syntax, use -masm=intel to change to Intel syntax

e.g. gcc -S -masm=Intel example.c