Software Security Notes 6 Disassembly
Static Disassembly
Load a binary
find machine instructions in binary
disassemble into human- or machine readable form
Recursive Disassembly
Starts from known entry points
Recursively follows control flow
Used in many reverse-engineering applications
Dynamic Disassembly
Runtime information can resolve indirect calls, distinguishing data vs code
It allows for execution tracers to dump instructions, memory/register contents
Code coverage problem
Code coverage
test suites
Use known test inputs, manually developed, toincrease code coverage.
Trying to cover as much of the program’s functionality as possible.
Ready-made test suites aren’t always available.
Application specific.
Fuzzing
Automatically generate inputs
Favoring executing lots of tests to heavy duty analysis
Generation-based fuzzer
Mutation-based fuzzer
Symbolic Execution
Execute not with concrete values but symblic values
One exection path will generate a set of constraints
Path explosion
Structuring Disassembled Code and Data
Compartmentalizing
- Breaking code into logically connected chunks and make it easier to understand the relationship between chunks.
Revealing control flow - Some structures can reveal control flow. Especially in visual representation, it can make it easier to see how control flows through the code and to get a quick idea of what the code does.
Functions
logically connected codes
function detection
- binaries can be stripped
- code might be scattered
- overlapping code blocks
- Assume functions are contiguous and don’t share code
Based on function signatures:
- well-known patterns and epilogues
- vary depending on the platform, compiler and optimization level used.
Use $call$ for function so easy to locate.
Indirect and tail-call function
Control Flow Graph
CFG organize the inernals of a function
automated analysis, manual analysis
graphic representation
basic blocks: 1st instruction is the only entry point, last instruction is the only exit point
call edges are not part of CFG
Call Graph
show relationship between call sites and functions rather than basic blocks
indirect call not shown in call graph
direct call: call the specific funtion or address
indirect call: call the address stored in a register.