Here are some tools for program analysis. I strongly recommend implementing algorithms on top of them.
- LLVM: the Clang Static Analyzer is an awesome tool for analyzing C/C++ at the AST level, and an LLVM Pass provides a lower-level IR to analyze.
- Soot: tools for Java program analysis and optimization
- BAP: Binary Analysis Platform, written in OCaml. We can inspect its custom IR (BIL) to analyze different binaries.
- Angr: Binary analysis and symbolic execution
- SVF: Program Analysis Framework based on LLVM
- Infer: Source code static analyzer written in OCaml
- CWE-Checker: Binary analysis based on BAP
- Klee: Symbolic Execution based on LLVM
Normally, the basics include a dataflow analysis framework (reaching definitions, interval analysis, ...), pointer analysis (Andersen's and Steensgaard's), and abstract interpretation (sign analysis). You are also encouraged to learn discrete math to understand the notation in textbooks.
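To make the dataflow part concrete, here is a minimal sketch of reaching definitions computed with a worklist. The CFG encoding (`blocks` mapping a block name to a list of `(def_id, var)` pairs, and `succs` mapping a block to its successors) is a made-up representation for illustration, not the format of any tool listed above.

```python
from collections import defaultdict

def reaching_definitions(blocks, succs):
    """blocks: {name: [(def_id, var), ...]}, succs: {name: [successor names]}.

    Both encodings are hypothetical and chosen only for this sketch.
    """
    # All definition ids for each variable, across the whole program.
    defs_of = defaultdict(set)
    for stmts in blocks.values():
        for def_id, var in stmts:
            defs_of[var].add(def_id)

    # gen: the last definition of each variable in the block.
    # kill: every definition of any variable the block redefines.
    gen, kill = {}, {}
    for name, stmts in blocks.items():
        g, k = set(), set()
        for def_id, var in stmts:
            g = (g - defs_of[var]) | {def_id}
            k |= defs_of[var]
        gen[name], kill[name] = g, k

    preds = defaultdict(list)
    for n, ss in succs.items():
        for s in ss:
            preds[s].append(n)

    in_sets = {n: set() for n in blocks}
    out_sets = {n: set() for n in blocks}
    worklist = list(blocks)
    while worklist:
        n = worklist.pop()
        # Join: union of the OUT sets of all predecessors.
        in_sets[n] = set().union(*(out_sets[p] for p in preds[n]))
        # Transfer: OUT = gen U (IN - kill).
        new_out = gen[n] | (in_sets[n] - kill[n])
        if new_out != out_sets[n]:
            out_sets[n] = new_out
            worklist.extend(succs.get(n, []))
    return in_sets, out_sets

# d1: x = ...; d2: y = ...   (entry)
# d3: x = ...                (loop, with a back edge to itself)
blocks = {"entry": [("d1", "x"), ("d2", "y")], "loop": [("d3", "x")], "exit": []}
succs = {"entry": ["loop"], "loop": ["loop", "exit"], "exit": []}
in_sets, out_sets = reaching_definitions(blocks, succs)
print(sorted(in_sets["loop"]), sorted(in_sets["exit"]))
```

The same loop works for other forward analyses by swapping the join and transfer functions, which is roughly what a dataflow framework generalizes.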
Courses:
- UW CSE 501: Personally recommended; the content is more compact.
- CMU CS17-355: Some slides are missing, and the content focuses more on security. I recommend using the notes here together with the slides from UW.
- CMU CS15-414: Related to model checking.
- IOWA CS513X: The topic is static analysis, covered in slightly more depth.
Books:
- SPA Book: Personally recommended. This book is static-analysis oriented. It also provides an analyzer for a toy language. The pseudo-code and syntax are better than in PPA.
- Principles of Program Analysis: Old-school book. The syntax is abstract and might be too hard to understand.
- Program slicing: For the values your analyzer is interested in, we can slice the program to find the parts of the program that affect those values.
- Shape Analysis:
- Shape Analysis by WISC: Introduces shape analysis for the heap
- Shape Analysis and Applications by UT
- Abstract Interpretation: 16.399 by MIT
- Abstract Machine: Abstract machines primarily describe the exact execution of a program
- [Abstract machines for programming language implementation](http://www.inf.ed.ac.uk/teaching/courses/lsi/diehl_abstract_machines.pdf)
- Abstracting Abstract Machines: The name is so abstract
- Abstracting Definitional Interpreters: Solid foundation of semantics
- Analyzing Memory Accesses in x86 Executables: Introduces value-set analysis. This analysis uses an abstract domain to represent an over-approximation of the set of values that each data object can hold at each program point.
- Decompile:
- Reverse Compilation Techniques: This book is awesome, all about decompiling from frontend to backend!!!
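
As a small illustration of the program slicing entry above, here is a rough sketch of a backward slice over straight-line code. It assumes a hypothetical encoding where each statement is a `(defined_var, used_vars)` pair, and it ignores control flow, pointers, and calls, all of which a real slicer has to handle.

```python
def backward_slice(stmts, criterion_vars):
    """Return indices of statements that may affect criterion_vars at the end.

    stmts: list of (defined_var, used_vars) in program order.
    This encoding is hypothetical and covers straight-line code only.
    """
    relevant = set(criterion_vars)
    kept = []
    # Walk backwards, tracking which variables still need an explanation.
    for idx in range(len(stmts) - 1, -1, -1):
        target, uses = stmts[idx]
        if target in relevant:
            # This definition explains `target`; its operands become relevant.
            relevant.discard(target)
            relevant.update(uses)
            kept.append(idx)
    return sorted(kept)

program = [
    ("a", []),          # 0: a = 1
    ("b", []),          # 1: b = 2
    ("c", ["a"]),       # 2: c = a + 1
    ("d", ["b"]),       # 3: d = b * 2
    ("e", ["c", "a"]),  # 4: e = c + a
]
print(backward_slice(program, ["e"]))  # statements 1 and 3 are sliced away
```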