Doc: update readme structure (#413)

DavidKorczynski · web-flow · commit ed2c96f19810 · 2022-07-25T23:01:47.000+01:00
* doc: refine and simplify

Removes a lot of the data on the front README to increase attention on
OSS-fuzz list of projects and modularises the documentation.
diff --git a/README.md b/README.md
@@ -5,252 +5,26 @@ and identify any potential blockers. Fuzz introspector aggregates the fuzzers’
 hit frequency, entry points, etc to give the developer a birds eye view of their fuzzer. This helps with 
 identifying fuzz bottlenecks and blockers and eventually helps in developing better fuzzers.
 
-Fuzz-introspector can on a high-level guide on how to improve fuzzing of a project by guiding on whether you should:
+Fuzz-introspector aims to improve fuzzing experience of a project by guiding on whether you should:
 - introduce new fuzzers to a fuzz harness
 - modify existing fuzzers to improve the quality of your harness.
 
-By and large these capabilities will remain the goals of fuzz-introspector. The focus is on improving these.
-
-**Video demonstration:** A video demonstration of fuzz-introspector is given [here](https://www.youtube.com/watch?v=cheo-liJhuE)
-
-**Sample OSS-Fuzz reports:** OSS-Fuzz supports fuzz introspector and maintains a list of fuzz introspector reports for the OSS-Fuzz projects. This list is available [here](https://oss-fuzz-introspector.storage.googleapis.com/index.html) and is a good reference if you're interested in seeing fuzz introspector in action. 
-
-**Try yourself:**
-- [Use with OSS-Fuzz](#testing-with-oss-fuzz)
-- [Use without OSS-Fuzz](#testing-without-oss-fuzz-integration)
+## Documentation and samples
+- [Sample OSS-Fuzz reports](https://oss-fuzz-introspector.storage.googleapis.com/index.html). [OSS-Fuzz](https://github.com/google/oss-fuzz) supports Fuzz Introspector and maintains a list of reports.
+- [Video demonstration](https://www.youtube.com/watch?v=cheo-liJhuE)
+- [List of Case studies](doc/CaseStudies.md)
+- [Screenshots](doc/ExampleOutput.md)
+- [Feature list](doc/Features.md)
+- Try yourself:
+  - [Use with OSS-Fuzz](oss_fuzz_integration#build-fuzz-introspector-with-oss-fuzz) (Recommended)
+  - [Use without OSS-Fuzz](doc/LocalBuild.md)
 
 ## Architecture
 The workflow of fuzz-introspector can be visualised as follows:
 ![Functions table](/doc/img/fuzz-introspector-architecture.png)
 
 A more detailed description is available in [doc/Architecture](/doc/Architecture.md)
 
-## Features
-**High-level features**
-
-- Show fuzzing-relevant data about each function in a given project
-- Show reachability of fuzzer(s)
-- Integrate seamlessly with OSS-Fuzz
-- Show visualisations to enable fuzzer debugging
-- Give suggestions for how to improve fuzzing
-
-**Concrete features**
-
-For each function in a project and a fuzzing harness for the project:
- - show the number of fuzzers (fuzz drivers) that reach this function
- - show cyclomatic complexity of the function
- - show the amount of functions reached by the function
- - show the sum of cyclomatic complexity of all functions reachable by the function
- - show the number of (LLVM IR) basic blocks in the function
- - show the function call-depth of the function
- - show the total unreached complexity of this function, including the complexity from all unreached functions reached by this function.
-
-Given a fuzz harness for a project show:
- - which functions in the project are not reachable by the harness
- - which functions in the project are reachable by harness
- - identify which functions are the best to target (based on which unreached function reaches most code)
-
-Given a fuzz harness statically analyse the code and merge it with run-time coverage information to:
- - visualise statically-extracted calltree of each fuzzer and overlay this calltree with run-time coverage information
- - identify nodes in the statically-extracted calltree where fuzzers are blocked based on run-time coverage information
- - automatically highlight fuzz-blockers, namely locations in the code where fuzzers are not able to continue execution at run-time despite the code being reachable by the fuzzer based on static analysis.
-
-## Testing with OSS-Fuzz
-The recommended way of testing this project is by way of OSS-Fuzz. Please see
-[OSS-Fuzz instructions](oss_fuzz_integration/) on how to do this. It is the
-recommended way because it's by way of OSS-Fuzz that we currently support combining
-runtime coverage data with the compiler plugin.
-
-
-## Testing without OSS-Fuzz integration
-You can build and run fuzz introspector outside the OSS-Fuzz environment.
-We use this mainly to develop the LLVM LTO pass as compilation of clang goes
-faster (recompilation in particular). However, for the full experience we 
-recommend working in the OSS-Fuzz environment as described above.
-
-A complication with testing locally is that the full end-to-end process of
-both (1) building fuzzers; (2) running them; (3) building with coverage; and
-(4) building with introspector analysis, is better supported
-in the OSS-Fuzz environment.
-
-
-### Build locally
-
-#### TLDR:
-```bash
-git clone https://github.com/ossf/fuzz-introspector
-cd fuzz-introspector
-
-# Get python dependencies
-python3 -m venv .venv
-. .venv/bin/activate
-pip install -r requirements.txt
-
-# Build custom clang with Fuzz introspector LLVM pass
-./build_all.sh
-
-cd tests
-./build_simple_example.sh
-cd simple-example-0/web
-python3 -m http.server 8008
-```
-
-#### Use Docker
-
-Will use sources cloned to /your/path/to/source
-
-```
-docker build  -t "fuzz-introspector:Dockerfile" .
-docker run --rm -it -v /your/path/to/source:/src fuzz-introspector:Dockerfile
-
-```
-
-#### Full process
-
-
-##### step 1: Start a python venv
-```bash
-git clone https://github.com/ossf/fuzz-introspector
-
-# create virtual environment
-python3 -m venv .venv
-. .venv/bin/activate
-pip install -r requirements.txt
-```
-
-##### step 2: Build custom clang
-Fuzz-introspector relies on an LTO LLVM pass and this requires us to build a custom Clang where the LTO pass is part of the compiler tool chain (see https://github.com/ossf/fuzz-introspector/issues/57 for more details on why this is needed).
-
-To build the custom clang from the root of this repository:
-
-```bash
-mkdir build
-cd build
-
-# Build binutils
-apt install texinfo
-git clone --depth 1 git://sourceware.org/git/binutils-gdb.git binutils
-mkdir build
-cd ./build
-../binutils/configure --enable-gold --enable-plugins --disable-werror
-make all-gold
-cd ../
-
-# Build LLVM and Clang
-git clone https://github.com/llvm/llvm-project/
-cd llvm-project/
-
-# Patch Clang to run fuzz introspector
-../../sed_cmds.sh
-cp -rf ../../llvm/include/llvm/Transforms/FuzzIntrospector/ ./llvm/include/llvm/Transforms/FuzzIntrospector
-cp -rf ../../llvm/lib/Transforms/FuzzIntrospector ./llvm/lib/Transforms/FuzzIntrospector
-cd ../
-
-# Build LLVM and clang
-mkdir llvm-build
-cd llvm-build
-cmake -G "Unix Makefiles" -DLLVM_ENABLE_PROJECTS="clang;compiler-rt"  \
-      -DLLVM_BINUTILS_INCDIR=../binutils/include \
-      -DLLVM_TARGETS_TO_BUILD="X86" ../llvm-project/llvm/
-make llvm-headers
-make -j5
-```
-
-##### step 3: Run local example
-
-Now we have two options, to run the fuzz introspector tools without collecting
-runtime coverage and doing it with collecting coverage. We go through each of the two options:
-
-##### step 3, option 1, only static analysis
-After having built the custom clang above, you build a test case:
-```
-# From the root of the fuzz-introspector repository
-cd tests/simple-example-0
-
-# Run compiler pass to generate *.data and *.data.yaml files
-mkdir work
-cd work
-FUZZ_INTROSPECTOR=1 ../../../build/llvm-build/bin/clang -fsanitize=fuzzer -flto -g ../fuzzer.c -o fuzzer
-
-# Run post-processing to analyse data files and generate HTML report
-python3 ../../../post-processing/main.py correlate --binaries_dir=.
-python3 ../../../post-processing/main.py report --target_dir=. --correlation_file=./exe_to_fuzz_introspector_logs.yaml
-
-# The post-processing will have generated various .html, .js, .css and .png fies,
-# and these are accessible in the current folder. Simply start a webserver and 
-# navigate to the report in your local browser (localhost:8008):
-python3 -m http.server 8008
-```
-
-
-##### step 3, option 2, include runtime coverage analysis
-```
-# From the root of the fuzz-introspector repository
-cd tests/simple-example-0
-
-# Run compiler pass to generate *.data and *.data.yaml files
-mkdir work
-cd work
-
-# Run script that will build fuzzer with coverage instrumentation and extract .profraw files
-# and convert those to .covreport files with "llvm-cov show"
-../build_cov.sh
-
-# Build fuzz-introspector normally
-FUZZ_INTROSPECTOR=1 ../../../build/llvm-build/bin/clang -fsanitize=fuzzer -flto -g ../fuzzer.c -o fuzzer
-
-# Run post-processing to analyse data files and generate HTML report
-python3 ../../../post-processing/main.py correlate --binaries_dir=.
-python3 ../../../post-processing/main.py report --target_dir=. --correlation_file=./exe_to_fuzz_introspector_logs.yaml
-
-# The post-processing will have generated various .html, .js, .css and .png fies,
-# and these are accessible in the current folder. Simply start a webserver and
-# navigate to the report in your local browser (localhost:8008):
-python3 -m http.server 8008
-```
-
-You can also use the `build_all_projects.sh` and `build_all_web_only.sh` scripts to control
-which examples you want to build as well as whether you want to only build the web data.
-
-
-## Output
-
-The output of the introspector is a HTML report that gives data about your fuzzer. This includes:
-
-- An overview of reachability by all fuzzers in the repository
-- A table with detailed information about each fuzzer in the repository, e.g. number of functions reached, complexity covered and more.
-- A table with overview of all functions in the project. With information such as 
-  - Number of fuzzers that reaches this function
-  - Cyclomatic complexity of this function and all functions reachable by this function
-  - Number of functions reached by this function
-  - The amount of undiscovered complexity in this function. Undiscovered complexity is the complexity *not* covered by any fuzzers.
-- A call reachability tree for each fuzzer in the project. The reachability tree shows the potential control-flow of a given fuzzer
-- An overlay of the reachability tree with coverage collected from a fuzzer run. 
-- A table giving summary information about which targets are optimal targets to analyse for a fuzzer of the functions that are not being reached by any fuzzer.
-- A list of suggestions for new fuzzers (this is super naive at the moment).
-
-### Example output
-
-Here we show a few images from the output report:
-
-Project overview:
-
-![project overview](/doc/img/project_overview.png)
-
-
-Table with data of all functions in a project. The table is sortable to make enhance the process of understanding the fuzzer-infrastructure of a given project:
-
-![Functions table](/doc/img/functions_overview.png)
-
-Reachability tree with coverage overlay
-
-![Overlay 1](/doc/img/overlay-1.png)
-
-
-Reachability tree with coverage overlay, showing where a fuzz-blocker is occurring
-![Overlay 2](/doc/img/overlay-2.png)
-
-
 ## Contribute
 ### Code of Conduct
 Before contributing, please follow our [Code of Conduct](CODE_OF_CONDUCT.md).
diff --git a/doc/ExampleOutput.md b/doc/ExampleOutput.md
@@ -0,0 +1,25 @@
+# Example output
+Here we maintain a list of images and descriptions of the output
+from Fuzz Introspector. This is mainly used to give potential users
+a quick insight of what you can get from Fuzz Introspector. For
+full examples we recommend inspecting OSS-Fuzz reports [here](https://oss-fuzz-introspector.storage.googleapis.com/index.html)
+
+Here we show a few images from the output report:
+
+Project overview:
+
+![project overview](/doc/img/project_overview.png)
+
+
+Table with data of all functions in a project. The table is sortable to make enhance the process of understanding the fuzzer-infrastructure of a given project:
+
+![Functions table](/doc/img/functions_overview.png)
+
+Reachability tree with coverage overlay
+
+![Overlay 1](/doc/img/overlay-1.png)
+
+
+Reachability tree with coverage overlay, showing where a fuzz-blocker is occurring
+![Overlay 2](/doc/img/overlay-2.png)
+
diff --git a/doc/Features.md b/doc/Features.md
@@ -0,0 +1,47 @@
+# Features
+**High-level features**
+
+- Show fuzzing-relevant data about each function in a given project
+- Show reachability of fuzzer(s)
+- Integrate seamlessly with OSS-Fuzz
+- Show visualisations to enable fuzzer debugging
+- Give suggestions for how to improve fuzzing
+
+**Concrete features**
+
+For each function in a project and a fuzzing harness for the project:
+ - show the number of fuzzers (fuzz drivers) that reach this function
+ - show cyclomatic complexity of the function
+ - show the amount of functions reached by the function
+ - show the sum of cyclomatic complexity of all functions reachable by the function
+ - show the number of (LLVM IR) basic blocks in the function
+ - show the function call-depth of the function
+ - show the total unreached complexity of this function, including the complexity from all unreached functions reached by this function.
+
+Given a fuzz harness for a project show:
+ - which functions in the project are not reachable by the harness
+ - which functions in the project are reachable by harness
+ - identify which functions are the best to target (based on which unreached function reaches most code)
+
+Given a fuzz harness statically analyse the code and merge it with run-time coverage information to:
+ - visualise statically-extracted calltree of each fuzzer and overlay this calltree with run-time coverage information
+ - identify nodes in the statically-extracted calltree where fuzzers are blocked based on run-time coverage information
+ - automatically highlight fuzz-blockers, namely locations in the code where fuzzers are not able to continue execution at run-time despite the code being reachable by the fuzzer based on static analysis.
+
+# Output
+Screenshots from Fuzz Introspector is is available [here](doc/ExampleOutput.md)
+
+The output of the introspector is a HTML report that gives data about your fuzzer. This includes:
+
+- An overview of reachability by all fuzzers in the repository
+- A table with detailed information about each fuzzer in the repository, e.g. number of functions reached, complexity covered and more.
+- A table with overview of all functions in the project. With information such as 
+  - Number of fuzzers that reaches this function
+  - Cyclomatic complexity of this function and all functions reachable by this function
+  - Number of functions reached by this function
+  - The amount of undiscovered complexity in this function. Undiscovered complexity is the complexity *not* covered by any fuzzers.
+- A call reachability tree for each fuzzer in the project. The reachability tree shows the potential control-flow of a given fuzzer
+- An overlay of the reachability tree with coverage collected from a fuzzer run.
+- A table giving summary information about which targets are optimal targets to analyse for a fuzzer of the functions that are not being reached by any fuzzer.
+- A list of suggestions for new fuzzers (this is super naive at the moment).
+
diff --git a/doc/LocalBuild.md b/doc/LocalBuild.md