From this assignment:
1. You will learn how to write an LLVM pass.
2. You will learn how to generate a control flow graph by analyzing basic blocks.
3. You will learn how to analyze a control flow graph using the LLVM API.
In the previous programming assignments on syntax analysis and semantic analysis, you worked with clang, the LLVM front-end for C, C++, and Objective-C.
The core of LLVM is the intermediate representation (IR), a low-level programming language similar to assembly, which abstracts away most details of both the high-level source language and the target machine. When a program is compiled with LLVM, the front-end first converts the source code into IR, analyses and optimizations are then run on the IR, and finally the back-end generates binary code for the target (e.g., x86 or ARM).
Optimizations are implemented as passes that traverse some portion of a program to either collect information or transform the program. For more details about LLVM Pass, check out Writing an LLVM Pass.
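To make the two kinds of passes concrete, here is a minimal, self-contained C++ sketch. It does not use the LLVM API: ToyFunction, ToyPass, CountInstructions, and RemoveNops are hypothetical names invented for illustration, and a function is modeled as a plain list of instruction names. The only convention borrowed from LLVM is that a pass reports whether it modified the code, as runOnFunction does in the legacy pass interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy program representation: a function is just a list of instruction
// names. This is NOT the LLVM API; every name here is hypothetical.
struct ToyFunction {
    std::string name;
    std::vector<std::string> instructions;
};

// A pass visits one function and reports whether it changed it.
struct ToyPass {
    virtual ~ToyPass() = default;
    virtual bool run(ToyFunction& f) = 0;
};

// Analysis pass: collects information and leaves the function untouched.
struct CountInstructions : ToyPass {
    std::size_t total = 0;
    bool run(ToyFunction& f) override {
        total += f.instructions.size();
        return false;  // nothing was modified
    }
};

// Transform pass: rewrites the function by dropping "nop" instructions.
struct RemoveNops : ToyPass {
    bool run(ToyFunction& f) override {
        const std::size_t before = f.instructions.size();
        std::vector<std::string> kept;
        for (const std::string& inst : f.instructions) {
            if (inst != "nop") kept.push_back(inst);
        }
        f.instructions = kept;
        return f.instructions.size() != before;  // true only if something was removed
    }
};
```

Running CountInstructions on a function leaves it unchanged and returns false, while RemoveNops returns true only when it actually removed an instruction.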
We will work with the IR and the LLVM Pass Framework for this assignment. More specifically, we will learn how to create a control flow graph (CFG) from LLVM IR and how to perform lightweight analysis on the CFG. Analyzing control flow graphs is an essential part of program optimization in a compiler.
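As a rough illustration of what "creating a CFG from basic blocks" means, the following self-contained C++ sketch builds an adjacency list from a list of toy blocks. It is not the assignment's solution and does not use LLVM: ToyBlock, ToyCFG, buildCFG, and countEdges are hypothetical names, and each block is assumed to carry the labels its terminator can jump to (none for a block ending in ret).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an IR basic block: a label plus the labels its
// terminator can jump to (empty for `ret`). Not an LLVM type.
struct ToyBlock {
    std::string label;
    std::vector<std::string> targets;
};

// CFG as an adjacency list: block label -> successor labels.
using ToyCFG = std::map<std::string, std::vector<std::string>>;

// Build the CFG by reading each block's terminator targets.
ToyCFG buildCFG(const std::vector<ToyBlock>& blocks) {
    ToyCFG cfg;
    // First create one node per block, so blocks without successors appear too.
    for (const ToyBlock& b : blocks) {
        cfg.emplace(b.label, std::vector<std::string>{});
    }
    // Then add one edge per terminator target.
    for (const ToyBlock& b : blocks) {
        for (const std::string& t : b.targets) {
            assert(cfg.count(t) && "branch target must be a known block");
            cfg[b.label].push_back(t);
        }
    }
    return cfg;
}

// A lightweight analysis: the number of edges in the CFG.
std::size_t countEdges(const ToyCFG& cfg) {
    std::size_t n = 0;
    for (const auto& entry : cfg) n += entry.second.size();
    return n;
}
```

For a simple loop with blocks entry, cond, body, and exit, buildCFG yields four nodes, and countEdges reports four edges (entry to cond, cond to body, cond to exit, body to cond).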
LLVM Version (IMPORTANT)
This assignment requires LLVM version 12.0.0. To check your LLVM version, please run
1. Convert the bubble.c C program (from the examples directory) to IR by running the following commands:
export LLVM_HOME="<the absolute path to llvm-project>"
clang -O0 -emit-llvm -c bubble.c
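You should now have a bubble.bc file, which contains the IR in binary (bitcode) format. You should also see a bubble.ll file, which contains the IR in human-readable format. If bubble.ll is missing, running llvm-dis bubble.bc will produce it from the bitcode file.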
Go ahead and take a look at the contents of those files, and especially try to understand the structure of the bubble.ll file.
2. Generate the CFG from the bubble.bc file by running the following commands (again, examples is the directory we provided for you):
opt -dot-cfg < bubble.bc
dot -Tpdf .bubbleSort.dot -o bubbleSortDetailed.pdf
opt -dot-cfg-only < bubble.bc
dot -Tpdf .bubbleSort.dot -o bubbleSort.pdf
Now, you should be able to view the CFG of the function bubbleSort() in bubble.c by looking at the generated PDFs. Do you notice the difference between the two PDFs (hint: look at the names of the PDFs)? Similarly, you can view the CFGs of other functions via the dot command (e.g., dot -Tpdf .printArray.dot -o printArray.pdf for the printArray function).
3. Create a directory named clang-hw3 in llvm-project/llvm/lib/Transforms for this assignment, and copy the files from the src directory to this new directory, as follows:
cp -r ./src "$LLVM_HOME/llvm/lib/Transforms/clang-hw3"
4. Append add_subdirectory(clang-hw3) to the $LLVM_HOME/llvm/lib/Transforms/CMakeLists.txt file.
5. Build clang-hw3 by running the following commands (you should do this every time you make changes):
6. The nodes or vertices of the CFG are called basic blocks. In the next two subsections, we will describe your tasks for this assignment; you may assume that for each function, exactly one basic block will have ret as its terminator instruction, whereas other basic blocks will have br as their terminator instruction.
You may also assume that br will have at most 2 successors.
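The assumptions above can be sketched as checks on a toy graph. This is a self-contained C++ illustration, not LLVM code and not the assignment's solution: Succs, matchesAssumptions, exitBlock, and reachableFromEntry are hypothetical names. A block with no successors stands for one ending in ret, a block with one or two successors stands for one ending in br, and block 0 is assumed to be the entry block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: succ[i] lists the successor block indices of block i.
// A block with no successors models one whose terminator is `ret`.
using Succs = std::vector<std::vector<std::size_t>>;

// True if the graph matches the stated assumptions: exactly one exit
// block, and every other block has one or two in-range successors.
bool matchesAssumptions(const Succs& succ) {
    std::size_t exits = 0;
    for (const auto& s : succ) {
        if (s.empty()) { ++exits; continue; }   // models `ret`
        if (s.size() > 2) return false;          // `br` has at most 2 successors
        for (std::size_t t : s) {
            if (t >= succ.size()) return false;  // dangling edge
        }
    }
    return exits == 1;
}

// Index of the unique exit block; call only when matchesAssumptions holds.
std::size_t exitBlock(const Succs& succ) {
    assert(matchesAssumptions(succ));
    for (std::size_t i = 0; i < succ.size(); ++i) {
        if (succ[i].empty()) return i;
    }
    return succ.size();  // not reached when the precondition holds
}

// Blocks reachable from block 0 (assumed to be the entry), via iterative DFS.
std::vector<bool> reachableFromEntry(const Succs& succ) {
    std::vector<bool> seen(succ.size(), false);
    if (succ.empty()) return seen;
    std::vector<std::size_t> stack;
    stack.push_back(0);
    seen[0] = true;
    while (!stack.empty()) {
        const std::size_t b = stack.back();
        stack.pop_back();
        for (std::size_t t : succ[b]) {
            if (!seen[t]) { seen[t] = true; stack.push_back(t); }
        }
    }
    return seen;
}
```

Knowing that there is a single exit block and at most two successors per block keeps traversals like this simple: every path that terminates ends in the same block, and each block contributes at most two outgoing edges.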