Previously, we mentioned that each LLVM backend has a separate directory with
most of its code in
llvm/lib/Target. Also, LLVM relies on CMake to generate
the build files for an actual build system like Make, Ninja, etc. In this post,
we will take a close look at some of the CMake build files.
NOTE: The code for the RISCW backend can be found here.
NOTE: Posts in this series:
- Getting Started
- Setting Up a New Backend
- Configuring the Build System
- Instruction Selection
- Arithmetic Instructions
Build Files in the Target Backend Directory
Recall that the RISCW backend has its own directory at
That directory, along with any subdirectory, has two build files, i.e.
LLVMBuild.txt, that describe the structure of the
backend module, provide linking information, etc. The RISCW backend is
currently very simple, so the directory structure looks like this:
|- Other C++, TableGen, etc files
| |- CMakeLists.txt
| |- LLVMBuild.txt
| |- Other C++, TableGen, etc files
|- Other C++, TableGen, etc files
CMakeLists.txt is nothing particularly special – every build system
relying on CMake would have them.1 These file simply contain CMake commands
to drive the generation of the build files. In LLVM backends the commands do
things like indicating the C++ source files to compile or generating TableGen
execution commands. The
LLVMBuild.txt files are descriptions of the
components in the current directory.
Lets look at the contents of
subdirectories = MCTargetDesc TargetInfo
type = TargetGroup
name = RISCW
parent = Target
has_asmprinter = 1
type = Library
name = RISCWCodeGen
parent = RISCW
required_libraries = AsmPrinter CodeGen Core MC RISCWDesc RISCWInfo
SelectionDAG Support Target
add_to_library_groups = RISCW
This file tells the build system that the RISCW backend has two components.
There is a top-level target called
RISCW of type
TargetGroup. This type
is used by backend and the build system actually has some special treatment for
Target is the parent component of
RISCW. Note that this naming
matches the directory structure of LLVM backends i.e.
RISCW name is not arbitrary, it must match the definitions in
the backend’s TableGen files.
LLVM backends can offer a range of optional features like assembly printing,
assembly parsing, etc. The
LLVMBuild.txt file indicates which features are
supported. For example, the last line tells the build system that the
backend only implements assembly printing.
NOTE: Take a look at the code from existing backends, like ARM or RISCV, to
see what other features they enable besides
LLVMBuild.txt file also defines a second component called
Library whose parent is
what libraries it requires, e.g. AsmPrinter, CodeGen, etc; missing libraries
give rise to linking errors when building LLVM.
Each subdirectory of
llvm/lib/Target/RISCW also contains a
file defining its own component with
RISCW as a parent.
Lets consider the contents of
tablegen(LLVM RISCWGenRegisterInfo.inc -gen-register-info)
tablegen(LLVM RISCWGenInstrInfo.inc -gen-instr-info)
# Other TableGen commands
# RISCWCodeGen should match with LLVMBuild.txt RISCWCodeGen
# Other files
# Should match with "subdirectories = MCTargetDesc TargetInfo" in LLVMBuild.txt
set command at the top of the file defines
RISCW.td. This file normally contains a few top-level definitions and pulls
in other TableGen definitions by including other files – we will look more
carefully at TableGen in later posts, but feel free to take a look at
RISCW.td in the repository.
We then see a few
tablegen commands in the CMake file. These tell the build
system that it should create the
RISCWGen*.inc files from
*.td files in
the backend using the TableGen utility. These
RISCWGen*.inc files are actually
conventional C++ code and you can find them in the build directory at
build/lib/Target/RISCW after compiling LLVM; they can sometimes be useful for
debugging if a little hard to read.
The next command, i.e.
add_llvm_target, specifies the C++ files to build in
the current directory, but not its subdirectories. The first argument to the
add_llvm_target command is the name of the target and should match the
component name defined in
CMakeLists.txt contains commands indicating the subdirectories
that CMake should look at. In the RISCW case, there are only two:
CMakeLists.txt in the subdirectories are similar, but
a lot simpler!
NOTE: The naming convention of the files in the backend is often important
and should obviously match the contents of the build files. For example, the
set(LLVM_TARGET_DEFINITIONS RISCW.td) CMake command requires that
exists! Make sure you double-check these because the build errors can be a