Introduction
CMake is a very popular build configuration generator for Fortran/C/C++ programs. Its true power lies in its basics, which are hard to find in a coherent form on the internet. This post, CMake part 1, aims to neatly cover the important concepts, syntax, and commands of CMake as a programming language. For each command, a reference to the manual is given for more details. I focus on installation, defining variables, if-conditions, loops, functions, and so on. I leave out the commands that create libraries and executables because they will be explained in the next post.
Installation
First check if you already have it installed. In a terminal/PowerShell run
cmake --version
If it is not there, you can download CMake for Windows, macOS, and Linux from here and install it.
On Windows, you can also install it using choco: open PowerShell as administrator and run
choco install cmake
On Linux, CMake may already be installed. If not, it is available through the distribution's package manager. For example, on Ubuntu, you can install it via
sudo apt-get install cmake
Examples structure
The examples in this post are run the way practical applications are. There is a project folder that contains a CMakeLists.txt file and a build folder.
--myProject
|
----- CMakeLists.txt
|
|
------ build
The CMake script is written in CMakeLists.txt. CMake is run from within the build folder by this command in a terminal:
cmake ..
or from anywhere in the file system:
cmake -S path/to/myProject -B path/to/build
CMake automatically detects CMakeLists.txt
and runs the commands in it.
Minimum requirement
CMake is an evolving language, and new features are added with every release. At the beginning of a CMake script, you can set the minimum CMake version that processes your script correctly with
cmake_minimum_required(VERSION 3.22)
Version 3.22 is the one I use for this post.
Project
Always set the name of the project in the script
project(LinearSystemSolver)
You can also set the project version and language: C, CXX (for C++), Fortran:
project(LinearSystemSolver VERSION 1.2.0 LANGUAGES CXX)
See the manual for project here.
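As a small sketch of what project() gives you, the script below prints a few of the variables that CMake fills in after the call. The project name and version are the ones used above; the script assumes a C++ compiler is available because CXX is requested.

```cmake
cmake_minimum_required(VERSION 3.22)
project(LinearSystemSolver VERSION 1.2.0 LANGUAGES CXX)

# project() fills in several variables that the rest of the script can use.
message(STATUS "Name:    ${PROJECT_NAME}")          # LinearSystemSolver
message(STATUS "Version: ${PROJECT_VERSION}")       # 1.2.0
message(STATUS "Major:   ${PROJECT_VERSION_MAJOR}") # 1
message(STATUS "Source:  ${PROJECT_SOURCE_DIR}")    # folder of this CMakeLists.txt
```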
Comment
Text that starts with # is treated as a comment, and CMake ignores it.
# This is a comment
cmake_minimum_required(VERSION 3.22)
project(LinearSystemSolver) # Another comment
A multi-line comment is created with #[[…]]:
#[[ this is
a long comment
for this code.]]
Script language
CMake is a dynamically-typed language like Python. A CMake script is composed of commands. Each command name is followed by parentheses that contain its arguments and keywords.
- Commands are case insensitive,
- Variables are case sensitive,
- Keywords are always written in upper case.
These two lines are the same:
command1(KEYWORD1 arg1)
COMMAND1(KEYWORD1 arg1)
Message
message
writes its parameter on screen like print()
in Python:
message("this is a message")
The message command has several modes for showing warnings, errors, and so on. Usually, the STATUS mode is used to inform users that a step has started or finished:
message(STATUS "The compilation is finished.")
Note that STATUS
is a CMake keyword and needs to be upper case.
Another important usage for message
is debugging a CMake script. You can write the value of any suspicious variable on the screen.
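A short sketch of a few message modes follows. The variable buildDir is a hypothetical name used only for illustration.

```cmake
set(buildDir "out")  # hypothetical variable to inspect

message(STATUS "Configuring step started")       # informational, prefixed with "-- "
message(WARNING "This option is experimental")   # prints a warning, processing continues
message(STATUS "buildDir is '${buildDir}'")      # quick debugging of a variable

if(NOT buildDir)
  message(FATAL_ERROR "buildDir must not be empty")  # would stop processing
endif()
```

With the value set above, the FATAL_ERROR branch is not taken; it is there to show how a script can stop when a required value is missing.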
See the manual for message here.
Normal variable
In CMake, there are no data types like char, integer, float, or class. All variables are strings (or text). A variable is defined or changed with the set command:
set(x hello)
set(y "hello")
Both x
and y
are set to hello
. hello
is a constant or value.
I prefer quotation marks for constants because
- they emphasize that the value is a constant,
- white space is handled correctly.
set(z "Hi there")
Without quotation marks, spaces create a list variable, as explained in the List section.
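The difference can be seen in a minimal sketch like this one, where the variable names are only illustrative:

```cmake
set(quoted "Hi there")   # one string that contains a space
set(unquoted Hi there)   # two arguments, stored as the list "Hi;there"

message("${quoted}")     # prints Hi there
message("${unquoted}")   # prints Hi;there
```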
The value of a variable is accessed with ${variable}
:
set(a "hi")
set(b ${a})
# b is "hi"
${a}
is expanded to hi
.
See the manual for set() here.
Variable operations
We can merge variables to create a new one:
set(myPath "/home")
set(myDir "projectA")
set(myFile "${myPath}/${myDir}/main.cpp")
message(${myFile}) # will print /home/projectA/main.cpp
You have to dereference a variable to use its content (if-condition
is an exception, but ignore it now.):
set(file1 "sample.h")
set(header file1) # This is a Mistake!
Here header is set to the text “file1”. We wanted it to be the text “sample.h”, so the code is fixed like this:
set(file1 "sample.h")
set(header ${file1})
Dereferencing can happen recursively:
set(a "Final")
set(b a)
message( b ) # shows b
message( ${b} ) # shows a
message( ${${b}} ) # shows Final
This example shows:
- CMake is funny,
- A variable stores only a string,
- Everything is a constant unless dereferenced to be treated as a variable (there are exceptions like if-condition and foreach),
- The concept of reference-to-reference or pointer-to-pointer,
- Why I like to set constants in quotations.
You may use tricks like this to build a class-like data set:
set(folder1-header "/folder1/a.h")
set(folder1-source "/folder1/a.cpp")
set(folder2-header "/folder2/b.h")
set(folder2-source "/folder2/b.cpp")
set(folder folder1)
# comment the above line
# and uncomment the below line, see the message
#set(folder folder2)
message(${${folder}-header})
message(${${folder}-source})
Unset variable
A variable can be cleared/deleted:
unset(a)
set(a) # set without value
We can check whether a variable is set to a non-false value (if(DEFINED a) tests only whether it exists) by
if (a)
# do something
endif()
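Keep in mind that if(a) is false not only for an unset variable but also for one that holds a false constant. A minimal sketch of the difference, using a hypothetical variable named flag:

```cmake
set(flag "OFF")

if(flag)
  message("flag is true")          # not reached: the value OFF is a false constant
endif()

if(DEFINED flag)
  message("flag is defined")       # reached: the variable exists
endif()

unset(flag)
if(NOT DEFINED flag)
  message("flag is not defined")   # reached after unset()
endif()
```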
if-condition
A condition is written similarly to other programming languages:
if (<condition1>)
# do something
elseif(<condition2>)
# do another thing
else()
# do default action
endif()
Some constant strings are translated to true/false:
- True constants: TRUE, 1, Yes, Y, ON, …
- False constants: FALSE, 0, NO, N, OFF, …
A variable name can be used as a condition to test whether it is set to a value that is not a false constant:
unset(a) # emphasizing a is not set
set(b "sample.cpp")
if(a)
# the code here is not run
endif()
if (b)
# the code here is run
endif()
Note: In the above example, the if-condition treats its argument as a variable name, so DO NOT use ${}. Otherwise, the variable is expanded first and the if-condition checks whether its value is itself a true constant or the name of a set variable.
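The following sketch shows how the two forms can give different answers. The variable names var and other are hypothetical.

```cmake
set(other "OFF")
set(var "other")

if(var)
  message("name form: true")      # reached, the value "other" is not a false constant
endif()

if(${var})
  message("expanded form: true")  # not reached, this is if(other) and other is OFF
endif()
```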
Conditions can be combined with AND and OR, negated with NOT, and grouped with parentheses. The most useful operator is STREQUAL, which checks whether two strings are equal:
set(a "book")
set(b "book")
if (a STREQUAL b) # true condition
message("they are equal.")
endif()
You could also write if (${a} STREQUAL ${b}), but I prefer if (a STREQUAL b) because this way we can have this rule:
Always use the name of a variable without ${} in conditions.
We can compare a variable with a constant:
set(a "book")
if (a STREQUAL "book") # a true condition
message("they are equal.")
endif()
Every variable is a string, but CMake can compare numbers with LESS, EQUAL, GREATER, and so forth.
if(1 EQUAL 01.0) # a true condition
message("1 equal to 01.0")
endif()
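As a sketch of a compound condition, the example below combines string and numeric comparisons. The variables compiler and jobs are hypothetical. The second condition uses VERSION_LESS, which compares dotted version numbers component by component rather than as plain numbers.

```cmake
set(compiler "gcc")
set(jobs 4)

if((compiler STREQUAL "gcc" OR compiler STREQUAL "clang") AND NOT jobs LESS 2)
  message("parallel build with ${compiler}")  # reached for these values
endif()

if("3.10" VERSION_LESS "3.9")
  message("not reached")   # compared as versions, 3.10 is newer than 3.9
endif()
```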
See the manual for if() here.
List
A list is defined with set
as well, with space separation of items:
set(myFiles a.cpp b.cpp a.h)
Or in quotations with ;
separation:
set(myFiles "sample1.h;sample2.h")
set(main "x.cpp")
set(myFiles "sample1.h;sample2.h;${main}")
The list command comes with a variety of keywords to modify a list, such as APPEND, POP_BACK, and REMOVE_ITEM. For example, the previous example could be rewritten as:
set(myFiles "sample1.h;sample2.h")
set(main "x.cpp")
list(APPEND myFiles ${main})
# So now myFiles="sample1.h;sample2.h;x.cpp"
When passing a list to add_* commands (e.g., add_library, add_executable), pass it as ${myFiles}
. Do NOT pass it as "${myFiles}"
because it will pass the list as a single string including semicolons.
You can try printing myFiles via:
message(${myFiles})
# sample1.hsample2.hx.cpp
All items are printed one after the other without the semicolon separators, indicating that the variable is treated as a list. To properly print a list, see the foreach loop section. However, if you write:
message("${myFiles}")
# sample1.h;sample2.h;x.cpp
The content of the variable is considered a single string, which is then printed.
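A few more list operations, sketched on the same kind of file list:

```cmake
set(myFiles "a.cpp;b.cpp;a.h")

list(LENGTH myFiles count)         # count is 3
list(GET myFiles 0 first)          # first is a.cpp (indices start at 0)
list(REMOVE_ITEM myFiles "a.h")    # myFiles is now a.cpp;b.cpp
list(FIND myFiles "b.cpp" index)   # index is 1, or -1 when not found

message("${count} ${first} ${index} ${myFiles}")  # prints 3 a.cpp 1 a.cpp;b.cpp
```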
See the manual for list() here.
foreach loop
A numerical loop is defined as
foreach( i RANGE 1 5)
message(${i})
endforeach()
# It will print 1 2 3 4 5
Note that, in contrast to Python, the end of the range is included.
A list can be iterated as
set(names "Jack;Kate;Sara")
foreach(name IN LISTS names)
message(${name})
endforeach()
# It will print Jack Kate Sara
Notice that in foreach, the same as in an if-condition, we drop ${} from the variable.
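Two other forms of foreach are sketched below: a range with a step, and iteration over items written directly in the call. The loop variable names are arbitrary.

```cmake
# RANGE <start> <stop> <step>
foreach(i RANGE 0 10 5)
  message("${i}")          # prints 0, 5, 10 on separate lines
endforeach()

# IN ITEMS iterates over values written directly in the call
foreach(ext IN ITEMS h cpp)
  if(ext STREQUAL "h")
    continue()             # skip headers
  endif()
  message("*.${ext}")      # prints *.cpp
endforeach()
```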
See the manual for foreach here.
Cached Variable
The state of normal variables is lost after a cmake run. To overcome this, we have cached variables, which are written to the CMakeCache.txt file. Whenever we run the cmake command, they are loaded from that file. These variables aim to store user preferences on disk. Some examples of user preferences are:
- installation directory,
- build type (release or debug),
- special compiler flags,
- option to install some libraries.
They are created and set the very first time that cmake is run. They are defined with this template:
set(<variable> <value> CACHE <type> <docstring>)
An example is:
set(libAPath "/home/libA" CACHE PATH "info about libAPath for user")
On every later cmake run, the line above is ignored because libAPath is already in the cache.
The idea behind it is that /home/libA is the default value, and users are responsible for changing it to something that suits their needs. They can use the cmake -D flag, the ccmake command, or cmake-gui to modify cached variables.
<type> tells ccmake and cmake-gui what we are expecting to get from the user. Types are:
- FILEPATH: GUI shows a file selector dialog.
- PATH: GUI shows a directory selector dialog.
- STRING: GUI shows a textbox.
- BOOL: GUI shows a checkbox.
- INTERNAL: Hidden from GUI, for the developer.
A user can run the cmake-gui
from a terminal with
cmake-gui -S pathToSourceFolder -B pathToBuildFolder
You can also set a cached variable when running cmake with -D
flag:
cmake -D <var>:<type>=<value>
See this example:
cmake -D compilesModule1:BOOL=ON -S path/to/source -B path/to/build
Every time you set a variable with -D
, it overwrites the cached value.
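The life cycle of a cached variable can be tried with a small sketch like the one below. The project name CacheDemo and the variable greeting are hypothetical; LANGUAGES NONE is used so that no compiler is needed.

```cmake
cmake_minimum_required(VERSION 3.22)
project(CacheDemo LANGUAGES NONE)   # CacheDemo is a hypothetical project name

# Default used only when the cache has no entry for "greeting" yet
set(greeting "hello" CACHE STRING "Text printed at configure time")
message(STATUS "greeting = ${greeting}")

# First run:  cmake -S . -B build                 -> greeting = hello
# Later run:  cmake -D greeting=hi -S . -B build  -> greeting = hi
# Later run:  cmake -S . -B build                 -> greeting = hi (read from CMakeCache.txt)
```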
Besides set
, another way to create a boolean (ON/OFF) cache variable in a script is option
:
option(hasModule1 "info about this option" ON)
which is the same as this:
set(hasModule1 "ON" CACHE BOOL "info about this option")
While it is not recommended, we can also overwrite a cached variable from the script every time cmake
is run using FORCE
:
set(libAPath "/home/libA" CACHE PATH "some info" FORCE)
Sometimes, as developers, we want to store some variables on disk but do not want them to be changed by the user. Then we write
set(libAPath "/home/libA" CACHE INTERNAL "some info")
INTERNAL variables, like all cached variables, are global and accessible in every scope. FORCE is not necessary for them because they are always forced. Therefore, to work with them, we can write this:
if (NOT libAPath) # if it is not in the cache file
# set the default value
set(libAPath "/home/libA" CACHE INTERNAL "some info")
endif()
# work with libAPath
Never choose the same name for a cached and a normal variable unless you know what you are doing.
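To see why, here is a minimal sketch with a hypothetical variable called mode. The normal variable hides the cache entry of the same name, which is easy to overlook.

```cmake
set(mode "fromCache" CACHE STRING "hypothetical user preference")
set(mode "fromScript")           # normal variable with the same name

message("${mode}")               # prints fromScript: the normal variable hides the cache entry
message("$CACHE{mode}")          # reads the cache entry explicitly (CMake 3.13 or newer)
```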
See the manual for set(), option(), and flags of cmake executable.
String
With the string command you can find-and-replace, manipulate, and compare strings. You can even work with JSON strings. See the examples below:
string(TOUPPER "hello" a) # a is set to "HELLO"
string(LENGTH "hello" b) # b is "5"
string(SUBSTRING "hello" 2 3 c) # c is "llo"
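A few more string operations, as a sketch with made-up inputs:

```cmake
string(REPLACE ".cpp" ".o" obj "main.cpp")     # obj is "main.o"
string(FIND "hello" "ll" pos)                  # pos is "2" (-1 when not found)
string(REGEX MATCH "[0-9]+" digits "build42")  # digits is "42"
string(TOLOWER "Release" lower)                # lower is "release"

message("${obj} ${pos} ${digits} ${lower}")    # prints main.o 2 42 release
```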
See the manual for string here.
Math
A math expression is evaluated with this template:
math(EXPR output_variable math_expression)
For example:
math(EXPR x "5*(1+1-1)/5") # x will be 1
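Note that math works on integers only, so a division drops the fractional part. A short sketch:

```cmake
math(EXPR q "7 / 2")    # q is 3, the fractional part is dropped
math(EXPR r "7 % 2")    # r is 1
math(EXPR h "255" OUTPUT_FORMAT HEXADECIMAL)  # h is 0xff (CMake 3.13 or newer)

message("${q} ${r} ${h}")  # prints 3 1 0xff
```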
See the manual for math here.
File
With the file command you can
- read and write files,
- perform file system actions such as copy, remove, and rename files,
- upload or download files
- create or extract archives (zip, 7zip, …) and many more actions.
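As a sketch of the read and write actions, the script below creates a text file in the build directory, reads it back, and removes it. The file name notes.txt is hypothetical.

```cmake
# notes.txt is a hypothetical file name; it is written to the current build directory
set(notes "${CMAKE_CURRENT_BINARY_DIR}/notes.txt")

file(WRITE  "${notes}" "first line\n")    # create or overwrite the file
file(APPEND "${notes}" "second line\n")   # add to the end
file(READ   "${notes}" content)           # load the whole file into a variable

message("${content}")
file(REMOVE "${notes}")                   # delete the file again
```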
The keywords below are commonly used to get the list of files in the project:
- GLOB for getting the list of files in the directory of the current CMakeLists.txt,
- GLOB_RECURSE for getting the list of files in the current directory and all its subdirectories.
You have to set globbing expressions to find the desired files. The example below finds the files with the .h and .cpp extensions in the sub1 directory and its subdirectories. The results are stored in the myfiles variable:
file(GLOB_RECURSE myfiles LIST_DIRECTORIES false ${PROJECT_SOURCE_DIR}/sub1/*.cpp ${PROJECT_SOURCE_DIR}/sub1/*.h)
The last two terms are globbing expressions; you can add as many globbing expressions as you like.
Note that CMake does not recommend GLOB for collecting source files. For more info, see the manual for file() here.
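If you decide to glob anyway, the CONFIGURE_DEPENDS flag can reduce the main drawback, which is that newly added files are not noticed until cmake is run again. This is only a sketch; the manual warns that the flag may not work reliably with every generator.

```cmake
# CONFIGURE_DEPENDS (CMake 3.12 or newer) asks the build system to re-check
# the glob at build time and re-run CMake when the result changes.
file(GLOB_RECURSE myfiles CONFIGURE_DEPENDS
     "${PROJECT_SOURCE_DIR}/sub1/*.cpp"
     "${PROJECT_SOURCE_DIR}/sub1/*.h")

foreach(f IN LISTS myfiles)
  message(STATUS "found: ${f}")
endforeach()
```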
Function
A function in CMake is defined as
function(NameOfFunction arg1 arg2)
# body of function
endfunction()
Here is a function that prints its arguments:
function(print a b)
message("${a} ${b}")
endfunction()
print("March" "May")
The arguments are stored in the ARGV list, so for a function that accepts a varying number of arguments, we write:
function(print)
foreach(arg IN LISTS ARGV)
message(${arg})
endforeach()
endfunction()
print("March" "May" "June")
Variables set in a function are local to the scope of the function and are not accessible outside:
function(doSomething)
set(name "Sara")
endfunction()
doSomething()
if (NOT name)
message("name is not set!") # This line is reached
endif()
However, a function has access to a copy of the variables in the scope from which it is called, i.e. a copy of the variables in its parent scope:
set(name "Sara")
function(doSomething)
message(${name})
endfunction()
doSomething() # prints Sara
We say it has access to a copy of the parent scope because if you change a parent variable in the function, it will not change it in the parent scope:
set(name "Sara")
function(doSomething)
set(name "Jack")
message(${name})
endfunction()
doSomething() # Jack
message(${name}) # Sara
If you want to change it in the parent scope as well, you have to set the variable again with PARENT_SCOPE:
set(name "Sara")
function(doSomething)
# for local scope
set(name "Jack")
# for parent scope
set(name "Jack" PARENT_SCOPE)
message(${name})
endfunction()
doSomething() # Jack
message(${name}) # Jack
Now we can define a function that returns a variable to its parent:
function(findName outFullName first last)
set(${outFullName} "${first} ${last}" PARENT_SCOPE)
endfunction()
findName(fullName "Steve" "Jobs")
message(${fullName})
A function can be terminated with return()
command.
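A sketch of an early exit with return() follows. The function printIfDefined and the variables city and country are hypothetical names.

```cmake
function(printIfDefined varName)
  if(NOT DEFINED ${varName})
    message("${varName} is not defined")
    return()                       # leave the function early
  endif()
  message("${varName} = ${${varName}}")
endfunction()

set(city "Oslo")
printIfDefined(city)      # prints city = Oslo
printIfDefined(country)   # prints country is not defined
```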
See the manual for function here.
Macro
A macro is defined the same way as a function. However, while a function hides its content from the caller's scope, a macro pastes its content at the caller’s place. Therefore, the variables and commands defined in the macro are exposed to the caller's scope.
macro(setName)
set(name "Sara")
endmacro()
setName()
message(${name})
In a function, ARGV is a variable that contains the list of arguments. In a macro it is not a real variable; only the expression ${ARGV} is replaced with the arguments.
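In practice this means that a macro has to expand its arguments explicitly. A minimal sketch, with printAll as a hypothetical macro name:

```cmake
macro(printAll)
  # ARGV is not a variable inside a macro, so expand it with ${ARGV}
  foreach(arg IN ITEMS ${ARGV})
    message("${arg}")
  endforeach()
endmacro()

printAll("March" "May")   # prints March, then May
```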
Macro vs Function
Generally, a function is the first pick as it leads toward clean code and fewer bugs. A macro can be used for wrapping commands that make changes, such as setting variables, in the scope where they are called.
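The scoping difference can be sketched side by side. All names here are made up for illustration.

```cmake
function(setInFunction)
  set(fromFunction "visible only inside the function")
endfunction()

macro(setInMacro)
  set(fromMacro "visible to the caller")
endmacro()

setInFunction()
setInMacro()

if(NOT DEFINED fromFunction)
  message("fromFunction is not defined here")   # reached
endif()
message("${fromMacro}")                          # prints visible to the caller
```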
Manual for macro is here.
Special CMake variables
Any variable that starts with CMAKE_ is reserved for CMake, which populates many of them when a script is run.
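A handful of these variables can be printed for a quick look, as in this sketch:

```cmake
message(STATUS "CMake version:     ${CMAKE_VERSION}")
message(STATUS "Top source dir:    ${CMAKE_SOURCE_DIR}")
message(STATUS "Top build dir:     ${CMAKE_BINARY_DIR}")
message(STATUS "File being read:   ${CMAKE_CURRENT_LIST_FILE}")
message(STATUS "Target system:     ${CMAKE_SYSTEM_NAME}")   # set once project() has run
```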
add_subdirectory
In any subdirectory of a project, you can have a CMakeLists.txt
file. Imagine we have a file system like this
-- myProject
|
----- build
|
----- library1
| |
| ---- CMakeLists.txt
|
------ CMakeLists.txt
In library1/CMakeLists.txt we have
message("Hello from library1")
message(${CMAKE_CURRENT_SOURCE_DIR})
message(${PROJECT_SOURCE_DIR})
And in myProject/CMakeLists.txt we have these lines
project(myProject)
message("Hello from myProject")
message(${CMAKE_CURRENT_SOURCE_DIR})
message(${PROJECT_SOURCE_DIR})
add_subdirectory("library1")
When cmake is run, it sets the CMAKE_CURRENT_SOURCE_DIR variable to the myProject path. When it reaches the add_subdirectory line, it jumps into library1/CMakeLists.txt, sets CMAKE_CURRENT_SOURCE_DIR to the library1 path, and runs the commands there. Afterward, CMake comes back to the project scope and sets CMAKE_CURRENT_SOURCE_DIR back to the myProject path. The variables in the subdirectory scope are private and not visible to the project scope.
While CMAKE_CURRENT_SOURCE_DIR depends on the location of the CMakeLists.txt being processed, PROJECT_SOURCE_DIR is set to the folder of the CMakeLists.txt that contains the most recent project() command, which is the top-level folder in this example.
A subdirectory usually contains some source files that need to be compiled.
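The privacy of the subdirectory scope, and a way around it with PARENT_SCOPE, can be sketched with the two files below. The variables localNote and lib1Name are hypothetical.

```cmake
# --- library1/CMakeLists.txt ---
set(localNote "only visible inside library1")
set(lib1Name "library1" PARENT_SCOPE)   # export to the directory that called add_subdirectory

# --- myProject/CMakeLists.txt ---
cmake_minimum_required(VERSION 3.22)
project(myProject)
add_subdirectory("library1")

message("${lib1Name}")          # prints library1
if(NOT DEFINED localNote)
  message("localNote is private to library1")   # reached
endif()
```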
Include
We can include CMake scripts from another file with include. No new scope is created; it is as if the content of the file were pasted at the include() line.
Let’s create a file in the project folder, sample.txt, which contains
message("Hello from sample.txt")
In CMakeLists.txt
file include it as:
include("sample.txt")
It will write the message on the screen.
If the file has the .cmake extension, it is called a module and we don’t need to mention its extension. It is common to add the folder containing modules to the CMAKE_MODULE_PATH list variable, so that the include command automatically searches those folders for the mentioned module.
For example, let’s create a mymodules folder in the project directory. In that directory we put a sample.cmake module that contains
message("Hello from sample module")
So the file system will look like this
--myProject
|
----- build
|
----- mymodules
| |
| --- sample.cmake
|
----- CMakeLists.txt
Now in CMakeLists.txt
we can include the module as
project(myProject)
list(APPEND CMAKE_MODULE_PATH "${PROJECT_SOURCE_DIR}/mymodules")
include(sample)
Running cmake
you will see the message.
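Since include creates no new scope, a module is a natural place for shared functions. The sketch below assumes a hypothetical module greeting.cmake in the same mymodules folder.

```cmake
# --- mymodules/greeting.cmake (hypothetical module) ---
function(greet who)
  message("Hello, ${who}!")
endfunction()

# --- CMakeLists.txt ---
cmake_minimum_required(VERSION 3.22)
project(myProject)
list(APPEND CMAKE_MODULE_PATH "${PROJECT_SOURCE_DIR}/mymodules")
include(greeting)     # found through CMAKE_MODULE_PATH, the .cmake extension is omitted
greet("myProject")    # prints Hello, myProject!
```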
The difference between include and add_subdirectory is:
- include is used to add modules that may contain functions, macros, instructions to install packages, and so forth.
- add_subdirectory is used to add folders that contain source code to be compiled.
See the manual for include() here.
More on CMake
The part 2 and 3 of this series are