pybind11: call OpenMP function in Python


I want to use pybind11 to compile an OpenMP parallel code, then load and run it in Python. The code is written in C++.


I am using Ubuntu 21.04, GCC 10.3, CMake 3.20-rc2.

Install pybind11

Open a terminal, download pybind11 first:

git clone

Go into the downloaded directory and run

mkdir build
cd build
cmake ..
make check -j 4

These commands compile pybind11 and run its tests. All the tests should pass; if they do not, there is a problem with your system that needs to be fixed.

Set Headers

Compilers such as GCC and Clang search the directories listed in the CPATH environment variable for headers. We add the Python and pybind11 header paths to CPATH:

# Run in a terminal
export CPATH=/home/sorush/workspace/pybind11/include:/usr/include/python3.9/:$CPATH 

Change the pybind11 and Python include paths to yours. If you are not sure where the Python headers are, run

sudo find / -iname 'Python.h'

The output shows where the Python headers are located. If nothing is found, you need to install python3-dev:

# Ubuntu terminal
sudo apt install python3-dev

You can add the CPATH export to the ~/.bashrc file in Ubuntu so that CPATH is set whenever you open a new terminal.
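As an alternative to searching the whole filesystem, the interpreter can report where it expects its own headers. The snippet below is a minimal sketch using only the standard library; the variable names are illustrative, and the printed directory is what you would put in CPATH for the Python part.

```python
import os
import sysconfig

# Directory where this interpreter expects its C headers (Python.h) to live.
include_dir = sysconfig.get_paths()["include"]
header = os.path.join(include_dir, "Python.h")

print("Python include directory:", include_dir)
# False usually means the development headers (python3-dev) are not installed.
print("Python.h present:", os.path.isfile(header))
```

Run it with the same python3 you intend to import the module from, since each interpreter version has its own header directory.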


The C++ code with comments:

// example.cpp file
#include <pybind11/pybind11.h>
#include <omp.h> // OpenMP header
#include <unistd.h> // sleep() function

namespace py=pybind11;

// Each thread sleeps for 3 seconds, then adds its id to a shared sum
int sum_thread_ids() {
    int sum=0;
    #pragma omp parallel shared(sum)
    {
        // Keep every thread busy for 3 seconds
        sleep(3);
        // Only one thread at a time may update sum
        #pragma omp critical
        sum += omp_get_thread_num();
    }
    return sum;
}

PYBIND11_MODULE(example, m) {
    m.def("get_max_threads", &omp_get_max_threads, "Returns max number of threads");
    m.def("set_num_threads", &omp_set_num_threads, "Set number of threads");
    m.def("sum_thread_ids", &sum_thread_ids, "Adds the id of threads");
}

Note that omp_get_max_threads and omp_set_num_threads are provided by the OpenMP library (declared in omp.h), so they can be bound directly without a wrapper.


To compile the code, in a terminal run

c++ -O3 -Wall -std=c++11 -shared -fPIC example.cpp -o example$(python3-config --extension-suffix) -fopenmp

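This produces a shared library in the current directory. Its name is example followed by the suffix that python3-config --extension-suffix prints.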
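The suffix used in the compile command can also be queried from inside Python, which is a quick way to check the file name the interpreter will look for. This is a minimal sketch with the standard library; module_name simply mirrors the module name used in this post.

```python
import sysconfig

module_name = "example"  # name passed to PYBIND11_MODULE in this post

# Platform-specific suffix for compiled extension modules,
# the same value that python3-config --extension-suffix reports.
suffix = sysconfig.get_config_var("EXT_SUFFIX")
expected_file = module_name + suffix

print("Extension suffix:", suffix)
print("Expected file name:", expected_file)
```

If the compiled file in your directory has a different suffix, it was probably built against a different Python version than the one running this snippet.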

Python Path

To make the module accessible everywhere, add its directory to the Python path. Run the command below in a terminal or add it to ~/.bashrc:

export PYTHONPATH=/path/to/example/directory/:$PYTHONPATH
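To check that the PYTHONPATH change took effect, you can ask Python whether it can locate the module without importing it. The helper is_importable below is a hypothetical name introduced for this sketch; it relies only on importlib from the standard library.

```python
import importlib.util

def is_importable(module_name):
    """Return True if Python can locate module_name on its current search path."""
    return importlib.util.find_spec(module_name) is not None

if is_importable("example"):
    # origin is the path of the file Python would load
    print("example found at:", importlib.util.find_spec("example").origin)
else:
    print("example not found; check PYTHONPATH")
```

Remember that an export only affects terminals opened after it was run (or after ~/.bashrc was reloaded).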


Start Python in a terminal (for example with python3), then run the commands below one at a time:

import example

example.get_max_threads()
# My Python shows: 16

# Set number of threads <= max number of threads
example.set_num_threads(4)

# Call the function:
example.sum_thread_ids()
# After 3 seconds, shows: 6

The sum_thread_ids() function puts each thread to sleep for 3 seconds. If you run a CPU-intensive function instead, you can open another terminal and run htop to watch the CPU activity.
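One way to confirm that the threads really run concurrently is to measure the wall-clock time of the call: if every thread sleeps for 3 seconds in parallel, the whole call should take about 3 seconds regardless of the thread count. The timed helper below is a hypothetical name for this sketch, and the import is guarded so the script still runs when the module has not been built.

```python
import time

def timed(func, *args):
    """Call func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start

try:
    import example  # the module built in this post
except ImportError:
    example = None  # not built, or its directory is not on PYTHONPATH

if example is not None and hasattr(example, "sum_thread_ids"):
    example.set_num_threads(4)
    total, seconds = timed(example.sum_thread_ids)
    print(f"sum = {total}, took {seconds:.1f} s")
else:
    print("example module not available; build it first")
```

A time close to 3 seconds multiplied by the number of threads would instead suggest that the threads ran one after another.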


The Python Global Interpreter Lock (GIL) allows only one thread at a time to execute Python code. I don’t think this is a problem for the goal of this post, which is calling a multithreaded C++ function from Python. However, it can be problematic if Python code is executed inside a multithreaded C++ function.

