Using PoCL as the OpenCL backend for DPC++¶
SYCL is a programming model that enables single-source C++ development for heterogeneous computing. Compared to OpenCL, SYCL operates at a higher level of abstraction, and implementations can use varying backends for device offloading (e.g., OpenCL, Level Zero, and CUDA). It is worth noting that a SYCL implementation is not required to support OpenCL as a backend.
DPC++ is Intel’s implementation of SYCL that supports OpenCL. When the OpenCL backend is utilized, the DPC++ runtime translates SYCL API calls into corresponding OpenCL API calls and forwards them to the OpenCL runtime.
The toolchain flow, when PoCL is used as the OpenCL backend for DPC++, is as follows:
The DPC++ Clang++ frontend compiles the SYCL kernel into LLVM IR.
llvm-spirvis used to translate LLVM IR to SPIR-V.SPIR-V is ingested by PoCL, where it is translated back into LLVM IR.
PoCL applies additional transformations to the LLVM IR.
If using a CPU driver, PoCL leverages
llc(LLVM backend) to lower the kernel to machine code.
It should be pointed out that there are two versions of DPC++:
the Intel(R) oneAPI DPC++/C++ Compiler
the oneAPI DPC++/C++ Compiler.
The former is proprietary and thus distributed in binary form, whereas the latter is open-source.
This page covers the following steps:
How to obtain, install, and set up DPC++ (the proprietary or the open-source version)
How to build PoCL to support DPC++.
Verification with an example program.
Intel(R) oneAPI DPC++/C++ Compiler installation¶
DPC++ is available in various bundles. Installing the oneAPI Base Toolkit is the simplest way to install DPC++ and its dependencies.
Choose a suitable installer from:
https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html
Run the installer. The page above provides corresponding instructions for the selected installer. Pay attention to the default installation path and choose a suitable one if necessary.
The oneAPI Base Toolkit includes various components, some of which are not needed to run SYCL applications with PoCL.
For a minimal setup, pick:
Intel oneAPI DPC++ Library
Intel oneAPI DPC++/C++ compiler
Intel Distribution for GDB (Required by the compiler)
Intel oneAPI Threading Building Blocks (Required by the compiler)
Intel oneAPI Math Kernel Library (Useful, but not required here)
After installation, run the initialization script to set the environment variables:
source <path-to-oneapi-installation>/setvars.sh
Important
setvars.sh must be run in every new shell session unless added to .bashrc (or an equivalent).
Now, DPC++ should be set up. This can be verified by checking the available SYCL backends (In this example, Intel OpenCL was detected).:
sycl-ls
[opencl:cpu][opencl:0] Intel(R) OpenCL, AMD Ryzen Threadripper 2990WX 32-Core Processor OpenCL 3.0 (Build 0) [2024.18.12.0.05_160000]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO [23.43.027642]
The initialization script also adds the compiler to the PATH:
icpx --version
Intel(R) oneAPI DPC++/C++ Compiler 2025.0.4 (2025.0.4.20241205)
oneAPI DPC++/C++ Compiler installation¶
The sources for the open-source DPC++ compiler can be obtained from the DPC++ repository.
Official detailed instructions can be found here.
The build process is managed using two Python scripts: configure.py and compile.py, which handle most of the heavy lifting.
The configure.py is essentially a wrapper for CMake, so checking its contents can provide further details.
For a basic setup, run:
git clone git@github.com:intel/llvm
cd llvm
python3 ./buildbot/configure.py -o <path-to-dpcpp-installation>
python3 ./buildbot/compile.py -o <path-to-dpcpp-installation> -j <number-of-threads>
After building, export the compiler and SYCL runtime library paths:
export PATH=<path-to-dpcpp-installation>/bin:$PATH
export LD_LIBRARY_PATH=<path-to-dpcpp-installation>/lib:$LD_LIBRARY_PATH
Note: The open-source DPC++ compiler driver is clang++, not icpx.
which clang++
<path-to-dpcpp-installation>/bin/clang++
Building PoCL for DPC++¶
PoCL doesn’t normally require llvm-spirv, but in this case, it is a strict
dependency because PoCL needs to convert the SPIR-V produced by DPC++ back to LLVM IR.
You must check out and build a version of llvm-spirv that corresponds to the LLVM
version PoCL uses as its kernel compiler. For example, if the PoCL kernel compiler uses
LLVM 18, then llvm-spirv should be checked out from the llvm_release_180 branch.
Note
DPC++ ships with its own llvm-spirv, which is typically based on the latest release.
However, this version is intended for internal usage by DPC++ and cannot be used by PoCL.
Warning
Although the versions of llvm-spirv used by DPC++ and PoCL do not have to be an exact
match, it is recommended to use versions that are reasonably close to each other.
Example PoCL build:
git clone git@github.com:pocl/pocl.git
cd pocl
mkdir build && cd build
cmake .. -DCMAKE_INSTALL_PREFIX=<path-to-installation-directory> -DLLVM_SPIRV=<path-to-llvm-spirv> -DWITH_LLVM_CONFIG=<path-to-llvm-config>
ninja install
To make PoCL visible to the ICD loader, either register the PoCL ICD (https://github.com/KhronosGroup/OpenCL-ICD-Loader#registering-icds)
or set the OCL_ICD_FILENAMES or OCL_ICD_VENDORS environment variables. OCL_ICD_VENDORS only works on Linux/Android,
whereas OCL_ICD_FILENAMES works on all platforms (see https://github.com/KhronosGroup/OpenCL-ICD-Loader#table-of-debug-environment-variables
for more information).
On Linux:
export OCL_ICD_VENDORS=<path-to-pocl-installation>/etc/OpenCL/vendors
Compiling with DPC++ using PoCL as the backend¶
If using proprietary DPC++, there is one additional step.
By default PoCL is blocked by the DPC++ runtime. To enable PoCL, we need to set the SYCL_DEVICE_ALLOWLIST
environment variable. This variable is a comma-separated list of parameters that the DPC++ runtime uses to select allowed devices.
It can be used quite flexibly. For example, to select only CPU devices:
export SYCL_DEVICE_ALLOWLIST="DeviceType:cpu"
To allow all available devices, use:
export SYCL_DEVICE_ALLOWLIST=""
To select only PoCL, you can use the PoCL vendor ID:
export SYCL_DEVICE_ALLOWLIST="DeviceVendorId:0x10006"
For more information about how to use the DPC++ environment variables, see:
https://github.com/intel/llvm/blob/sycl/sycl/doc/EnvironmentVariables.md
Now that everything is set up, verify that PoCL is detected:
sycl-ls
[opencl:cpu][opencl:0] Portable Computing Language, cpu-znver1-AMD Ryzen Threadripper 2990WX 32-Core Processor OpenCL 3.0 PoCL HSTR: cpu-x86_64-pc-linux-gnu-znver1
Below is a simple SYCL program to test the setup. It selects the device automatically, so this will drop the possible GPUs out of the list:
export ONEAPI_DEVICE_SELECTOR=opencl:cpu
// hello_nd_range.cpp
#include <sycl/sycl.hpp>
#include <iostream>
#define SUB_GROUP_SIZE 2
using namespace sycl;
int main() {
constexpr int global_size = 8;
constexpr int local_size = 4;
queue q;
{
q.submit([&](handler &h) {
std::cout << "One dimensional nd_range with global_size: " << global_size << ", local_size: " << local_size << ", sg_size: " << SUB_GROUP_SIZE << "\n";
range<1> global(global_size);
range<1> local(local_size);
nd_range<1> range(global, local);
h.parallel_for(range, [=](nd_item<1> idx) [[sycl::reqd_sub_group_size(SUB_GROUP_SIZE)]] {
int workgroup_id_x = idx.get_group(0);
int global_id_x = idx.get_global_id(0);
int local_id_x = idx.get_local_id(0);
int sg_local_id = idx.get_sub_group().get_local_id();
int sg_id = idx.get_sub_group().get_group_id();
sycl::ext::oneapi::experimental::printf("hello from: (global_id %d) (local_id: %d) (wg_id: %d) (sg_id: %d) (sg_local id: %d)\n",global_id_x, local_id_x,workgroup_id_x, sg_id, sg_local_id);
});
}).wait();
}
return 0;
}
Compile and run (use icpx for proprietary version, and clang++ for open-source version):
clang++ hello_nd_range.cpp -fsycl -o hello
./hello
One dimensional nd_range with global_size: 8, local_size: 4, sg_size: 2
hello from: (global_id 0) (local_id: 0) (wg_id: 0) (sg_id: 0) (sg_local id: 0)
hello from: (global_id 1) (local_id: 1) (wg_id: 0) (sg_id: 0) (sg_local id: 1)
hello from: (global_id 2) (local_id: 2) (wg_id: 0) (sg_id: 1) (sg_local id: 0)
hello from: (global_id 3) (local_id: 3) (wg_id: 0) (sg_id: 1) (sg_local id: 1)
hello from: (global_id 4) (local_id: 0) (wg_id: 1) (sg_id: 0) (sg_local id: 0)
hello from: (global_id 5) (local_id: 1) (wg_id: 1) (sg_id: 0) (sg_local id: 1)
hello from: (global_id 6) (local_id: 2) (wg_id: 1) (sg_id: 1) (sg_local id: 0)
hello from: (global_id 7) (local_id: 3) (wg_id: 1) (sg_id: 1) (sg_local id: 1)