The Data Parallel Essentials for Python workshop demonstrates how to write high-performance Python code that targets Intel XPUs. The talk introduces the basics of Numba and shows how to write parallel Python programs with it. It then introduces Numba-dppy, with examples of writing data-parallel code inside numba.jit-decorated functions and offloading it to a SYCL device, and of writing an explicit kernel with the @numba_dppy.kernel decorator. Numba-dppy is packaged as part of Intel Distribution for Python*, which is included with the Intel oneAPI AI Analytics Toolkit.
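As a rough illustration of the Numba basics mentioned above, the sketch below parallelizes a simple reduction with `numba.njit(parallel=True)` and `numba.prange`. The function name `sum_of_squares` is made up for this example and is not taken from the workshop material. The plain-Python fallback is also an assumption, added only so the snippet still runs where Numba is not installed.

```python
# Optional dependency: fall back to plain Python if Numba is missing.
try:
    from numba import njit, prange
except ImportError:
    prange = range

    def njit(*args, **kwargs):
        # No-op stand-in that supports both @njit and @njit(...).
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(parallel=True)
def sum_of_squares(n):
    # Hypothetical example function: sums i*i for i in [0, n).
    total = 0.0
    # With Numba, prange marks the loop as safe to split across threads,
    # and "total +=" is treated as a parallel reduction.
    for i in prange(n):
        total += i * i
    return total
```

Calling `sum_of_squares(1000)` compiles the function on first use when Numba is present. This shows only CPU-side parallelism; offloading to a SYCL device, as described in the abstract, additionally requires Numba-dppy.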
The talk also covers dpctl, a companion library intended to make it easier to write Python native extensions based on SYCL. dpctl provides Python bindings for the DPC++ runtime classes, an API for managing devices, and wrappers for the Unified Shared Memory (USM) allocators. These wrappers enable the creation of Python objects that use SYCL USM for data allocation.
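A minimal sketch of the two dpctl features described above, device management and USM allocation, might look like the following. It assumes a dpctl release that exposes `dpctl.get_devices`, `dpctl.SyclQueue`, and `dpctl.memory.MemoryUSMShared`. The helper names `list_device_names` and `usm_roundtrip` are hypothetical, and the fallback path for machines without dpctl or without a SYCL device is an assumption added so the snippet runs anywhere.

```python
# Optional dependency: dpctl may not be installed on every machine.
try:
    import dpctl
    import dpctl.memory as dpmem
except ImportError:
    dpctl = None


def list_device_names():
    """Return the names of the SYCL devices dpctl can see (possibly empty)."""
    if dpctl is None:
        return []
    return [dev.name for dev in dpctl.get_devices()]


def usm_roundtrip(payload):
    """Copy bytes into a USM shared allocation and read them back.

    Hypothetical helper: without dpctl or without any SYCL device it
    simply returns a copy of the input.
    """
    if dpctl is None or not dpctl.get_devices():
        return bytes(payload)
    queue = dpctl.SyclQueue()  # queue on the default-selected device
    buf = dpmem.MemoryUSMShared(len(payload), queue=queue)
    buf.copy_from_host(payload)  # host buffer -> USM allocation
    return bytes(buf.copy_to_host())  # USM allocation -> host bytes
```

For example, `list_device_names()` reports what is available on the current machine, and `usm_roundtrip(b"hello")` exercises an allocation tied to the default queue.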
- Set up the JupyterLab environment for training and hands-on execution of code samples.
- Introduce Numba-dppy and show examples of writing parallel code that is offloaded automatically using the @numba.jit decorator.
- For programmers with SYCL and GPU programming experience, a code walk-through of writing an explicit kernel using the @numba_dppy.kernel decorator.
- Code walk-through of using Data Parallel Control (dpctl) to manage different devices, covering the classes and functions of dpctl (platform, device, device selector, and queue).
- Code walk-through of how to use the Unified Shared Memory (USM) manager to create Python objects that use SYCL USM for data allocation.
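The explicit-kernel item in the agenda above could be sketched roughly as follows. This assumes a Numba-dppy release in which kernels are indexed with `numba_dppy.get_global_id`, launched as `kernel[global_size, numba_dppy.DEFAULT_LOCAL_SIZE](...)`, and run inside `dpctl.device_context`; these details have varied between releases, so treat them as assumptions. The wrapper `vector_add`, the kernel name, and the `"opencl:gpu"` filter string are illustrative choices, and the pure-Python fallback exists only so the snippet runs without Numba-dppy.

```python
# Optional dependencies: the kernel path needs numba_dppy, dpctl and NumPy.
try:
    import dpctl
    import numpy as np
    import numba_dppy as dppy
    HAVE_DPPY = True
except ImportError:
    HAVE_DPPY = False

if HAVE_DPPY:
    @dppy.kernel
    def vector_add_kernel(a, b, c):
        # Each work item handles one element, selected by its global id.
        i = dppy.get_global_id(0)
        c[i] = a[i] + b[i]


def vector_add(a, b, device_filter="opencl:gpu"):
    """Add two sequences element-wise (hypothetical wrapper).

    device_filter is an assumed SYCL filter string; adjust it to a
    device that exists on your machine.
    """
    if not HAVE_DPPY:
        # Fallback so the sketch runs without Numba-dppy.
        return [x + y for x, y in zip(a, b)]
    a_arr = np.asarray(a, dtype=np.float32)
    b_arr = np.asarray(b, dtype=np.float32)
    c_arr = np.empty_like(a_arr)
    with dpctl.device_context(device_filter):
        # One work item per element; the runtime picks the local size.
        vector_add_kernel[a_arr.size, dppy.DEFAULT_LOCAL_SIZE](a_arr, b_arr, c_arr)
    return c_arr.tolist()
```

Unlike the automatic offload through @numba.jit, the kernel form leaves the work decomposition to the programmer, which is why the agenda aims it at attendees with SYCL and GPU experience.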