Profile DPC++ and GPU Workload Performance
If you’ve offloaded an application from CPU to GPU, uncovering how and where to effectively optimize performance can make your brow furrow. This webinar demonstrates how a oneAPI beta tool can simplify the process.
Developers who deploy applications across both CPUs and GPUs are often challenged to find the best methods for analyzing and optimizing offload performance.
In this webinar, technical consulting engineer Vladimir Tsymbal demonstrates how it can be done using the Intel® oneAPI beta version of Intel® VTune™ Profiler, a performance analysis tool that takes the guesswork out of cross-architecture performance improvements.
Using a sample application written in Data Parallel C++ (DPC++), Vladimir will demonstrate how Intel VTune Profiler (beta) can be used to:
- Profile DPC++ code running on both host and GPU processors
- Collect the right data and turn it into rich, easily interpretable analysis
- Identify the hotspots in your compute kernels and determine which are the key areas for optimization
- Show how GPU resources are being utilized and locate hardware bottlenecks
Get started with oneAPI
- Download Intel® VTune™ Profiler (beta) as part of the Intel® oneAPI Base Toolkit—an essential set of 15 software development tools and libraries optimized for diverse workloads and architectures.
- Develop in the Cloud—Sign up for an Intel® DevCloud account, a free development sandbox with access to the latest Intel® hardware and oneAPI software.
More Intel® VTune™ Profiler (beta) resources