Andy Terrel
CUDA in Python
#1 about 6 minutes
Understanding the CUDA platform stack for Python developers
The CUDA platform is layered from high-level domain libraries to low-level hardware access, with new tools aiming to combine Python's productivity with GPU performance.
#2 about 3 minutes
Improving performance by fusing GPU operations
The nvmath-python library enables kernel fusion through epilogues, which fold follow-up operations such as bias addition into the same GPU kernel launch as the matrix multiplication.
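To make the idea of fusing matrix multiplication and bias addition concrete, here is a minimal CPU-only sketch in plain Python. It is not the nvmath-python API: the function names `matmul_then_bias` and `matmul_bias_fused` are hypothetical, and the per-row bias layout is an assumption made for illustration. The first function mimics two separate kernel launches with a full intermediate matrix written out between them; the second applies the bias as an epilogue while each output element is still in hand.

```python
# Conceptual illustration only: plain Python on the CPU, no GPU library involved.
# All function names here are hypothetical and chosen for this sketch.

def matmul(a, b):
    """Naive matrix multiply of two lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def matmul_then_bias(a, b, bias):
    """Unfused: two passes over the data with an intermediate matrix."""
    tmp = matmul(a, b)  # pass 1, stands in for the first kernel launch
    # pass 2, stands in for a second launch that re-reads every element
    return [[tmp[i][j] + bias[i] for j in range(len(tmp[0]))]
            for i in range(len(tmp))]


def matmul_bias_fused(a, b, bias):
    """Fused: the bias is added right after each dot product (the epilogue)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            acc = sum(a[i][k] * b[k][j] for k in range(inner))
            row.append(acc + bias[i])  # epilogue step, no intermediate matrix
        out.append(row)
    return out


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
bias = [10, 20]  # assumed layout: one bias value per output row
fused = matmul_bias_fused(a, b, bias)
unfused = matmul_then_bias(a, b, bias)
```

Both versions produce the same matrix. The difference the talk points at is cost: on a GPU the unfused form pays for an extra kernel launch and an extra round trip through memory for the intermediate result.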
#3 about 5 minutes
Calling device-side functions directly from Python kernels
Python kernels can now directly call pre-compiled, high-performance device-side functions from libraries like cuBLAS, enabled by a just-in-time linker called nvJitLink.
#4 about 2 minutes
Fine-grained parallelism with cooperative groups in Python
The CUB library is exposed to Python, allowing for cooperative operations and reductions at the block or warp level for fine-grained control over GPU parallelism.
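A block-level reduction can be pictured with a small CPU simulation. The sketch below is plain Python and does not use CUB or any GPU library; `block_reduce_sum` is a hypothetical name, and the power-of-two block size is an assumption that keeps the tree reduction simple. Each "block" reduces its own slice in a scratch list (a stand-in for shared memory) by repeatedly halving the stride, which is the pattern cooperating threads would run in parallel.

```python
# CPU simulation of a per-block tree reduction. Illustrative only:
# the name block_reduce_sum is hypothetical and no GPU API is used.

def block_reduce_sum(values, block_size):
    """Return one partial sum per block of `block_size` elements."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a positive power of two")

    partials = []
    for start in range(0, len(values), block_size):
        # Copy this block's slice into scratch space (stand-in for shared
        # memory) and pad with zeros so the tree halves cleanly.
        shared = values[start:start + block_size]
        shared += [0] * (block_size - len(shared))

        stride = block_size // 2
        while stride > 0:
            # On a GPU, threads with index < stride would perform this step
            # in parallel and then synchronize before the next halving.
            for tid in range(stride):
                shared[tid] += shared[tid + stride]
            stride //= 2

        partials.append(shared[0])  # element 0 now holds the block's sum
    return partials


data = list(range(1, 11))            # 1, 2, ..., 10
partials = block_reduce_sum(data, 4)  # three blocks: 4 + 4 + 2 elements
total = sum(partials)                 # final combine across blocks
```

With this input the three blocks reduce to 10, 26, and 19, which combine to 55. A warp-level reduction follows the same halving pattern over a smaller, fixed group of threads.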
#5 about 3 minutes
Accelerating language support with numba-cuda and nupack
The numba-cuda module has been split out of Numba into its own package to accelerate feature delivery, while nupack automatically generates Python bindings for templated C++ code.
#6 about 4 minutes
A Pythonic object model for host-side GPU control
A new high-level object model allows Python developers to directly manage GPU resources like devices, contexts, streams, and linker objects without boilerplate code.