Is your Python code hitting a performance wall? Learn how to leverage the massive parallelism of GPUs with minimal code changes.
#1 about 2 minutes
The rise of general-purpose GPU computing
NVIDIA's evolution from a graphics hardware company to a leader in general-purpose computing was accelerated by the use of GPUs for AI with models like AlexNet.
#2 about 4 minutes
Why GPUs outperform CPUs for parallel tasks
As single-threaded CPU performance plateaued, GPUs offered a path forward with their massively parallel architecture designed for simultaneous computation.
#3 about 6 minutes
Understanding modern GPU architecture and operation
The CPU offloads compute-intensive code to the GPU, which runs thousands of threads across its streaming multiprocessors to hide memory latency and draws on high-bandwidth memory to keep them fed.
#4 about 7 minutes
Introducing the CUDA parallel computing platform
The CUDA platform is a complete ecosystem with compilers, libraries, and frameworks that enables developers to program GPUs using various languages and abstraction levels.
#5 about 3 minutes
Leveraging specialized hardware like Tensor Cores
Specialized hardware like Tensor Cores can be used transparently through high-level libraries like cuDNN or programmed directly with low-level APIs for maximum performance.
#6 about 6 minutes
High-level frameworks for domain-specific acceleration
Frameworks like RAPIDS provide GPU-accelerated, drop-in replacements for popular data science libraries, such as cuDF for pandas and cuGraph for NetworkX, with minimal code changes.
#7 about 10 minutes
A progressive approach to programming GPUs in Python
Developers can choose from a spectrum of Python libraries, from simple drop-in replacements like cuNumeric and CuPy to JIT compilers like Numba and direct kernel programming with PyCUDA.
#8 about 6 minutes
Developer tools and learning resources for GPUs
NVIDIA offers a comprehensive suite of developer tools for profiling and debugging, along with learning resources like the NGC repository, DLI courses, and community events.
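As a rough illustration of the "drop-in replacement" end of the spectrum described in chapter 7, here is a minimal sketch, not taken from the talk. It assumes CuPy mirrors the NumPy calls it uses (`arange`, `ones`, elementwise arithmetic), and it falls back to NumPy when CuPy or a CUDA device is unavailable. The `saxpy` helper and the `backend` variable are hypothetical names chosen for this example.

```python
# Prefer CuPy when it is installed and a CUDA device is visible; otherwise use NumPy.
try:
    import cupy as xp

    # Assumption: a missing driver or device surfaces as an exception or a zero count here.
    if xp.cuda.runtime.getDeviceCount() < 1:
        raise RuntimeError("no CUDA device visible")
    backend = "cupy"
except Exception:
    import numpy as xp

    backend = "numpy"


def saxpy(a, x, y):
    """Return a * x + y elementwise, on whichever backend owns x and y."""
    return a * x + y


# The same array code runs unchanged on either backend.
x = xp.arange(5, dtype=xp.float32)
y = xp.ones(5, dtype=xp.float32)
result = saxpy(2.0, x, y)

# tolist() copies the values back to plain Python floats on both backends.
print(backend, result.tolist())
```

The only backend-specific line is the import, which is what "minimal code changes" tends to mean in practice for array code. Lower-level options such as Numba or PyCUDA trade that convenience for explicit control over kernels.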