Andy Terrel

Feb 5, 2025 • WeAreDevelopers LIVE

CUDA in Python

What if your Python code could reach over 90% of a GPU's theoretical peak performance? Learn how NVIDIA is making that possible.

#1 about 6 minutes

Understanding the CUDA platform stack for Python developers

The CUDA platform is layered from high-level domain libraries to low-level hardware access, with new tools aiming to combine Python's productivity with GPU performance.

#2 about 3 minutes

Improving performance by fusing GPU operations

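The nvmath-python library enables kernel fusion through epilogues, which combine multiple operations, such as a matrix multiplication followed by a bias addition, into a single GPU kernel launch. The intermediate result never makes a round trip through GPU memory between the two steps.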
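To make the idea of an epilogue concrete, here is a minimal CPU-only sketch in plain Python. It is not the nvmath-python API: the function names `matmul`, `add_bias`, and `matmul_bias_fused` are hypothetical, and the per-row bias layout is an assumption made for illustration. It only models the difference between two passes over the output and one fused pass.

```python
# CPU-only model of kernel fusion. All names here are illustrative
# assumptions; this does not call nvmath-python or touch a GPU.

def matmul(a, b):
    """Pass 1: plain matrix product of two row-major nested lists."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def add_bias(c, bias):
    """Pass 2: re-reads every element of c to add a per-row bias."""
    return [[value + bias[i] for value in row] for i, row in enumerate(c)]


def matmul_bias_fused(a, b, bias):
    """Single pass: the bias is added as each output element is produced."""
    cols_b = list(zip(*b))
    return [
        [sum(x * y for x, y in zip(row, col)) + bias[i] for col in cols_b]
        for i, row in enumerate(a)
    ]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
bias = [10.0, 20.0]

unfused = add_bias(matmul(a, b), bias)  # two passes over the output
fused = matmul_bias_fused(a, b, bias)   # one pass over the output
print(fused)  # [[29.0, 32.0], [63.0, 70.0]]
```

Both paths produce the same numbers. On a GPU the fused form saves one kernel launch and one full read and write of the output matrix, which is where the speedup would come from.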

#3 about 5 minutes

Calling device-side functions directly from Python kernels

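Python kernels can now directly call pre-compiled, high-performance device-side functions from NVIDIA's math libraries (for example, the device-side variant of cuBLAS). The nvJitLink just-in-time linker makes this possible by linking those functions into the kernel at run time.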

#4 about 2 minutes

Fine-grained parallelism with cooperative groups in Python

The CUB library is exposed to Python, so the threads of a block or warp can cooperate on operations such as reductions, giving fine-grained control over GPU parallelism.
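The shape of a block-level reduction can be sketched on the CPU in plain Python. This is a simulation under stated assumptions, not the CUB Python bindings: `block_reduce` and `grid_reduce` are hypothetical names, and each list index stands in for one GPU thread.

```python
import operator

# CPU-only simulation of a block-level tree reduction. The function names
# are illustrative assumptions; no CUB or GPU call is made here.

def block_reduce(values, op):
    """Reduce one non-empty block; each step halves the active 'threads'."""
    data = list(values)
    if not data:
        raise ValueError("block_reduce needs at least one value")
    active = len(data)
    while active > 1:
        half = (active + 1) // 2
        # "Thread" tid combines its value with the one `half` slots away.
        for tid in range(active // 2):
            data[tid] = op(data[tid], data[tid + half])
        active = half
    return data[0]


def grid_reduce(values, block_size, op):
    """Split the input into blocks, reduce each, then reduce the partials."""
    partials = [
        block_reduce(values[start:start + block_size], op)
        for start in range(0, len(values), block_size)
    ]
    return block_reduce(partials, op)


numbers = list(range(1, 11))                  # 1..10
print(grid_reduce(numbers, 4, operator.add))  # 55
print(grid_reduce(numbers, 4, max))           # 10
```

On real hardware every "thread" in a step runs at the same time, so a block of n values finishes in about log2(n) steps instead of n - 1.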

#5 about 3 minutes

Accelerating language support with numba-cuda and Numbast

The numba-cuda module has been split out of Numba into its own package so that new features can ship faster, while Numbast automatically generates Python bindings for templated C++ code.

#6 about 4 minutes

A Pythonic object model for host-side GPU control

A new high-level object model allows Python developers to directly manage GPU resources like devices, contexts, streams, and linker objects without boilerplate code.

Accelerating Python on GPUs

Paul Graham

about 2 years ago • WeAreDevelopers LIVE

Python: Behind the Scenes

Diana Gastrin

about 2 years ago • World Congress 2023

Vectorize all the things! Using linear algebra and NumPy to make your Python code lightning fast.

Jodie Burchell

about 3 years ago • WeAreDevelopers LIVE

Concurrency in Python

Fabian Schindler

about 3 years ago • WeAreDevelopers LIVE

Overview of Machine Learning in Python

Adrian Schmitt

about 2 years ago • WeAreDevelopers LIVE

Python-Based Data Streaming Pipelines Within Minutes

Bobur Umurzokov

about a year ago • WeAreDevelopers LIVE

A beginner’s guide to modern natural language processing

Jodie Burchell

about 2 years ago • WeAreDevelopers LIVE

30 Golden Rules of Deep Learning Performance

Anirudh Koul

about 5 years ago • WeAreDevelopers LIVE
