Senior Machine Learning Engineer (GPU Optimization)

Fintal, LLC

New York, United States of America

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

New York, United States of America

Tech stack

C++

NVIDIA CUDA

Software Debugging

Distributed Computing Environment

Memory Management

Field-Programmable Gate Array (FPGA)

Machine Learning

Performance Tuning

TensorFlow

Scientific Computing

Software Engineering

System Programming

AI Infrastructure

High Performance Computing

Parallel Computation

GPU Programming

Low Latency

Hardware Acceleration

Machine Learning Operations

C++14

Job description

Design, develop, and optimize ML models and inference pipelines for latency-sensitive trading applications.
Build high-performance GPU-accelerated systems using CUDA and modern C++.
Profile and optimize compute, memory, and networking bottlenecks across large-scale distributed environments.
Collaborate closely with researchers, quant traders, and infrastructure engineers to deploy production-grade ML solutions.
Drive performance improvements across GPU architectures, kernels, and training/inference workflows.

Requirements

Strong commercial experience in Machine Learning Engineering, Software Engineering, or High-Performance Computing environments.
Expert-level C++ development skills and deep understanding of low-level systems programming.
Extensive experience with CUDA, GPU programming, and performance optimization.
Proven track record of optimizing GPU utilization, memory management, kernel performance, and distributed compute workloads.
Experience profiling and debugging performance-critical applications.
Background in quantitative finance, trading, scientific computing, AI infrastructure, or other high-performance environments is highly desirable.
Strong understanding of parallel computing, computer architecture, and modern ML frameworks.

Nice to Have

Experience in high-frequency trading (HFT), quantitative trading, or low-latency systems.
Familiarity with distributed training frameworks and large-scale ML infrastructure.
Knowledge of FPGA, networking, or hardware acceleration technologies.

About the company

Our client is a leading quantitative trading firm seeking a Senior Machine Learning Engineer to build and optimize next-generation ML infrastructure powering ultra-low-latency trading systems. This role sits at the intersection of machine learning, high-performance computing, and systems engineering, with a strong focus on GPU acceleration and performance optimization.
