AI & Embedded ML Engineer (Real-Time Edge Optimization)

autonomous-teaming
München, Germany
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Remote
München, Germany

Tech stack

Artificial Intelligence
Artificial Neural Networks
C++
Profiling
Nvidia CUDA
Software Debugging
Linux
Field-Programmable Gate Array (FPGA)
Global Positioning Systems (GPS)
Python
Memory Leaks
Robotic Automation Software
Graphics Processing Unit (GPU)
PyTorch
Git
Perf (Linux)

Job description

At Autonomous Teaming, we build autonomous robotic systems operating in extreme, GPS-denied environments. Our models run fully on edge hardware (Jetson, FPGA, custom boards), with no cloud, no fallback, no excuses. We're looking for an engineer who loves hard problems: real-time inference, low-latency pipelines, CUDA kernels, TensorRT graphs, and deploying ML models directly on hardware. If you enjoy debugging things that only break on the robot, this role is for you.

Missions

Own the full pipeline from model to real-time inference on embedded devices:

  • Optimize deep neural networks for Jetson, FPGA or ARM boards
  • Apply quantization, pruning and distillation to hit strict FPS, power and memory budgets
  • Convert and compile models using TensorRT, ONNX, CUDA and C++
  • Build ROS nodes integrating optimized perception into the full robotic system
  • Debug runtime failures, memory leaks, thermal throttling, kernel-level issues
  • Benchmark and validate performance directly on hardware
  • Ship models that run reliably in real-world, harsh environments

  • You work on constrained hardware, where every millisecond and every watt matters
  • You solve problems that cloud ML engineers never face
  • You own your optimizations end-to-end: from model to field deployment
  • You work in a small, high-performing team where ownership is real

If you want a job with clean layers and abstract diagrams, this is not it.

Requirements

  • Strong experience in CUDA & C++
  • Hands-on work with TensorRT, ONNX, TVM or similar compilers
  • Practical experience with quantization, pruning, INT8 / FP16
  • Experience deploying models on Jetson, embedded GPUs, ARM or FPGA
  • Comfortable with PyTorch, Python, Linux, Git
  • Engineer mindset: measurement, optimization, validation

Nice-to-have

  • ROS (building nodes, integrating perception stacks)
  • Custom accelerators, DSPs or hardware-specific toolchains
  • Profilers: Nsight, perf, tegrastats, TensorRT profiler
  • Experience in robotics, autonomous systems, aerospace, automotive or defense

About the company

We are a defence-tech start-up specializing in machine vision solutions. If you have a passion for cutting-edge innovation and the drive to use your skills to create next-generation solutions, this is an opportunity for you!

What we do: We are developing solutions that enable computers and sensors to collaborate as teams, working together to address emerging security challenges. Our primary mission is to defend against AI-powered asymmetric threats at scale, such as drone swarms and other UXVs.

Who we are: Based in Munich, Berlin and Bordeaux/Toulouse, we are rapidly expanding across Europe, with plans to open more office hubs soon. We embrace a hybrid work culture - valuing the collaboration that happens in the office, while also empowering our team members to work remotely with responsibility and autonomy.
