Edge Deployment Engineer (AI & Embedded Systems)
Job description
Edge Deployment Engineer (AI & Embedded Systems) | AI Start-up | Fixed-Term Contract
Join a European deep-tech leader in quantum and AI.
The company is well-funded, fast-growing, and backed by major global investors. Its technology is already transforming AI, compressing large language models by up to 95% and cutting inference costs by 50-80%.
This is your chance to be part of a team often described as a "quantum-AI unicorn in the making."
This is a hybrid role based in Zaragoza, on a fixed-term contract running until 30 June 2026.
What You'll Do:
As an Edge Deployment Engineer, you will be instrumental in bridging the gap between cutting-edge AI research and efficient, real-world execution. You will specialise in optimising and deploying highly compressed Machine Learning and Large Language Models onto resource-constrained, low-latency devices.
In this role, you will:
- Implement and optimise deep-learning models for edge hardware.
- Reduce model size and latency using compression/quantisation.
- Work hands-on with embedded systems and systems programming.
- Utilise key inference optimisation frameworks (e.g., TensorRT, vLLM).
- Write high-performance code in Python, C, or C++.
- Conduct performance profiling on diverse embedded architectures (ARM, GPUs).
- Integrate ML models into final products through team collaboration.
- Maintain development standards: Git, testing, and CI/CD pipelines.
Requirements
- Bachelor's degree or higher in Computer Science, Electrical Engineering, Physics, or related field; or equivalent industry experience
- 3-5 years of hands-on experience in embedded systems, firmware development, or systems programming
- Demonstrated experience optimizing machine learning models for deployment on constrained devices
- Strong proficiency in Python, C, or C++; experience with system-level programming languages is essential
- Solid understanding of quantization techniques and model compression strategies
- Experience with inference optimization frameworks (TensorRT, ONNX Runtime, LLM, vLLM, or equivalent)
- Familiarity with embedded architectures: ARM processors, mobile GPUs, and AI accelerators
- Strong fundamentals in computer architecture, memory management, and performance optimization
- Experience with version control (Git), testing frameworks, and CI/CD pipelines
- Excellent communication and collaboration skills in cross-functional teams
Preferred Qualifications
- Master's degree in Computer Science, Electrical Engineering, or related field
- Hands-on experience with large language model inference and deployment
- Experience optimizing neural networks using mixed-precision computation or dynamic quantization
- Familiarity with edge computing frameworks such as NVIDIA's Triton Inference Server or similar platforms
- Background in mobile or IoT development
- Knowledge of hardware acceleration techniques and specialized instruction sets (SIMD, NPU-specific optimizations)
- Contributions to open-source embedded AI or ML optimization projects
- Experience with real-time operating systems or embedded Linux environments
Benefits & conditions
- Compensation: Competitive salary, with a signing bonus and a retention bonus at the end of the contract.
- Flexibility: This is a hybrid role with flexible working hours. A relocation package is available if needed.
- Culture: We are a fast-scaling company committed to equal pay, diversity, and an inclusive culture. You'll gain international exposure in a multicultural, cutting-edge environment.