Senior Software Engineer - Robotics, Distributed Systems & ML Infrastructure
Job description
As a Senior Software Engineer in our Autonomy & Learning team, you will build the software foundations that enable next-generation robot autonomy at scale. You will work across robot middleware (ROS 2), distributed systems, cloud infrastructure, and ML data pipelines to create reliable, high-performance components that power robotic learning, deployment, and real-time operation.
This role blends deep engineering craftsmanship with systems-level thinking. You will own critical architectural decisions, collaborate closely with autonomy, controls, and ML teams, and help shape the technical backbone of RobCo's next-generation robotic platform.
Your Responsibilities
- Build autonomy platform components - Design and implement high-quality services and modules in a ROS 2-based robotics system with tight latency constraints and high quality of service.
- Develop distributed robotic systems - Architect control, perception, and telemetry pipelines that integrate tightly with real robot hardware.
- Drive ML data pipelines - Develop ingestion, preprocessing, and storage pipelines for multimodal datasets; support large-scale training workflows.
- Cloud & distributed infrastructure - Build on top of our scalable cloud-native systems (AWS) including data flows, EC2 orchestration, containerized services, and compute clusters.
- Enable scalable robot learning - Integrate technologies such as Ray/Anyscale for distributed training, simulation, rollout generation, and model evaluation.
- Deliver engineering excellence - Lead code reviews, testing strategies, CI/CD, observability, and documentation standards.
- Collaborate cross-functionally - Work with autonomy, controls, and ML teams to define system interfaces and ensure seamless integration.
- Mentor & lead - Provide technical guidance, make architectural decisions, and elevate the engineering culture.
Requirements
- 5-10+ years of experience in software engineering, distributed systems, or robotics platforms
- Aptitude for working with and optimizing performance-critical systems and algorithms
- Strong proficiency in C++ and Python, with clean, maintainable engineering practices
- Deep experience with ROS 2 and Zenoh (nodes, messaging, lifecycle, middleware, performance, real-time systems)
- Hands-on experience building distributed systems, including messaging, compute orchestration, and storage
- Strong knowledge of Docker, container runtimes, and cloud environments (AWS preferred)
- Experience with PyTorch or ML toolchains and familiarity with data workflows (Ray, Spark, or similar)
- Solid system design skills and ability to own complex architectural components end-to-end
- Excellent collaboration skills and ability to work across autonomy, ML, and robotics engineering domains
- Experience with front-end development is a plus