Principal Software Engineer
Role details
Job description
We are building a massive-scale, multi-region platform designed to power the next generation of global real-time experiences. At the intersection of Cloud Engineering and AI Infrastructure, you will build the foundation for a platform that supports millions of concurrent users, defining how stateless jobs are executed at a scale that pushes the boundaries of standard open-source tooling.
As the orchestrator for our global inference and microservices footprint, our platform provides a "deploy and forget" experience for critical workloads. You won't just be managing clusters; you will be building the custom control plane that automates scheduling, scaling, and recovery across a hybrid-cloud environment, ensuring our infrastructure remains resilient regardless of where it runs.
You will:
- Build the Orchestration Engine: Design and develop custom Kubernetes Operators and Controllers in Go to automate the entire lifecycle of high-throughput, mission-critical stateless workloads.
- Architect Hybrid-Cloud Mobility: Create systems that enable workloads to move seamlessly between on-premise and public cloud environments, ensuring high availability and automated failover during regional outages.
- Extend the Kubernetes Control Plane: Write performant reconciliation loops and Custom Resource Definitions (CRDs) to handle complex scheduling logic and resource optimization for massive CPU- and GPU-intensive fleets.
- Empower Developer Velocity: Build high-level platform abstractions and automation that allow service owners to deploy global-scale code without needing to manage the underlying container orchestration.
Requirements
- 10+ years of experience building web services using Golang or a similar language.
- Experience building and operating K8s clusters.
- Deep understanding of Kubernetes internals (control plane, reconciliation loops, scheduling, networking).
- Experience building large-scale distributed systems with a focus on scalability, reliability, and availability. Experience building or operating control-plane or orchestration systems (e.g., schedulers, workflow engines, or compute platforms).
- Strong knowledge of distributed systems fundamentals such as leader election, event-driven architectures, messaging/queuing, or distributed state management.
- Experience designing systems that handle multi-region orchestration, failover, disaster recovery, or large-scale reliability challenges.
- Experience with on-call rotations and troubleshooting live-site issues. Experience leading cross-team greenfield projects.
- Bachelor's degree in Computer Science or a related field, or equivalent experience.
- Experience writing Kubernetes Operators or custom controllers using Operator SDK or controller-runtime.
Benefits & conditions
For roles that are based at our headquarters in San Mateo, CA: The starting base pay for this position is as shown below. The actual base pay is dependent upon a variety of job-related factors such as professional background, training, work experience, location, business needs and market demand. Therefore, in some circumstances, the actual salary could fall outside of this expected range. This pay range is subject to change and may be modified in the future. All full-time employees are also eligible for equity compensation and for benefits as described on this page.
Annual Salary Range: $345,040–$399,420 USD
Roles that are based in an office are onsite Tuesday, Wednesday, and Thursday, with optional presence on Monday and Friday (unless otherwise noted).
Roblox provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. Roblox also provides reasonable accommodations to candidates with qualifying disabilities or religious beliefs during the recruiting process.