AI DevOps Engineer
Job description
The mobile team ships the Roku Remote, Smart Home, and Howdy apps on iOS and Android. You own the CI/CD pipeline and QA automation infrastructure for the mobile engineering team. In this role, you will use AI to design and build a fully autonomous, self-healing CI/CD and QA automation pipeline for multiple products with millions of users. You treat AI as your primary design tool, not an add-on. Every system you build should minimize human intervention, from code push to app store submission. You'll own CI Pipeline Architecture: the path from git push to a green or red signal. Your job is to make that path fast, reliable, and cheap. That includes QA Automation & Device Orchestration: the software systems that schedule, monitor, and recover the test infrastructure.
- Design and maintain CI/CD pipelines for iOS and Android on GitLab CI
- Architect pipeline stages for fail-fast execution: cheapest checks first (lint, compile, static analysis), expensive checks last (device farm tests)
- Build smart test routing: analyze MR diffs to determine which tests need physical devices and which can run on emulators, so 80% of MRs never touch the device farm
- Build flaky test detection and quarantine systems. Classify failures as infrastructure-caused vs. code-caused so engineers trust the signal
- Automate release mechanics: code signing, versioning, TestFlight/Play Console uploads, dSYM and mapping file management. The goal is zero manual steps between merge and app store submission
- As agent-authored MR volume grows, ensure pipelines absorb the increase without degrading speed or starving human-authored MRs of resources
- Build the device reservation and orchestration system that assigns devices to CI jobs, prevents contention, and maximizes utilization without manual scheduling
- Design self-healing automation: health checks detect unresponsive devices, trigger remote recovery via API, and re-register them, with no human intervention required
- Define the device compatibility matrix: which firmware/model combinations require real hardware and which can run on emulators
- Implement priority-based test routing: device-touching MRs get farm time, UI-only MRs never queue for a device
- Use AI to identify failure patterns, predict infrastructure issues, and continuously optimize pipeline performance
Requirements
- 5+ years operating CI/CD infrastructure at scale, preferably GitLab CI
- Ability to travel up to 20%
- Deep understanding of mobile build systems (Xcode/xcodebuild, Gradle) and mobile-specific CI challenges (code signing, provisioning, multi-platform builds)
- Strong scripting (Python, Bash) and ability to build internal tooling: reservation systems, health monitors, pipeline analytics dashboards
- Advanced proficiency with AI-assisted development (Copilot, Claude Code, Cursor, or equivalent). You use AI as your default approach to writing code, building systems, and solving infrastructure problems
- Experience designing autonomous, self-healing systems that detect, diagnose, and recover from failures without human intervention
- AI-first problem solving: your instinct is to automate with AI before adding manual process or headcount
- Obsession with developer experience. You measure your success by how fast and reliably engineers get feedback, not by how complex your infrastructure is
- Data-driven decision making. You measure failure rates, waste rates, device utilization, and pipeline duration, and you use those numbers to prioritize your work
- Experience with infrastructure-as-code (Terraform, Ansible, or equivalent) for managing cloud and on-premises infrastructure
- Working knowledge of WiFi and BLE protocols, enough to understand why tests that exercise radio communication behave differently from pure software tests
- Experience with mobile test automation frameworks (XCUITest, Espresso, Appium), not to write tests, but to understand what they need from infrastructure
- Experience scaling CI for high-volume, automated code generation (agentic engineering, bot-authored MRs)
Benefits & conditions
Roku is committed to offering a diverse range of benefits as part of our compensation package to support our employees and their families. Our comprehensive benefits include global access to mental health and financial wellness support and resources. Local benefits include statutory and voluntary benefits which may include healthcare (medical, dental, and vision), life, accident, disability, commuter, and retirement options (401(k)/pension). Employees are supported in taking time off, in accordance with local leave policies and other personal needs to support their evolving work and life needs. It's important to note that not every benefit is available in all locations or for every role. For details specific to your location, please consult with your recruiter.