Video SRE
Requirements
Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
5+ years of experience in Site Reliability Engineering, DevOps, or Systems Engineering with demonstrated senior-level impact.
Production ownership at scale, including on-call/incident response, post-incident reviews, and driving operational improvements.
Strong understanding of Linux fundamentals and networking principles, with experience operating and debugging production systems.
Proficiency in at least one programming language (Shell, Python, Go, or similar) to reduce toil, build SRE tooling, and improve operability.
Hands-on experience with cloud infrastructure and container orchestration.
Excellent troubleshooting and root-cause analysis skills across the full technology stack.
Effective communicator who can collaborate with cross-functional partners to drive reliability outcomes.

Thorough understanding of distributed systems fundamentals, failure modes, and resilience patterns that prevent cascading outages.
Track record of building and continuously improving observability (metrics/logs/traces), alert quality, and incident response processes for complex, high-traffic environments.
Hands-on performance optimization, capacity planning, and reliability engineering (load testing, bottleneck analysis, degradation strategies).
Proven ability to build and operate Infrastructure as Code and CI/CD pipelines, including safe deployment practices and change risk controls.
Experience debugging and operating JVM-based applications in production (e.g., understanding of thread analysis, heap profiling).
Working knowledge of database systems, key-value stores, caching layers, message queues, and storage infrastructure at scale.
Familiarity with video streaming technologies, codecs, protocols, and media delivery infrastructure.