SITE RELIABILITY ENGINEER
Job description
Ignite is currently seeking a driven, detail-oriented Site Reliability Engineer (SRE) to ensure the reliability, performance, and operational resilience of mission-critical software systems. This role focuses on defining reliability standards from the user perspective, instrumenting systems to measure performance against those standards, and building the tooling, automation, and operational processes that make systems resilient and recoverable. The SRE will work closely with development teams to improve operational quality early in the development lifecycle, ensuring systems are designed, tested, and deployed with reliability in mind. When production issues occur, the SRE will lead incident resolution, diagnose distributed system failures, and translate operational findings into long-term reliability improvements. This position can be filled in Dayton, OH; Huntsville, AL; or St. Louis, MO, and is contingent on contract award.
Requirements
- Platform & Infrastructure - Kubernetes, ArgoCD/GitOps, disaster recovery, capacity planning
- Observability - OTel standards, Grafana/Perses, Tempo, ClickHouse, VictoriaMetrics
- Automation & Toil Reduction - scripting, CI/CD, runbook automation, "DevOps"
- Developer Enablement - instrumentation SDKs, SRE practice onboarding
- Data & Alerting - dashboard quality, alert design, anomaly detection
- 1-3 years of experience in operations, system administration, DevOps, or software engineering
- Bachelor's Degree in CS, Computer Engineering, or related technical field
- US citizenship; must have or be able to obtain a Top Secret clearance
- Systems thinking - understanding how systems fail together, including blast radius
- Observability fundamentals - not just the three signals (metrics, logs, and traces), but knowing why and how to use telemetry to optimize services and engineering quality of life
- Basic software engineering - building automation & non-trivial APIs, git workflows, effectively engaging in code reviews
- Linux/networking fundamentals
- Strong communication, collaboration, and organizational skills
Preferred Qualifications:
- SRE certifications from The DevOps Institute, AWS Solutions Architect, or similar
- Hands-on experience with Python, Go, Kubernetes, Argo CD, GitLab/GitHub, Jenkins, Docker, Locust/Gatling, Prometheus, Grafana/Perses
Security Clearance Requirements:
Must have an active TS/SCI Security Clearance or the ability to obtain one.
Education Requirements:
- Bachelor's degree in a relevant discipline.