Senior Site Reliability Engineer
Job description
- Design, build, and operate highly reliable, scalable cloud infrastructure and services
- Define and implement SRE best practices, including SLIs/SLOs, monitoring, alerting, and incident response
- Improve system observability through logging, metrics, and distributed tracing
- Lead efforts around infrastructure as code (Terraform preferred), automation, and CI/CD pipelines
- Strengthen system resilience through fault tolerance, redundancy, and capacity planning
- Drive incident management processes, including root cause analysis and postmortems
Backend Development & Collaboration
- Contribute to backend development, particularly in areas impacting performance, scalability, and data processing
- Partner with backend engineers and the data team to design for reliability, performance, and scale
- Mentor engineers and help establish a strong culture of reliability and operational excellence
Requirements
- 4+ years of experience in Site Reliability Engineering, DevOps, or Backend Engineering roles
- Strong experience with cloud infrastructure (AWS preferred) and distributed systems
- Hands-on experience with container orchestration (Kubernetes/ECS) and infrastructure as code (Terraform preferred)
- Proficiency in backend development (experience with Go is a plus)
- Backend experience with IoT device management, fleet orchestration, and device observability is a strong plus
- Deep understanding of system reliability, performance tuning, and scalability
- Experience building or maintaining observability systems (e.g., Datadog, Prometheus/Grafana, or similar)
- Familiarity with data systems and pipelines
- Proven ability to lead projects and mentor other engineers
- Experience working in highly regulated or security-conscious environments is a strong plus

True ownership and autonomy. This is a role for someone who wants to take ideas from concept to execution to impact. You'll have the freedom and responsibility to build, test, and scale what works.

A highly collaborative, transparent culture. We operate with openness, trust, and direct communication. You'll have visibility into decisions, context behind priorities, and access to the people driving them.

A results-driven, learning-oriented environment. We care deeply about outcomes, but we also value thoughtful experimentation, fast feedback loops, and continuous improvement.

Equity ownership in a fast-growing company. You'll share in the upside of what we're building together.
Benefits & conditions
- 401(k)
- Health insurance
- Unlimited paid time off

Competitive, people-first benefits.
- Subsidized healthcare coverage and 401(k)
- Hybrid working schedule: three days in the HQ office (River North)
- Unlimited PTO. We trust you to deliver results, then take real time to disconnect and recharge.

A team that has each other's backs. We're a small, tight-knit group that genuinely cares about doing great work and leveling up together along the way.