Senior Site Reliability Engineer

UBIETY, LLC
Chicago, United States of America
15 days ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: $150K

Job location

Chicago, United States of America

Tech stack

Amazon Web Services (AWS)
Cloud Computing
Cloud Engineering
Data Systems
DevOps
Distributed Systems
Fault Tolerance
Monitoring of Systems
Performance Tuning
Reliability Engineering
Prometheus
Datadog
Data Logging
Data Processing
Grafana
Reliability of Systems
Backend
Kubernetes
Terraform
Dynatrace

Job description

  • Design, build, and operate highly reliable, scalable cloud infrastructure and services
  • Define and implement SRE best practices, including SLIs/SLOs, monitoring, alerting, and incident response
  • Improve system observability through logging, metrics, and distributed tracing
  • Lead efforts around infrastructure as code (Terraform preferred), automation, and CI/CD pipelines
  • Strengthen system resilience through fault tolerance, redundancy, and capacity planning
  • Drive incident management processes, including root cause analysis and postmortems

Backend Development & Collaboration

  • Contribute to backend development, particularly in areas impacting performance, scalability, and data processing
  • Partner with backend engineers and data team to design for reliability, performance, and scale
  • Mentor engineers and help establish a strong culture of reliability and operational excellence

Requirements

Do you have experience in Terraform?

  • 4+ years of experience in Site Reliability Engineering, DevOps, or Backend Engineering roles
  • Strong experience with cloud infrastructure (AWS preferred) and distributed systems
  • Hands-on experience with container orchestration (Kubernetes/ECS) and infrastructure as code (Terraform preferred)
  • Proficiency in backend development (experience with Go is a plus)
  • Backend experience with IoT device management, fleet orchestration, and device observability is a strong plus
  • Deep understanding of system reliability, performance tuning, and scalability
  • Experience building or maintaining observability systems (e.g., Datadog, Prometheus/Grafana, or similar)
  • Familiarity with data systems and pipelines
  • Proven ability to lead projects and mentor other engineers
  • Experience working in highly regulated or security-conscious environments is a strong plus

True ownership and autonomy. This is a role for someone who wants to take ideas from concept to execution to impact. You'll have the freedom and responsibility to build, test, and scale what works.

A highly collaborative, transparent culture. We operate with openness, trust, and direct communication. You'll have visibility into decisions, context behind priorities, and access to the people driving them.

A results-driven, learning-oriented environment. We care deeply about outcomes, but we also value thoughtful experimentation, fast feedback loops, and continuous improvement.

Equity ownership in a fast-growing company. You'll share in the upside of what we're building together.

Benefits & conditions

  • 401(k)
  • Health insurance
  • Unlimited paid time off

Competitive, people-first benefits.

  • Subsidized healthcare coverage, 401(k)
  • Hybrid working schedule: three days in HQ office (River North)
  • Unlimited PTO. We trust you to deliver results, then take real time to disconnect and recharge.

A team that has each other's backs. We're a small, tight-knit group that genuinely cares about doing great work and leveling up together along the way.
