Site Reliability Engineer in Chicago

Energy Jobline
Chicago, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Chicago, United States of America

Tech stack

Java
ActiveMQ
Amazon Web Services (AWS)
Build Automation
Bash
Cloud Engineering
Code Review
Computer Programming
Continuous Integration
Distributed Systems
Python
Scrum
RabbitMQ
Reliability Engineering
Site Reliability Engineering Practices
Prometheus
Datadog
Scripting (Bash/Python/Go/Ruby)
Grafana
Kubernetes
Rancher
Kafka
Splunk
AppDynamics
Docker
Jenkins

Job description

We're looking for a Site Reliability Engineer to support the availability, performance, and reliability of a next-generation cloud platform. You'll collaborate across engineering and infrastructure teams, build automation to reduce toil, improve incident response, and strengthen system resilience through monitoring, metrics, and modern SRE practices.

What You'll Do

  • Partner with development, operations, and infrastructure teams to ensure service availability

  • Build automation to improve incident response and prevent recurring issues

  • Create and enhance runbooks for outages and service degradations

  • Assess production readiness and reliability of new and existing services

  • Define and track operational metrics for performance, scalability, and availability

  • Architect and maintain shared tools that improve reliability across teams

  • Contribute to continuous improvement through research, retrospectives, and code reviews

  • Influence timelines, expectations, and technical direction within the team

  • Mentor junior engineers and help shape sprint planning

Requirements

  • Expert in building Kubernetes clusters from scratch

  • Experience supporting and troubleshooting large-scale distributed systems

  • Strong documentation, communication, and analytical problem-solving skills

  • Comfortable working in fast-paced, rapidly changing environments

Technical Skills:

  • Hands-on experience managing cloud infrastructure (AWS)

  • Analysis using tools such as Splunk, AppDynamics, Datadog, Prometheus, and Grafana

  • Programming/scripting in Java, Python, Bash, or Go

  • Experience with distributed messaging (Kafka, RabbitMQ, ActiveMQ)

  • Container orchestration (Kubernetes, Docker, Rancher)

  • CI/CD tools such as Jenkins, Travis, and Harness

Benefits & conditions

  • 15% Bonus

  • 20+ days PTO

  • Health, Vision, Dental

  • 401(k) with 6% match

  • Technology Stipend

  • Tuition/Training reimbursement program

About the company

Energy Jobline is the largest and fastest-growing global Energy Job Board and Energy Hub. We have an audience reach of over 7 million energy professionals, 400,000+ monthly advertised global energy and engineering jobs, and work with the leading energy companies worldwide. We focus on the Oil & Gas, Renewables, Engineering, Power, and Nuclear markets as well as emerging technologies in EV, Battery, and Fusion. We are committed to offering our jobseekers the most exciting career opportunities from around the world.
