Site Reliability Engineer in Chicago

Energy Jobline

Chicago, United States of America

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Chicago, United States of America

Tech stack

Java

ActiveMQ

Amazon Web Services (AWS)

Build Automation

Bash

Cloud Engineering

Code Review

Computer Programming

Continuous Integration

Distributed Systems

Python

Scrum

RabbitMQ

Reliability Engineering

Site Reliability Engineering Practices

Prometheus

Datadog

Scripting (Bash/Python/Go/Ruby)

Grafana

Kubernetes

Rancher

Kafka

Splunk

AppDynamics

Docker

Jenkins

Job description

We're looking for a Site Reliability Engineer to support the availability, performance, and reliability of a next-generation cloud platform. You'll collaborate across engineering and infrastructure teams, build automation to reduce toil, improve incident response, and strengthen system resilience through monitoring, metrics, and modern SRE practices.

What You'll Do

Partner with development, operations, and infrastructure teams to ensure service availability
Build automation to improve incident response and prevent recurring issues
Create and enhance runbooks for outages and service degradations
Assess production readiness and reliability of new and existing services
Define and track operational metrics for performance, scalability, and availability
Architect and maintain shared tools that improve reliability across teams
Contribute to continuous improvement through research, retrospectives, and code reviews
Influence timelines, expectations, and technical direction within the team
Mentor junior engineers and help shape sprint planning

Requirements

Expertise in building Kubernetes clusters from scratch
Experience supporting and troubleshooting large-scale distributed systems
Strong documentation, communication, and analytical problem-solving skills
Comfortable working in fast-paced, rapidly changing environments

Technical Skills:

Hands-on experience managing cloud infrastructure (AWS)
Monitoring and analysis using tools such as Splunk, AppDynamics, Datadog, Prometheus, and Grafana
Programming/scripting in Java, Python, Bash, or Go
Experience with distributed messaging (Kafka, RabbitMQ, ActiveMQ)
Container orchestration (Kubernetes, Docker, Rancher)
CI/CD tools such as Jenkins, Travis CI, and Harness

Benefits & conditions

15% Bonus
20+ days PTO
Health, Vision, Dental
401(k) with 6% match
Technology Stipend
Tuition/Training reimbursement program

About the company

Energy Jobline is the largest and fastest-growing global Energy Job Board and Energy Hub. We have an audience reach of over 7 million energy professionals, 400,000+ monthly advertised global energy and engineering jobs, and work with the leading energy companies worldwide. We focus on the Oil & Gas, Renewables, Engineering, Power, and Nuclear markets, as well as emerging technologies in EV, Battery, and Fusion. We are committed to offering our jobseekers the most exciting career opportunities from around the world.