Sr. Site Reliability Engineer

System One
Beckett Ridge, United States of America
9 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$140K

Tech stack

Microsoft Active Directory
IBM AIX
Amazon Web Services (AWS)
Azure
Bash
Unix
Cloud Engineering
Databases
Information Engineering
DevOps
Disaster Recovery
GitHub
Python
Network Troubleshooting
Microsoft SQL Server
Windows Server
Oracle Applications
PowerShell
Reliability Engineering
Site Reliability Engineering Practices
Ansible
Prometheus
Software Engineering
Solaris (Operating System)
Datadog
Scripting (Bash/Python/Go/Ruby)
Snowflake
Grafana
Containerization
Kubernetes
Terraform
DevSecOps
Docker
Jenkins
Go

Job description

  • Lead modernization efforts and hosting migrations across hybrid infrastructure (Azure, AWS, and on-prem).
  • Streamline provisioning using Infrastructure as Code (Terraform, Ansible, PowerShell DSC).
  • Build and enhance CI/CD pipelines (GitHub Actions, Jenkins) to enable fast, reliable delivery.
  • Implement and manage observability/monitoring platforms (e.g., Prometheus, Grafana, Datadog) and establish operational standards.
  • Operate and improve containerized workloads (Docker); apply SRE practices such as SLOs and error budgets.
  • Lead incident response, perform root-cause analysis, and reduce toil through automation, scripting, and runbooks.
  • Support DevSecOps initiatives, including secure practices, backup strategy, and disaster recovery readiness.
  • Evaluate and pilot emerging tools/technologies to keep the infrastructure stack modern, efficient, and scalable.

Requirements

  • 5+ years of experience in SRE, DevOps, or Cloud Engineering with production Azure or AWS exposure.
  • Strong Linux/Unix administration and network troubleshooting skills.
  • Hands-on expertise with Terraform and designing/operating CI/CD pipelines.
  • Scripting/programming proficiency in Python, Go, Bash, or PowerShell.
  • Production experience with Docker (Kubernetes is a strong plus).
  • Strong communication and documentation skills; comfortable owning work end-to-end in a small-team environment.
  • Ability to collaborate with application development and data engineering teams to define standards and manage change.
  • Must reside in the Greater Cincinnati Metro area (hybrid onsite requirement).

  • Observability tooling experience (Prometheus, Grafana, ELK, or similar).
  • Windows Server / Active Directory administration.
  • Experience with legacy Unix (AIX/Solaris).
  • Database experience (Oracle/MS SQL) or BI platform exposure (Snowflake/Azure Fabric).
  • Relevant certifications (Azure, AWS, and/or CKA).
