Site Reliability Engineer

EVER FORTH LLC

Charlotte, United States of America

Role details

Contract type

Temporary to permanent

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

$154K

Job location

Charlotte, United States of America

Tech stack

Artificial Intelligence

Computing Platforms

Bash

Continuous Integration

Middleware

Python

Automation of Marketing

OpenShift

PowerShell

Reliability Engineering

Site Reliability Engineering Practices

Ansible

Datadog

MTTR

Containerization

Infrastructure Automation Frameworks

Job description

Provide senior-level L2/L3 application and middleware production support, leading disciplined troubleshooting, recovery, and stabilization for complex, high-availability services.
Embed SRE practices into daily operations, including defining reliability signals, improving alert quality, and driving blameless post-incident learning to reduce toil.
Implement and improve observability across applications and middleware (logs, metrics, traces, dashboards) to improve detection, diagnosis, and Mean Time to Resolution (MTTR).
Design, develop, and maintain infrastructure-as-code and configuration-as-code capabilities for VM-based and container-adjacent workloads, including OpenShift (OCP).
Build and support automation for operational actions across middleware components (e.g., standardized status checks, start/stop/restart) to enable safer self-service.
Integrate operational automation with CI/CD pipelines to enable repeatable and auditable deployments.
Monitor for configuration drift, support automated compliance checks, and implement remediation patterns aligned with enterprise controls.
Develop and maintain runbooks and operational documentation for automation and platform procedures.
Participate in on-call rotations and provide operational support coverage as required, including outside of normal working hours.

Requirements

5+ years of combined Software, Systems, or Infrastructure Engineering experience supporting enterprise production environments.
Senior-level L2/L3 application and middleware production support experience.
A strong Site Reliability Engineering (SRE) mindset, with an emphasis on proactive reliability, MTTR reduction, and toil elimination.
Demonstrated experience designing and operating observability solutions, including logs, metrics, traces, dashboards, and actionable alerting.
Hands-on automation and scripting skills using tools such as Python, Bash, or PowerShell.
Experience working in VM-based and container-adjacent environments, including OpenShift (OCP).
Ability to understand application and platform architecture, identify bottlenecks, and engineer solutions.
Experience with Infrastructure-as-Code (IaC) and configuration-as-code, preferably with Ansible or similar automation platforms.
Experience with CI/CD-integrated operational automation.
Familiarity with cloud or container platforms (any provider).
Exposure to AI-assisted operations with appropriate guardrails.
Knowledge of Elastic or legacy observability tooling.
Strong cross-functional communication skills and experience operating in regulated environments.

Benefits & conditions

The anticipated pay range for this position is $69.00/hr to $74.00/hr.

About the company

Everforth Apex is a world-class IT services company that serves thousands of clients across the globe. When you join Everforth Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package. Our commitment to excellence is reflected in many awards, including ClearlyRated's Best of Staffing in Talent Satisfaction in the United States and Great Place to Work in the United Kingdom and Mexico. Everforth Apex uses a virtual recruiter as part of the application process.