Site Reliability / DevOps Engineer

eClerx LLC

Raleigh, United States of America

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Junior

Compensation

$138K

Job location

Raleigh, United States of America

Tech stack

Amazon Web Services (AWS)

Application Performance Management

Automation of Tests

Azure

Bash

Cloud Engineering

Continuous Integration

Linux

DevOps

Distributed Systems

Elasticsearch

GitHub

Monitoring of Systems

Python

Enterprise Messaging Systems

PowerShell

Reliability Engineering

Data Streaming

Datadog

Scripting (Bash/Python/Go/Ruby)

Enterprise Software Applications

Cloud Platform System

Delivery Pipeline

Grafana

MTTR

CloudFormation

GitLab CI

Kubernetes

Bicep

Kafka

Terraform

Docker

Jenkins

Job description

eClerx is seeking a motivated SRE/DevOps Engineer with strong observability experience to join our growing Platform Engineering team. This team is responsible for managing cloud infrastructure, advancing DevOps practices, improving platform reliability, and supporting highly available enterprise applications.

The ideal candidate will have a deep understanding of cloud-native architectures, distributed systems, CI/CD automation, observability frameworks, and site reliability engineering principles. This individual will play a key role in improving platform resilience, operational efficiency, and system performance across a modern cloud-based technology ecosystem.

Responsibilities

Design, implement, and enhance system observability and monitoring solutions.
Monitor system performance, create incident response plans, and implement observability practices to gain deeper insights into system behavior.
Define, implement, and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
Improve platform reliability, scalability, and resiliency.
Conduct post-incident reviews and implement corrective actions to prevent recurring issues.
Partner with engineering teams to implement observability tooling and leverage telemetry data to troubleshoot and resolve incidents.
Utilize observability and event management capabilities to improve key operational metrics, including Mean Time to Detect (MTTD) and Mean Time to Restore (MTTR).
Continuously optimize infrastructure, architecture, automation, CI/CD processes, and operational workflows.
Collaborate closely with software engineers to ensure applications are designed and deployed following DevOps and reliability best practices.
Participate in a rotating on-call schedule, including support for production releases and critical incidents outside normal business hours when required.

Requirements

5+ years of experience as a Site Reliability Engineer, DevOps Engineer, or similar role.
5+ years of work experience with public cloud platforms (Azure preferred, or AWS).
3+ years of hands-on experience with observability platforms such as Datadog, Elasticsearch, Grafana, or similar solutions.
5+ years of experience with scripting languages such as Python, Bash, or PowerShell.
3+ years of experience with containerization and orchestration technologies, including Docker and Kubernetes.
2+ years of experience developing and managing CI/CD pipelines using tools such as Azure DevOps, GitLab CI/CD, GitHub Actions, Jenkins, or similar.
2+ years of experience with Infrastructure-as-Code (IaC) tools such as Terraform, Azure Bicep, AWS CloudFormation, or equivalent technologies.
1+ years of experience using site reliability and resilience testing tools such as Gremlin, Chaos Mesh, or similar platforms.
Proven experience leveraging observability best practices, end-user monitoring, application performance monitoring, and infrastructure monitoring solutions.
Experience with event streaming and messaging platforms such as Kafka or Azure Event Hubs.
Strong understanding of Linux operating systems and administration.

Preferred Qualifications

Kubernetes certification.
Cloud platform certifications (Azure, AWS, or GCP).
Experience working in Azure environments and/or Azure DevOps.
Experience implementing and managing Datadog or other modern observability platforms.
Experience supporting enterprise-scale applications within financial services, capital markets, fintech, or other highly regulated industries.

In the US, the target base salary for this role is $120,000-$137,500. Compensation is based on a range of factors that include relevant experience, knowledge, skills, other job-related qualifications, and geography. We expect offers to most candidates to fall across this range, depending on these factors.

Benefits & conditions

3.6 out of 5 stars | Raleigh, NC | $120,000 - $137,500 a year | Full-time

About the company

eClerx is a leading provider of productized services, bringing together people, technology and domain expertise to amplify business results. The firm provides business process management, automation, and analytics services to a number of Fortune 2000 enterprises, including some of the world's leading financial services, communications, retail, fashion, media & entertainment, manufacturing, travel & leisure, and technology companies. Incorporated in 2000, eClerx is traded on both the Bombay and National Stock Exchanges of India. The firm employs more than 19,000 people across Australia, Canada, France, Germany, Switzerland, Egypt, India, Italy, Netherlands, Peru, Philippines, Singapore, Thailand, the UK, and the USA.
