Cloud Reliability Test Engineer

Apex Systems LLC
27 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Tech stack

Amazon Web Services (AWS)
Bash
Cloud Computing
Cloud Engineering
Code Coverage
DevOps
Python
Load Testing
Software Reliability Testing
Cloud Services
Datadog
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
Grafana
MTTR
Kubernetes
Terraform
Splunk
AppDynamics

Job description

As a Cloud Reliability Test Engineer, you will own the enterprise reliability testing strategy and governance across cloud services, setting standards and release criteria aligned to SLAs/SLOs and defining a multi-year roadmap for resiliency, performance, and observability. You will establish organization-wide benchmarks and guardrails to ensure consistent test coverage and deployment gates, align cross-functional leaders on reliability goals and risk management for critical user journeys, and provide executive KPIs and scorecards that drive accountability for availability, latency, MTTR, and error budget adherence.

This role balances hands-on technical work (20%) with strategic leadership (80%). You will oversee incident readiness to ensure corrective actions deliver measurable improvements, evaluate and standardize tooling and reference architectures, and lead enablement and maturity uplift through playbooks, training, and a community of practice. You will also influence quarterly and annual planning with input on reliability posture, risk assessments, and ROI on reliability investments.

Initially, you will focus on establishing chaos testing capabilities, challenging cloud architecture designs, and mentoring QA teams in advanced testing practices. As these foundations mature, you will expand to own the enterprise reliability testing strategy, setting organization-wide standards, benchmarks, and release criteria aligned to SLAs/SLOs.

Requirements

TITLE: Cloud Reliability Test Engineer (Senior)

LEVEL: Senior (Mid-level+ to Senior range acceptable)

TYPE: Individual contributor, strategic leadership role

REPORTS TO: QA Director

5+ years QA experience AND 3+ years cloud/DevOps

Hands-on Terraform and Kubernetes

Public cloud (AWS or Google Cloud Platform) + on-prem exposure

Performance/load testing experience

Observability tools (Splunk, Datadog, AppDynamics)

Scripting (Python, Go, or Bash)

Strong communication and mentoring ability

About the company

Apex Systems is a world-class IT services company that serves thousands of clients across the globe. When you join Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package. Our commitment to excellence is reflected in many awards, including ClearlyRated's Best of Staffing in Talent Satisfaction in the United States and Great Place to Work in the United Kingdom and Mexico. Apex uses a virtual recruiter as part of the application process.