Site Reliability Engineer (SRE)
Job description
We are seeking a highly skilled Site Reliability Engineer (SRE) - Identity Directory Services to lead the reliability, scalability, and security of enterprise identity platforms. This role serves as a Subject Matter Expert (SME) for Active Directory and modern identity services across hybrid and multi-cloud environments.
You will play a critical role in ensuring availability, resilience, and operational excellence, while leading automation, vulnerability remediation, and standardization efforts within identity infrastructure.
- Serve as the primary SRE and technical SME for Active Directory and enterprise identity services
- Design and operate highly available and resilient identity architectures, including backup, recovery, and disaster recovery strategies
- Partner with security and platform engineering teams to identify and remediate vulnerabilities, misconfigurations, and legacy/non-compliant technologies
- Implement SRE best practices, including:
  - SLIs/SLOs and error budgets
  - Proactive monitoring, logging, and alerting
  - Chaos engineering and failure simulations
- Lead automation initiatives using Infrastructure-as-Code and modern configuration management approaches
- Support multi-cloud identity integrations across Azure, AWS, and GCP
- Ensure secure implementation of authentication, authorization, and secrets management
- Drive incident response efforts, including root cause analysis and post-incident remediation
- Establish and enforce engineering standards, patterns, and guardrails
- Act as a trusted advisor and mentor for engineering teams on identity reliability and security
Use of Artificial Intelligence (AI)
We may use Artificial Intelligence (AI) to support parts of our hiring process, including sourcing, screening, and evaluating candidates. AI helps assess applications and qualifications, but final decisions are made by our hiring team. By applying, you acknowledge and agree that your application may be reviewed using AI tools.
Requirements
This is a hands-on senior engineering role ideal for someone with deep technical expertise and a passion for building secure, reliable, and scalable identity systems.
Primary Skills
- Active Directory (AD) Expertise
  - Deep knowledge of AD architecture, replication, DNS, trusts, and Group Policy (GPO)
  - Experience with AD backup, recovery, and forest/domain restoration
- Site Reliability Engineering (SRE)
  - Proven experience applying SRE principles (availability, reliability, observability)
  - Experience supporting Tier-0 or mission-critical infrastructure
- Identity & Access Management (IAM)
  - Strong understanding of authentication, authorization, federation, and privileged access
- Cloud Platforms
  - Hands-on experience with identity services in Azure, AWS, and GCP
- Automation & Scripting
  - Experience with PowerShell, Python, or Bash
  - Familiarity with Infrastructure-as-Code tools (e.g., Terraform)
- Security & Vulnerability Remediation
  - Proven ability to detect and remediate identity-related vulnerabilities and misconfigurations
Preferred / Additional Skills
- Identity Threat Detection & Response (ITDR) frameworks
- Privileged Access Management (PAM) platforms (vaulting, session monitoring)
- Secrets and key management systems
- Policy-as-Code and compliance automation frameworks
- Experience integrating identity services with CI/CD pipelines and platform engineering teams
- Strong experience in incident management and continuous improvement
- Demonstrated ability to mentor and influence engineering teams at scale
What You'll Bring
- A strong engineering mindset with a focus on automation and reliability
- The ability to thrive in complex, large-scale environments
- Passion for improving security posture and system resilience
- Excellent collaboration and communication skills
Benefits & conditions
Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to specific elections, plan, or program terms. If eligible, the benefits available for this temporary role may include the following:
- Medical, dental & vision
- Critical Illness, Accident, and Hospital
- 401(k) Retirement Plan - Pre-tax and Roth post-tax contributions available
- Life Insurance (Voluntary Life & AD&D for the employee and dependents)
- Short and long-term disability
- Health Spending Account (HSA)
- Transportation benefits
- Employee Assistance Program
- Time Off/Leave (PTO, Vacation or Sick Leave)