Platform Engineer III
Job description
This engineer will operate at the intersection of cloud infrastructure, security, and operational excellence. We are looking for a senior individual contributor with deep Azure and Kubernetes experience who is comfortable owning critical services end-to-end, participating in a 24x7 on-call rotation, and partnering with application teams to keep the platform fast, secure, and reliable.
- Design, deploy, and operate AKS clusters that host production application workloads - including ingress modernization, cluster upgrades, and workload optimization.
- Own Azure infrastructure provisioning through Terraform (IaC), with strong discipline around module design, state management, peer review, and change controls.
- Build and operate GitOps-based deployment pipelines using ArgoCD, partnering with application teams to deliver safe, repeatable releases.
- Operate the platform's observability stack (Prometheus, Grafana, Loki, and alerting) and partner with application teams to drive Mean Time to Detect below 15 minutes.
- Participate in the 24x7 on-call rotation for platform services. Respond to incidents promptly, lead root cause analysis, and drive durable fixes back into the platform.
- Implement security controls in Azure (network segmentation, secrets management, audit logging, identity integration) appropriate for a regulated enterprise environment.
- Document operational runbooks, decision records, and platform standards so knowledge scales beyond any one engineer.
- Identify and reduce toil through automation - scripting, pipeline improvements, and self-service tooling.
Requirements
- 6+ years of hands-on experience architecting and operating workloads on Microsoft Azure, with deep familiarity across compute, networking, identity, storage, and security services.
- 3+ years of production experience with Azure Kubernetes Service (AKS) - workload design, ingress and service mesh patterns, cluster upgrades, autoscaling, RBAC, and troubleshooting.
- Strong Terraform skills, including module authoring, remote state, drift management, and CI-integrated plan / apply workflows.
- Production experience with ArgoCD (or comparable GitOps tooling) - Application and ApplicationSet design, sync policies, multi-cluster deployment patterns, and rollback strategies.
- Production experience with GitHub and GitHub Actions for source control and CI/CD - branching strategy, reusable workflows, secrets handling, and pipeline governance.
- Strong scripting and automation skills in Bash and Python (PowerShell and Azure CLI also useful given the Azure footprint).
- Demonstrated ability to diagnose complex production issues across the full stack - application, container, cluster, network, and cloud - and to drive a clean root cause analysis to closure.
- Participation in a 24x7 on-call rotation is mandatory for this role. Candidates must be able to acknowledge and respond to high-severity incidents promptly during their on-call shifts.
Nice to Have
- Experience operating under a formal change-management, SOC 2, SOX, or HIPAA-style control environment. Familiarity with audit evidence, peer review, and segregation-of-duties expectations.
- Experience with Auth0 (or comparable identity platforms - Okta, Entra ID B2C) in a multi-application, multi-tenant configuration.
- Hands-on experience with cloud-native observability tooling - Prometheus, Grafana, Loki, OpenTelemetry, or comparable stacks.
- Familiarity with Backstage or other internal developer platforms.
- Experience with Couchbase or other distributed NoSQL databases at production scale.
- Experience with PagerDuty as both a responder and a service configuration owner.
- Knowledge of the Palantir Foundry platform is a strong plus.
- Experience with healthcare data platforms or other regulated data environments.
Benefits & conditions
For this US-based position, the base pay range is $50,640.00 - $171,851.56 per year. Individual pay is determined by role, level, location, job-related skills, experience, and relevant education or training.
This job is eligible to participate in our annual bonus plan at a target of 10.00%.
The healthcare system is always evolving - and it's up to us to use our shared expertise to find new solutions that can keep up. On our growing team you'll find the opportunity to constantly learn, collaborate across groups and explore new paths for your career.
Our associates are given the chance to contribute, think boldly and create meaningful work that makes a difference in the communities we serve around the world. We go beyond expectations in everything we do. Not only does that drive customer success and improve patient care, but that same enthusiasm is applied to giving back to the community and taking care of our team - including offering a competitive benefits package. (http://go.r1rcm.com/benefits)