Cloud Site Reliability Engineer (SRE)
Job description
We are seeking a highly skilled Cloud Site Reliability Engineer (SRE) with deep expertise in Infrastructure as Code (IaC) and large-scale VMware environments to play a critical role in the enterprise adoption of VMware Cloud Foundation (VCF). This role focuses on designing, automating, and operating a scalable internal cloud platform built on VCF 9.x, vSphere, NSX-T, Aria Suite, and the OpenShift Container Platform (OCP) on vSphere. The Cloud SRE will help establish a fully automated, repeatable, and secure software-defined environment that can be consistently deployed across teams, enabling modern application delivery and accelerating cloud standardization across the organization.
- Design, automate, and operate a scalable internal cloud platform built on VCF, vSphere, NSX-T, Aria Suite, and OpenShift
- Establish a fully automated, repeatable, and secure software-defined environment
- Enable modern application delivery and accelerate cloud standardization across the organization
- Play a critical role in the enterprise adoption of VMware Cloud Foundation (VCF)
Requirements
- 7+ years of hands-on experience with large-scale VMware environments, including VMware Cloud Foundation (VCF) 9.x, vSphere, NSX-T, and Aria Suite (formerly vRealize)
- Proven experience building and operating OpenShift Container Platform (OCP) on vSphere
- Experience integrating IaC into CI/CD pipelines and automated workflows
- Deep understanding of cloud platform operations, monitoring, and troubleshooting
- Demonstrated ability to work in Agile environments and collaborate across engineering, operations, and development teams
- Experience with Red Hat OpenShift, Python, and Ansible
Desired skills:
- Understanding of ITIL processes is a plus
- Hands-on product experience with Cohesity, NetBackup, or Avamar
- Version control (Git/Bitbucket)
- 5+ years of software development experience with some exposure to network programming and networking protocols
- Solid understanding of network concepts, load balancers, routing, subnets, and firewalls