Lead Cloud Platform Engineer / DevOps & SRE
Job description
- We are seeking a highly experienced Lead Cloud Platform Engineer with deep expertise across AWS, Google Cloud Platform (GCP), and Azure to design, build, and operate scalable, secure, and highly available cloud platforms. This role will lead cloud modernization initiatives, drive DevOps and SRE best practices, and partner closely with engineering, security, and architecture teams to enable high-performing, resilient systems.
- The ideal candidate brings hands-on technical leadership in cloud infrastructure, CI/CD automation, container platforms (Kubernetes/OpenShift), and enterprise data platforms, along with strong experience migrating legacy systems to cloud-native architectures.
Cloud Platform & Architecture
- Design and implement multi-cloud architectures across AWS, GCP, and Azure with a focus on scalability, availability, security, and cost optimization
- Lead application and platform migrations from legacy and PaaS platforms (e.g., PCF) to Kubernetes/OpenShift
- Architect and provision cloud infrastructure for high-volume, high-velocity OLTP and data-intensive workloads
- Define and enforce cloud governance, IAM, RBAC, network security, and compliance standards
DevOps, CI/CD & Automation
- Build, optimize, and maintain CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI, and related tools
- Automate build, test, security scanning, and deployment workflows for containerized and non-containerized workloads
- Implement Infrastructure as Code (IaC) using Terraform and related tooling
- Enable blue-green and zero-downtime deployment strategies across environments
Kubernetes & Container Platforms
- Act as SME for Kubernetes and OpenShift platforms
- Configure namespaces/projects, quotas, RBAC, ingress controllers, routing, and multi-tenancy
- Integrate secrets management (e.g., HashiCorp Vault) and container image scanning into CI/CD pipelines
- Partner with development teams to onboard applications and promote container best practices
Site Reliability Engineering (SRE) & Observability
- Implement monitoring and alerting solutions using Prometheus, Grafana, Splunk, and related tools
- Define and track SLIs/SLOs (uptime, latency, throughput, error rates)
- Proactively identify performance, scalability, and reliability issues and drive remediation
- Support production releases and post-deployment monitoring
Data & Enterprise Architecture
- Lead data architecture and platform engineering initiatives, including ETL pipelines and analytics platforms
- Establish best practices for data governance, data quality, metadata management, security, privacy, and DLP
- Support enterprise integration patterns, master data management, and regulatory compliance initiatives
Leadership & Collaboration
- Provide technical leadership and mentorship to DevOps, SRE, and platform engineering teams
- Collaborate cross-functionally with application engineering, security, networking, QA, and enterprise architecture
- Contribute to product roadmaps, platform strategy, and long-term cloud modernization plans
Requirements
- 10+ years of experience in cloud, DevOps, platform, or infrastructure engineering
- Strong hands-on expertise with AWS, GCP, and Azure
- Advanced experience with Kubernetes and OpenShift in enterprise environments
- Proven experience building and operating CI/CD pipelines at scale
- Strong proficiency in Linux/Unix systems, networking (TCP/IP, DNS), and distributed systems
- Experience with Terraform and Infrastructure as Code
- Solid scripting and automation skills using Python and Shell
- Deep understanding of security, IAM, compliance, and governance in cloud environments
- Experience leading large-scale enterprise cloud migrations
- Strong background in data platforms (BigQuery, Databricks, Spark, Airflow, dbt)
- Experience with monitoring, observability, and SRE practices
- Prior experience in financial services, fintech, retail, or regulated industries
- Agile/Scrum experience and strong stakeholder communication skills
Tools & Technologies
- Cloud: AWS, GCP, Azure
- Containers & Platforms: Kubernetes, OpenShift
- CI/CD & DevOps: Jenkins, GitHub Actions, Git, Terraform, Harness
- Monitoring & Logging: Prometheus, Grafana, Splunk
- Data & Analytics: Databricks, Spark, BigQuery, Airflow, dbt
- Programming/Scripting: Python, Shell
- Visualization & BI: Tableau, Power BI