Public Cloud Senior Infrastructure Engineer
Job description
* Collaborate across cross-functional teams to architect, implement, and maintain a highly resilient and scalable Kubernetes environment in the cloud.
* Engineer and optimise Kubernetes infrastructure to support multitenant workloads, ensuring strong isolation, resource efficiency, and operational scalability.
* Implement and manage robust security controls, including OPA Gatekeeper policies and fine-grained RBAC policies, to safeguard infrastructure and enforce least-privilege access across environments.
* Build and manage CI/CD pipelines to enable automated testing, seamless deployment, and continuous integration across environments.
* Diagnose and resolve complex system-level issues related to scalability, performance, and automation, ensuring optimal infrastructure health.
Requirements
* Extensive experience in a DevOps or Site Reliability Engineering role, ideally across both consumer and SaaS technology landscapes.
* Proven expertise in deploying and maintaining production-grade Kubernetes clusters and services.
* Hands-on experience with Kubernetes (k8s) and containers in live environments.
* Strong background in designing and implementing CI/CD pipelines for automated build, test, and deployment workflows.
* Proficient in programming with Python, Go, and Bash for automation and tooling.
* Demonstrated ability to take ownership of projects and drive them to successful delivery.
* Skilled in writing and managing Infrastructure as Code (IaC) using tools such as Terraform.
* Experience in curating and managing the full product lifecycle of cloud-native core services.
Desirable Skills:
* Hands-on experience with cloud infrastructure and services across Google Cloud Platform (GCP)/Azure.
* Proficient in writing Infrastructure as Code (IaC) using Terraform, with a strong understanding of modular and reusable code practices.
* Experience with Service Mesh technologies such as Istio and Anthos for managing microservices communication and observability.
* Deep understanding of cloud networking concepts such as hybrid connectivity, VPN, NAT, IPAM, DNS, and routing.
* Comprehensive knowledge of Cloud Security, Key Management Service (KMS), Public Key Infrastructure (PKI), encryption, and the principles of least privilege.
* Deep understanding of Linux operating systems, including system internals, networking, and performance tuning.
* Exposure to high-throughput environments, with experience implementing observability stacks: Prometheus/Dynatrace for logging and metrics, and OpenTelemetry for distributed tracing.
* Strong security mindset with a track record of designing and implementing secure, resilient systems.
* Excellent verbal, written, and interpersonal communication skills, with the ability to convey complex technical concepts clearly.
* Comfortable operating in fast-paced, dynamic environments, with the ability to adapt quickly, embrace ambiguity, and remain effective through change.
* Understanding of shared services such as CoreDNS, cert-manager, Dynatrace, Cloudability, and Infoblox.
* Familiarity with Aqua Security for container runtime protection.
* Knowledge of OPA Gatekeeper for policy enforcement and tenant isolation.
* Experience with Harness CI/CD pipelines for secure and scalable deployments.
* Exposure to Backstage GitOps workflows for automation.
* Hands-on experience with Anthos Config Management for GitOps-driven provisioning.
* Understanding of Istio telemetry and observability integration.
* Proficiency in enforcing mTLS and managing sidecar injection in Istio service mesh.
* Experience with Istio ingress and egress gateways for secure service communication.