Senior Kubernetes Engineer
Job description
As a Senior Kubernetes Engineer, you will collaborate with both your team and customer engineers to administer, monitor, and manage all customer Kubernetes clusters running on Amazon EKS across DTAP environments. You will ensure the clusters' stability, security, and scalability while driving automation, supporting incident response, and enabling continuous improvement of infrastructure and processes.
What will you do
Responsibilities vary, but may include:
- Collaborate closely with your team and customer engineering teams to ensure smooth operation and alignment across all Kubernetes-related initiatives.
- Perform administration, maintenance, and troubleshooting of all customer Kubernetes clusters deployed on Amazon EKS across DTAP (Development, Testing, Acceptance, Production) environments.
- Implement, monitor, and maintain cluster health, performance, and reliability, proactively identifying and resolving issues to minimise downtime.
- Manage and execute change management processes for Kubernetes clusters, including upgrades, scaling, and configuration changes, while ensuring compliance with internal policies and best practices.
- Develop, maintain, and optimise monitoring, alerting, and logging solutions for Kubernetes infrastructure using industry-standard tools.
- Automate routine administrative tasks and operational workflows using scripting or Infrastructure-as-Code (IaC) solutions.
- Maintain comprehensive documentation for cluster setups, architectures, processes, and change histories, ensuring knowledge sharing within the team.
- Enforce security best practices, ensuring clusters are compliant with organisational standards and external regulations.
Requirements
- Minimum of 2 years of hands-on experience managing Kubernetes cluster lifecycles, including cluster provisioning, upgrades, vertical and horizontal auto-scaling, high availability setups, container networking configurations, security best practices, and disaster recovery strategies.
- Proven experience operating Kubernetes in production environments on public clouds (especially AWS), covering services such as IAM (including RBAC), VPC and subnet networking, multi-account organisation, and observability implementations.
- Strong proficiency with Infrastructure as Code (IaC), Git and GitOps workflows, and self-hosted CI/CD pipelines.
Nice to have
- Familiarity with tools such as Karpenter, Cilium, Kyverno, or External Secrets Operator (ESO).
- Experience developing custom Kubernetes Operators.
- Hands-on experience with observability tools such as Datadog or Grafana.
- Programming and scripting proficiency in Bash, Go, or Python.