Senior Cloud Services Engineer - Plex
Job description
We are looking for a Senior Cloud Services Engineer with a focus on Kubernetes and automation to join our Plex Cloud Operations team. You will support the application tier in both our private and public cloud data centers. You will maintain and help scale our Kubernetes-based platform to ensure high availability, security, and performance. You will work closely with platform, development, security, and infrastructure teams to automate operations and improve multi-cluster management. You will also participate in an on-call rotation to support critical operations. You will report to the Cloud Operations Manager.
Your Responsibilities:
- Maintain and improve our Kubernetes platform, ensuring high availability and scalability.
- Implement infrastructure/configuration as code to automate operations (Terraform, Ansible, Helm, Flux, Kustomize).
- Enhance observability and logging using OpenTelemetry and Elastic.
- Build automated solutions that enable resiliency and self-healing of applications.
- Manage server operating systems (Windows and Linux).
- Manage web servers (IIS 10).
- Troubleshoot production incidents, perform root cause analysis, and drive reliability improvements.
- Evaluate and implement cloud-native technologies to enhance platform efficiency.
- Collaborate with security teams to ensure best practices for container security and compliance.
- Work with multi-cluster management solutions such as Rancher, Cluster API (CAPI), or other Kubernetes fleet management tools.
- Manage Kubernetes infrastructure on Azure and vSphere.
- Participate in an on-call rotation to support platform operations and respond to incidents.
Requirements
- Bachelor's Degree or equivalent years of relevant work experience.
- Legal authorization to work in the U.S. We will not sponsor individuals for employment visas, now or in the future, for this job opening.
The Preferred - You Might Also Have:
- 5+ years of relevant professional experience in a cloud infrastructure, platform engineering, or operations role.
- 3+ years managing multi-cluster Kubernetes environments (Rancher and Cluster API).
- Hands-on experience with Azure and vSphere as Kubernetes infrastructure providers.
- Experience with Linux administration and container runtimes (Docker, containerd).
- Solid understanding of RBAC, security policies, and secrets management in Kubernetes.
- Proficiency with Terraform and Ansible.
- Familiarity with observability tools (OpenTelemetry, Elastic, PRTG, and Dynatrace).
- Public cloud experience (Microsoft Azure or Amazon Web Services).
- Knowledge of .NET website functionality.
- Load balancer experience (F5 LTM, Azure Load Balancer).
- Understanding of IPv4/IPv6, FTP, HTTP, SSL/TLS, HTML, and XML.
- The ability to participate in an on-call rotation for platform support.
- Prior experience in SRE or Platform Engineering roles.
- Degree in Computer Science or related area.
Benefits & conditions
What We Offer:
- Health Insurance, including Medical, Dental, and Vision
- 401k
- Paid Time Off
- Parental and Caregiver Leave
- Flexible Work Schedule: you will work with your manager to arrange a schedule that fits your personal life.
- To learn more about our benefits package, please visit www.raquickfind.com.
This position is part of a job family. Experience will be the determining factor for position level and compensation.