Operations Engineer (Kubernetes/OpenShift)
Job description
We are seeking a detail-oriented Operations Engineer to support the day-to-day operations of our Kubernetes and OpenShift platforms. In this role, you'll work hands-on with container orchestration technologies, ensure platform stability, support incident response, and execute operational procedures that keep our infrastructure secure, reliable, and compliant.
This position is ideal for engineers who enjoy working in production environments, resolving technical issues, optimizing operational workflows, and collaborating closely with platform engineering teams.
What You'll Do
Platform & Infrastructure Operations
- Support routine operations across Kubernetes and OpenShift clusters, including deployments, upgrades, cluster builds, node management, and maintenance activities.
- Develop and interpret detailed specifications for complex infrastructure systems, and contribute to the design, testing, and validation of technical solutions.
Incident & Problem Management
- Participate in incident response, root cause analysis, and issue triage.
- Identify and address recurring issues, contributing to long-term remediation strategies to enhance system reliability and reduce recovery time.
Monitoring & Reliability
- Monitor platform dashboards, alerts, and logs, escalating issues based on operational standards and runbooks.
- Collaborate with the team to identify problematic trends and proactively implement improvements.
Automation & Tooling
- Contribute to operational automation using scripting languages such as Python or Bash to reduce manual processes.
- Assist with the development and testing of automation that improves platform efficiency and consistency.
Security, Compliance & Controls
- Review and analyze solutions related to cloud security, secrets management, and key rotation processes.
- Follow established change control practices, operational guidelines, and security policies.
- Direct daily risk and control flows, ensuring adherence to work standards and procedures.
Collaboration & Agile Delivery
- Work closely with peers, senior engineers, and cross-functional partners to resolve issues and achieve operational goals.
- Participate in Agile development practices, including coding, testing, debugging, and documenting updates to operational tooling and processes.
Requirements
- 4+ years of experience in Technology Infrastructure Engineering or Solutions, or equivalent (work experience, military training, or education).
- 3+ years of experience in Systems Operations, IT Operations, or Platform Support.
- 2+ years of hands-on experience with Kubernetes, OpenShift, or other container orchestration platforms.
- 2+ years of experience with Linux system operations.
- 2+ years of experience with containerization tools such as Docker.
Desired Qualifications
- Strong interest in cloud operations, platform reliability, and infrastructure automation.
- Exposure to monitoring and observability tools (e.g., Grafana, Prometheus, Splunk).
- Experience with version control and workflow tools such as Git and Jira.
- Familiarity with automation scripting (Python, Bash, or similar).
- Strong troubleshooting, analytical, and documentation skills.