Operations Engineer (Kubernetes/OpenShift)

THE JUDGE GROUP, INC.
Charlotte, United States of America
1 month ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Intermediate
Compensation: $119K

Tech stack

Agile Methodologies
JIRA
Bash
Cloud Computing
Cloud Computing Security
Software Debugging
Linux
Monitoring of Systems
Information Technology Operations
Python
Key Management
OpenShift
Prometheus
Workflow Management Systems
Scripting (Bash/Python/Go/Ruby)
System Availability
Grafana
Software Troubleshooting
Reliability of Systems
Git
Containerization
Kubernetes
Infrastructure Automation Frameworks
Splunk
Software Version Control
Docker

Job description

We are seeking a detail-oriented Operations Engineer to support the day-to-day operations of our Kubernetes and OpenShift platforms. In this role, you'll work hands-on with container orchestration technologies, ensure platform stability, support incident response, and execute operational procedures that keep our infrastructure secure, reliable, and compliant.

This position is ideal for engineers who enjoy working in production environments, resolving technical issues, optimizing operational workflows, and collaborating closely with platform engineering teams.

What You'll Do

Platform & Infrastructure Operations

  • Support routine operations across Kubernetes and OpenShift clusters, including deployments, upgrades, cluster builds, node management, and maintenance activities.
  • Develop and interpret detailed specifications for complex infrastructure systems, and contribute to the design, testing, and validation of technical solutions.

Incident & Problem Management

  • Participate in incident response, root cause analysis, and issue triage.
  • Identify and address recurring issues, contributing to long-term remediation strategies to enhance system reliability and reduce recovery time.

Monitoring & Reliability

  • Monitor platform dashboards, alerts, and logs, escalating issues based on operational standards and runbooks.
  • Collaborate with platform engineering teams to identify recurring problem trends and proactively implement improvements.

Automation & Tooling

  • Contribute to operational automation using scripting languages such as Python or Bash to reduce manual processes.
  • Assist with the development and testing of automation that improves platform efficiency and consistency.

Security, Compliance & Controls

  • Review and analyze solutions related to cloud security, secrets management, and key rotation processes.
  • Follow established change control practices, operational guidelines, and security policies.
  • Direct daily risk and control flows, ensuring adherence to work standards and procedures.

Collaboration & Agile Delivery

  • Work closely with peers, senior engineers, and cross-functional partners to resolve issues and achieve operational goals.
  • Participate in Agile development practices, including coding, testing, debugging, and documenting updates to operational tooling and processes.

Requirements

  • 4+ years of experience in Technology Infrastructure Engineering or Solutions, or equivalent experience demonstrated through work experience, military training, or education.
  • 3+ years of experience in Systems Operations, IT Operations, or Platform Support.
  • 2+ years of hands-on experience with Kubernetes, OpenShift, or other container orchestration platforms.
  • 2+ years of experience with Linux system operations.
  • 2+ years of experience with containerization tools such as Docker.

Desired Qualifications

  • Strong interest in cloud operations, platform reliability, and infrastructure automation.
  • Exposure to monitoring and observability tools (e.g., Grafana, Prometheus, Splunk).
  • Experience with version control and workflow tools such as Git and Jira.
  • Familiarity with automation scripting (Python, Bash, or similar).
  • Strong troubleshooting, analytical, and documentation skills.
