Kubernetes Security Engineer

OpenKyber LLC
Morton Township, United States of America
11 days ago

Role details

Contract type
Temporary contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$ 119K

Job location

Remote
Morton Township, United States of America

Tech stack

Kubernetes Security
API
Artificial Intelligence
Amazon Web Services (AWS)
JIRA
Azure
Bash
Cloud Computing
Databases
Linux
DevOps
Monitoring of Systems
Information Technology Operations
Python
Knowledge Management
Network Security
NoSQL
PowerShell
Cloud Services
Prometheus
Runbook
SQL Databases
Datadog
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
Enterprise Software Applications
Load Balancing
Cloud Platform System
Chatbots
Grafana
Apigee
Kubernetes
Kafka
Hardware Infrastructure
Splunk
ServiceNow

Requirements

Do you have experience in technical troubleshooting support?

Description of Services: SRE Operations Engineer

The L1 SRE is the first line of defense in monitoring, triaging, and executing standardized operational tasks for all enterprise applications running on standard patterns and platforms such as Kubernetes, APIs, WAF, databases, API proxies (Gloo, Apigee), Kafka, and cloud (AWS/Azure/Google Cloud Platform). They will follow runbooks, leverage automation, and escalate appropriately to minimize downtime.

Skills

Mandatory Skills (Must-Have):

  • System & Infrastructure Monitoring. Expectation: Ability to use monitoring dashboards (e.g., Grafana, Datadog, Splunk, Argos, AIOps) to identify anomalies, follow alert workflows, and escalate when thresholds are breached.
  • Runbook Execution. Expectation: Strictly follow documented steps to resolve standard incidents, and escalate when steps do not apply or fail.
  • Incident Triage & Communication. Expectation: Perform first-line triage of alerts, gather logs/metrics, categorize severity, and notify stakeholders in clear, concise language.
  • Kubernetes (Cloud or on-prem) operations knowledge. Expectation: Ability to check pod status, understand logs, and verify service endpoints using kubectl and monitoring tools.
  • Scripting (Python, Bash, PowerShell). Expectation: Able to read and make small edits to scripts to automate repetitive checks.
  • Networking & Security Awareness. Expectation: Understand basic troubleshooting tools (ping, netstat, curl, traceroute) and know when issues may be related to a firewall, WAF, or proxy.
  • Documentation & Knowledge Capture. Expectation: Accurately record steps taken during incidents and suggest runbook updates where gaps exist.

Preferred Skills (Nice-to-Have):

  • Cloud Platform Familiarity (AWS, Azure, Google Cloud Platform). Expectation: Understand the basics of cloud services (VMs, load balancers, storage) and how to navigate a cloud console.
  • Database Basics (SQL/NoSQL). Expectation: Run simple queries to validate DB connectivity and health.
  • Automation & Self-Service Mindset. Expectation: Identify repetitive manual steps and propose candidates for automation.
  • Exposure to Incident Management Tools (xMatters, ServiceNow, Jira, etc.). Expectation: Comfortable working within ITSM/incident workflows.
  • AI/Chatbot-Assisted Ops (emerging skill). Expectation: Use AI assistants to search runbooks or suggest remediation steps.

Qualifications

  • 2-5 years in an IT operations, NOC, or SRE/DevOps engineer role.
  • Kubernetes 101, Linux 101, Networking 101.
  • Understanding of cloud-ready applications.
  • Understanding of observability tools (Prometheus, Grafana, ELK, Splunk, etc.).
  • Strong troubleshooting mindset and ability to follow structured workflows, e.g., 5 Whys and Fishbone.

Deliverables

  • Monitor system health, alerts, dashboards, and logs across cloud and on-prem infrastructure.
  • Isolate functional issues with the application versus the platform.
  • Execute standardized runbooks for incident resolution, deployments, and routine tasks.
  • Perform initial triage of incidents and escalate to L2/L2+ as needed to mitigate the issue and get to a bypass.
  • Document new issues, gaps in runbooks, and automation opportunities.
  • Provide excellent communication to stakeholders during incidents.
  • Support onboarding of new applications into the operations framework.
