Kubernetes Security Engineer

OpenKyber LLC
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote

Tech stack

Kubernetes Security
API
Artificial Intelligence
Amazon Web Services (AWS)
Azure
Cloud Computing
Software Debugging
Distributed Systems
Fault Tolerance
Python
Data Streaming
Systems Integration
TypeScript
Google Cloud Platform
Large Language Models
Caching
Backend
Kubernetes
Docker

Job description

  • Built multi-step agentic workflows with tool use and function calling

  • Experience with agent orchestration frameworks (LangGraph, CrewAI, or custom)

  • Built guardrails, fallbacks, or graceful degradation for AI systems

  • Streaming inference and async agent orchestration

  • Cost/latency optimization: caching, batching, prompt compression

  • ML observability tools: Langfuse, Arize, Braintrust, W&B

  • Retrieval systems (vector search, hybrid search) as a tool, not the focus

Screening Questions for Candidates

  • "Describe a production AI agent or skill system you built. What broke and how did you fix it?"

  • "Have you built MCP servers/integrations or custom tool-use systems for LLMs?"

  • "How do you evaluate whether an LLM-based feature is working well? What makes this hard?"

  • "Walk me through how you'd deploy and scale an AI service on Kubernetes."

Not a Fit If

  • Primarily a model trainer/fine-tuner (we're not training models)

  • AI experience is mainly academic, research, or tutorial-based

  • No production systems experience (only notebooks/demos)

  • Looking for entry-level role with heavy mentorship

  • Background is primarily data science/analytics rather than engineering

  • "Architects" who don't write or deploy code themselves

For applications and inquiries, contact: hirings@openkyber.com

Requirements

Do you have experience in scalable systems?

  • Backend/Systems Experience: 3+ years building production backend or distributed systems (pre-AI experience required)

  • Production AI Systems: Has shipped AI/LLM features serving real users at scale, not just prototypes or demos

  • Agentic Systems: Has built AI agents, skills, tools, or MCP (Model Context Protocol) integrations

  • Python: Proficient for backend development

  • Secondary Language: Working knowledge of Go, TypeScript, or Rust

  • Cloud Infrastructure: Deep experience with AWS/Google Cloud Platform/Azure, including cost optimization and compute decisions, not just deployment

  • Container & Orchestration: Hands-on with Docker and Kubernetes; can build, deploy, debug, and scale services themselves

  • LLM Integration: Understands token economics, context limits, rate limiting, structured outputs, API failure modes

  • LLM Evaluation: Understands how to evaluate LLM outputs and the inherent challenges (non-determinism, quality measurement, regression detection)

  • Hands-On Engineer: Not just an architect; writes code, debugs production issues, deploys their own work

Preferred / Differentiators
