Senior DevOps / Platform Engineer - Runtime Validation & Governance
Job description
As a Senior Agentic DevOps Engineer - Runtime Validation & Governance, you will design and implement the runtime validation, control, and governance layer that determines how AI-generated logic is tested, verified, and promoted into production.
This is not a traditional DevOps role. You will work on advanced deployment and validation pipelines where autonomous agents, simulations, and safety controls are first-class citizens. Your work ensures that AI-driven orchestration remains explainable, reversible, traceable, and auditable before impacting live engineering systems.
- Design sandboxed and containerised execution environments for testing AI-generated logic prior to deployment
- Build and maintain multi-stage validation and release pipelines (sandbox validation → gated staging → monitored production)
- Define runtime health metrics, confidence thresholds, and promotion gates for autonomous agent outputs
- Develop regression and continuous verification frameworks to detect drift, instability, or unexpected behaviours in agent-driven workflows
- Implement telemetry, observability, and traceability pipelines capturing agent decisions, validation results, and anomalies
- Design snapshotting and rollback mechanisms to safely restore verified system states
- Ensure auditability, reproducibility, and governance of all agent-generated artefacts
- Collaborate closely with AI researchers, platform architects, safety engineers, and UI teams
Requirements
- 5+ years of experience in DevOps, platform engineering, systems engineering, or high-assurance infrastructure
- Strong experience designing CI/CD pipelines for complex, production-grade systems
- Proven background in sandboxing, controlled deployments, validation pipelines, or safety-critical systems
- Strong programming skills in Python and C or C++
- Experience with runtime monitoring, telemetry, rollback, and failure recovery mechanisms
- Solid understanding of distributed systems and production reliability
Desired Qualifications
- Experience with AI agent frameworks (e.g. LangGraph, LangChain, or similar)
- Exposure to simulation environments, HPC systems, or digital twins
- Knowledge of regression testing strategies, differential validation, or confidence-based promotion models
- Background in autonomous systems, robotics, or AI governance
- Experience working in regulated, high-assurance, or safety-critical environments