Senior Cloud Engineer
Job description
We are seeking a Senior DevOps Engineer to lead the technical implementation of our Azure Enterprise Landing Zones and AI-ready infrastructure. You will bridge the gap between core cloud architecture and MLOps, ensuring that our AI/ML workloads, from Azure OpenAI to custom models, are deployed onto a secure, high-performance, and fully automated foundation.
- Azure Architecture & Landing Zones
Landing Zone Implementation: Deploy and manage scalable Azure Landing Zones, ensuring enterprise-grade governance, subscription organization, and resource hierarchy.
Networking & Security: Architect secure Azure networking (VNet, VNet peering, Private Link, hub-and-spoke) and implement robust security guardrails via Azure Policy and Microsoft Entra ID (formerly Azure Active Directory).
- Containerization & Orchestration
AKS & Kubernetes: Act as the subject matter expert for Azure Kubernetes Service (AKS), managing cluster lifecycles, namespaces, and pod security standards.
Docker Expert: Build, optimize, and secure Docker images for microservices and AI model serving.
Helm Mastery: Utilize Helm Charts for consistent, version-controlled application deployments.
- Infrastructure as Code (IaC) & Automation
Terraform Mastery: Develop and maintain modular, enterprise-scale Terraform code to ensure "Everything as Code" for both IaaS (VMs, Network) and PaaS (APIM, Event Hubs).
CI/CD Governance: Build and optimize sophisticated pipelines using Azure DevOps and GitHub Actions, integrating security scanning and automated testing.
- AI & MLOps Integration
AI Workloads: Provision and scale infrastructure for Azure Machine Learning and OpenAI services, specifically managing GPU node pools and model monitoring.
MLOps Pipelines: Implement deployment workflows for AI models, focusing on model performance tracking and automated drift detection.
- Observability & Operations
Monitoring: Lead instrumentation of all environments using Azure Monitor, Log Analytics, and Application Insights.
FinOps: Monitor and optimize cloud spend with custom cost-tracking and alerting for high-compute AI resources.
Requirements
- 6+ Years in DevOps/Cloud: Deep experience with Azure IaaS and PaaS.
- IaC Specialist: Advanced proficiency in Terraform for multi-region deployments.
- K8s Expert: Hands-on experience with Docker, Kubernetes (AKS), and ingress controllers.
- Automation Lead: Expert in Azure DevOps and/or GitHub Actions for CI/CD.
- Networking Guru: Strong understanding of Azure VNet, Firewall, and Load Balancing.
- AI Aware: Exposure to deploying and managing AI/ML workloads on Azure.