Senior Platform Engineer
Job description
A key hands-on contributor within the Platform Engineering function, responsible for designing, building, and operating secure, scalable, and reusable platform capabilities that enable product and delivery teams to move fast and safely.
This role focuses on cloud-native platform engineering: building self-service infrastructure, standardised deployment patterns, and reliable runtime platforms on Azure. You will work closely with software, data, and ML teams to provide guardrails, automation, and operational excellence across the full system lifecycle.
Platform & Cloud Engineering
- Design, build, and operate shared platform services on Microsoft Azure, with a strong focus on AKS-based runtime platforms.
- Develop and maintain reusable infrastructure and platform modules (Terraform, Helm) that enable safe self-service by application, data, and ML teams.
- Own platform components end-to-end (build-run), including availability, performance, security, cost, and operability.
- Define and evolve platform standards and golden paths for workload onboarding, networking, identity, secrets, observability, and scaling.
Automation & Infrastructure as Code
- Implement infrastructure and platform changes using Infrastructure as Code (Terraform) and GitOps-style workflows.
- Contribute to and improve CI/CD pipelines using Azure DevOps and GitHub Actions, including automated validation, approvals, and rollbacks.
- Reduce manual operational effort through automation, self-service tooling, and opinionated defaults.
Reliability, Security & Operations
- Ensure platform services meet agreed SLOs, security controls, compliance requirements, and resilience standards.
- Implement and maintain observability using New Relic, including metrics, logs, alerts, and dashboards.
- Lead or support investigation and resolution of complex platform incidents; contribute to post-incident reviews and systemic improvements.
- Design and validate backup, recovery, and disaster recovery mechanisms for platform-managed services.
Developer, Data & ML Enablement
- Enable application teams with reliable deployment and runtime patterns (e.g. ingress, service mesh, secrets, scaling, CI/CD integration).
- Support DataOps and MLOps workloads by providing secure, scalable platforms for data pipelines, model training, and inference.
- Collaborate with data and AI teams on services such as Microsoft Fabric, data platforms, and ML lifecycle tooling.
Collaboration & Technical Leadership
- Act as a senior technical partner to software engineers, data engineers, and ML engineers.
- Provide technical input into solution and service design, ensuring operational and platform considerations are addressed early.
- Mentor and support junior and mid-level engineers, sharing platform engineering best practices.
- Contribute to continuous improvement of platform processes, documentation, and standards.
Requirements
- Strong experience in cloud engineering, with hands-on expertise in Microsoft Azure.
- Production experience operating Azure Kubernetes Service (AKS) at scale.
- Advanced use of Terraform for infrastructure and platform provisioning (modules, state management, governance).
- Strong working knowledge of Helm for Kubernetes workload and platform configuration.
- Experience building and operating CI/CD pipelines using Azure DevOps and/or GitHub Actions.
- Solid understanding of cloud networking, identity (Azure AD / Managed Identity), private endpoints, and security controls.
- Experience designing self-service platforms or reusable infrastructure patterns.
- Strong understanding of reliability engineering, incident management, and change control in production environments.
- Practical experience with observability and monitoring, ideally with New Relic or equivalent tools.
- Ability to balance delivery speed with risk management, stability, and long-term sustainability.
- Clear communicator, able to explain technical trade-offs and risks to both technical and non-technical stakeholders.
- Comfortable working autonomously within a defined platform scope.
- Strong sense of ownership and accountability for platform outcomes.
Desirable
- Exposure to DataOps or MLOps practices, tooling, or platforms.
- Experience supporting data pipelines, lakehouse architectures, or ML workloads on cloud platforms.
- Familiarity with Microsoft Fabric, ML pipelines, model deployment, or feature stores.
Interested?
If you are passionate about leveraging technology to transform regulatory compliance and meet the qualifications outlined above, we invite you to apply. Please submit your resume detailing your relevant experience and interest in CUBE.