Cloud Support Engineer Lead (Azure/SRE)
Job description
We're hiring a Cloud Support Engineer Lead to own the reliability, observability, and operational health of business-critical applications running on Microsoft Azure. This is a hands-on leadership role - you won't be managing from a distance. You'll be in the work alongside your team, diagnosing hard problems, making judgment calls under pressure, and building the processes and standards that keep complex systems running well.
The environment is real and the stakes are real. You'll be working across APIs, distributed services, and cloud infrastructure that directly supports airline operations. When something breaks, people notice. That means we need someone who stays clear-headed when things get complicated, communicates honestly when they don't have the answer yet, and treats every incident as an opportunity to understand the system better.
You'll lead a support team while remaining a credible technical resource yourself. You'll partner closely with QA, development, architecture, and business stakeholders - which means you'll need to be as comfortable explaining a problem to a non-technical audience as you are diagnosing it in Azure Monitor. And you'll help define what good looks like for this team, not just maintain the status quo.
What you will do
- Lead the design and implementation of monitoring strategies - metrics, dashboards, and proactive alerting - across application and API health
- Analyze system behavior, observability data, and transaction flows to diagnose issues across complex, interconnected business processes
- Lead incident response: timely resolution, honest root cause analysis, and genuine continuous improvement - not just closing tickets
- Provide technical oversight and validation during deployments across development and production environments
- Coordinate the support team's daily activities, including prioritizing and assigning work, and develop the people on the team
- Establish and improve support processes, documentation standards, and operational procedures that will outlast any single incident
- Report on API performance, incident trends, and system health to both engineering and business stakeholders in language each audience can use
- Build and maintain deep domain knowledge of the applications, APIs, and Azure architecture you're supporting
- Collaborate across internal and external teams to resolve issues and close the gap between support and engineering
Tech stack
- Azure infrastructure
- Azure DevOps
- Automation and scripting
- Application Performance Management (APM)
- REST / SOAP APIs
- API testing tools
- Agile methodologies
- Git
Requirements
Bachelor's degree in Computer Science, Systems Engineering, Management Information Systems, or a related field with an emphasis in systems analysis, plus 5+ years in cloud infrastructure support, or 7+ years of equivalent experience combining education, training, and hands-on work.
- Microsoft Certified: Azure Fundamentals, Azure Administrator Associate, or Azure Developer Associate
- Azure Kubernetes Service (AKS) and Azure API Management
- Microsoft Power BI
- Relational databases (Oracle, SQL) and/or NoSQL databases (Azure Cosmos DB)
- Messaging technologies (Azure Service Bus)
- .NET, C#, Java
- TIBCO BusinessWorks
- Postman
- Dynatrace APM including DQL
- Azure Application Insights including KQL