SRE - Azure Virtual Desktop - INTL India

Insight Global
Irvine, United States of America
14 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Irvine, United States of America

Tech stack

Azure
Virtual Desktops
PowerShell
Reliability Engineering
Systems Integration
Data Logging
Cloud Platform System
Cloud Integration

Job description

We are seeking an experienced Site Reliability Engineer (SRE) with deep expertise supporting Azure Virtual Desktop (AVD) environments at enterprise scale. This individual will be responsible for ensuring the reliability, performance, observability, and cost efficiency of a large AVD platform supporting 600+ users across shared compute environments.

Own the reliability, availability, and performance of enterprise-scale Azure Virtual Desktop (AVD) environments supporting 600+ users.

Deploy, maintain, and optimize shared pooled compute models.

Proactively monitor, troubleshoot, and resolve AVD platform issues across compute, networking, storage, and identity.

Implement observability (monitoring, logging, alerting) and lead incident response and root-cause analysis.

Drive automation and continuous improvement to reduce manual effort and increase stability.

Manage AVD cost optimization, balancing performance, reliability, and cloud spend.

Support FSLogix profile management and integrations with Zscaler and enterprise security standards.

Perform ongoing platform maintenance, patching, and lifecycle management.

Requirements

5+ years of experience in an SRE role supporting Azure Virtual Desktop in large-scale, enterprise environments.

Proven experience managing environments with large user populations across shared compute models.

Hands-on experience with FSLogix (profiles, storage performance, troubleshooting).

Experience with Zscaler cloud integrations in enterprise desktop or cloud environments.

Strong troubleshooting experience in Azure, including compute, networking, storage, and identity.

Proficiency with PowerShell and scripting for automation and operational efficiency.

Experience implementing monitoring, logging, and alerting in Azure-based environments.

Strong understanding of reliability engineering principles, operational excellence, and incident management.
