Principal Support Engineer
Job description
The Principal Support Engineer is a senior, hands-on technical expert responsible for driving rapid resolution of complex customer issues. This role blends deep technical troubleshooting, advanced log and API analysis, Datadog monitoring expertise, and leadership of high-severity escalations. The engineer will collaborate tightly with Engineering, Product, and external vendors to eliminate root causes, improve MTTR, and elevate the overall customer experience.
This is a high-visibility individual contributor role requiring strong ownership, urgency, and the ability to translate complex technical problems into clear, actionable paths forward for both customers and internal stakeholders.
Key Responsibilities
- Perform hands-on technical troubleshooting using Datadog (logs, traces, dashboards), API tools (Postman/cURL), and distributed log tracing.
- Lead high-severity and strategic customer escalations, providing authoritative technical direction and timely communication.
- Drive vendor ticket escalations, ensuring SLA adherence and proactive follow-ups with Microsoft, Adobe, AWS, Cisco, and others.
- Collaborate with Engineering to deliver root-cause fixes, submit detailed technical findings, and validate permanent resolutions.
- Partner with Product to identify platform gaps, recurring customer pain points, and areas for workflow or UX improvement.
- Analyze MTTR performance, SLA trends, and operational bottlenecks; publish weekly metrics and insights.
- Develop SOPs, escalation workflows, and troubleshooting guides that improve global support operations.
- Identify automation opportunities and collaborate with internal teams to enhance Zendesk workflows and self-service deflection.
Requirements
- 10+ years in Technical Support Engineering, Escalations, SRE, or related roles.
- Expertise with Datadog (log search, traces, monitors, dashboards).
- Strong REST API troubleshooting skills, including Postman, cURL, authentication flows, and JSON payload analysis.
- Experience diagnosing distributed systems, integrations, and SaaS platform behavior.
- Proven ability to interface with strategic enterprise customers and communicate complex technical issues clearly.
- Hands-on experience with vendor escalation processes and SLA governance.
- Strong working knowledge of MTTR, incident management, and technical support KPIs.
- Familiarity with Zendesk or similar ticketing platforms.
Performance Expectations & KPIs
- Resolve 90% of escalated tickets within SLA.
- Engage vendors within 15 minutes of SLA risk detection.
- Drive MTTR improvements of 40-50% for assigned ticket categories.
- Maintain 90% CSAT for escalated interactions.
- Identify and support permanent fixes for at least 2 recurring root-cause issues per quarter.
- Deliver 3-5 workflow, automation, or SOP improvements each quarter.