Switching Software Customer Engineering Escalations
Job description
The Customer Engineering Escalation Engineer (CEE) is a senior individual contributor who leads the most complex and business-critical customer escalations, with deep expertise in switching and data center networking.
In this role, you serve as the technical lead during customer-impacting incidents, driving mitigation, coordinating cross-functional teams, and communicating clearly with internal and external stakeholders. Success requires strong networking fundamentals, sound judgment under pressure, and a disciplined, layer-by-layer approach to troubleshooting across modern data center environments.
This role will require being on-site in the office 2+ days a week.
1. Escalation Ownership & Triage
- Own end-to-end technical execution for assigned customer escalations, from intake through stabilization and resolution.
- Perform rapid severity assessment, impact analysis, and technical triage to determine urgency and scope.
- Identify whether issues represent product defects, environmental interactions, configuration risks, or operational failures, and route appropriately.
- Drive clarity and momentum in ambiguous, fast-moving situations.
- Coordinate internal teams and partners to accelerate investigation and resolution.
- Navigate complex platforms end-to-end and drive targeted technical analysis to accelerate resolution.
2. Deep Technical Investigation
- Analyze logs, packet captures, configs, and telemetry to isolate root cause, with emphasis on switching/data-center fabric troubleshooting (physical/link through Layer-2/Layer-3 forwarding and control-plane behavior).
- Reproduce issues where possible and validate hypotheses with targeted tests across interfaces, bridging domains, routing adjacencies, and fabric/overlay paths.
- Separate symptoms from root cause under pressure, including issues involving overlays, multicast traffic patterns, and switch port security/segmentation controls.
- Work effectively in single-vendor and multi-vendor environments.
- Understand product design and quality practices (design docs, test plans, QA signals, and product lifecycle).
3. Customer Stabilization & Risk Mitigation
- Define and communicate workarounds, mitigations, or containment strategies to reduce customer impact while root cause analysis proceeds.
- Make risk-based recommendations when no perfect option exists, including rollback or partial remediation strategies.
- Own outcomes and follow through on commitments, including after-hours decision-making when required.
4. Cross-Functional Leadership
- Lead coordination across TAC, Engineering (R&D), QA, Product Management, Sales, and Executive stakeholders.
- Maintain neutral, evidence-driven leadership when narratives conflict across vendors or teams.
- Ensure the correct level of urgency, visibility, and accountability is applied to each escalation.
5. Communication & Executive Presence
- Communicate complex technical issues clearly to varied audiences, including senior customer leadership and internal executives.
- Provide concise, accurate status updates, escalation summaries, and recommendations without speculation or defensiveness.
- Represent the company with professionalism and confidence during high-pressure discussions.
6. Documentation & Continuous Improvement
- Produce high-quality written artifacts including escalation summaries, root cause analyses, and post-mortems.
- Identify recurring themes and feed actionable insights back to engineering and product teams to reduce future escalations.
- Contribute to knowledge bases, templates, and process improvements within the escalation function.
- Improve escalation workflows by adopting and integrating new tools, automation, and technical approaches.

- Customer impact is stabilized quickly through clear technical leadership, effective triage, and decisive mitigation.
- Escalations are managed with clarity, trust, and technical credibility across customers, partners, and internal stakeholders.
- Engineering effort is focused on the right problems with the right data, enabling faster root cause analysis and resolution.
- Lessons learned are captured and translated into sustained product, process, and quality improvements.
What We Can Offer You:
Health & Wellbeing
We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial and emotional wellbeing.
Requirements
- 6+ years of hands-on experience in production switching and data center networking environments.
- Demonstrated ability to troubleshoot network incidents end-to-end using logs, packet captures, telemetry, and diagnostics, including physical and link validation as well as Layer-2 and Layer-3 fault isolation.
- 3+ years of experience troubleshooting modern data center fabrics and overlays, including EVPN/VXLAN-class designs, with familiarity across control-plane and data-plane interactions and multicast traffic patterns in production environments.
- Working knowledge of engineering tools and environments, including Linux VMs, Visual Studio, Jira/Confluence, Python, and Bash.
Escalation & Operational Skills
- 6+ years of experience leading high-severity, customer-impacting incidents or escalations.
- Demonstrated ability to make sound technical and risk-based decisions in ambiguous, time-sensitive situations with incomplete data.
- Demonstrated ownership, accountability, and follow-through in high-pressure situations.
Communication & Judgment
- Strong verbal and written communication skills, with the ability to communicate effectively with both technical and non-technical audiences.
- Ability to translate complex technical issues into clear, concise, and actionable guidance for senior stakeholders.
- Demonstrated composure, clarity, and professionalism during high-pressure or emotionally charged situations.

- Experience partnering directly with Engineering, QA, and Product teams to investigate product defects and drive technical resolution.
- Ability to read code and perform targeted technical analysis to support troubleshooting and engineering escalation.
- Background in customer-facing escalation, site reliability engineering, support engineering, or Tier-4 technical support roles.
- Experience contributing to post-mortems, root cause reviews, and systemic quality improvements.
Work Environment & Expectations
- This role involves sustained computer-based work, high cognitive load, and frequent context switching across complex technical issues.
- May require participation in critical incidents outside standard business hours on an as-needed basis.
- Success in this role requires comfort with ambiguity, strong accountability, and the ability to prioritize effectively under pressure.
Benefits & conditions
The expected salary/wage range for this position is provided below. Actual offer may vary from this range based upon geographic location, work experience, education/training, and/or skill level.
- United States of America: Annual Salary USD 136,500 - 276,500 in California.
The listed salary range reflects base salary. Variable incentives may also be offered.