Site Reliability Engineer (Golden Signals Lead)

Zelis Healthcare
St. Petersburg, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

St. Petersburg, United States of America

Tech stack

ASP.NET
.NET
Amazon Web Services (AWS)
Azure
C Sharp (Programming Language)
Cloud Computing
DevOps
Monitoring of Systems
Performance Tuning
PowerShell
Reliability Engineering
Prometheus
Datadog
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
React
Grafana
Reliability of Systems
Containerization
Kubernetes
Information Technology
Performance Monitor
Front End Software Development
Splunk
New Relic (SaaS)
Docker
Web API

Job description

We are seeking a strategic and results-oriented Site Reliability Engineer (Golden Signals Lead) to define and drive the observability roadmap across all platforms. This role is responsible for establishing a consistent and scalable approach to monitoring and alerting, leveraging golden signals to enhance system reliability and operational efficiency. The successful candidate will collaborate closely with the ZEIT SRE team, engineering leads, and India-based resources to build a unified observability strategy aligned with organizational goals.

Observability Roadmap Development:

  • Define a unified vision for observability across all platforms, with golden signals as the foundation for monitoring and alerting.
  • Develop and maintain a comprehensive roadmap to improve observability, reduce tool redundancy, and standardize practices across platforms.
  • Establish and track key performance indicators (KPIs) to measure progress and ensure accountability for roadmap milestones.

Collaboration and Alignment:

  • Partner with the ZEIT SRE team and engineering leads to break down silos and promote consistent observability practices.
  • Drive cross-platform collaboration to reduce operational inconsistencies and define a 'north star' approach for observability.
  • Facilitate knowledge sharing to ensure alignment on current and future observability initiatives.

Monitoring and Alerting:

  • Standardize the implementation of golden signals across applications to improve system reliability and incident detection.
  • Optimize alerting tools and reduce redundant or ineffective monitoring interfaces ('panes of glass').
  • Lead efforts to enhance observability while minimizing operational overhead for platform teams.
  • Maintain and enhance observability dashboards, delivering actionable insights into application health and performance.

Operational Support and Improvement:

  • Identify and address gaps in existing observability practices, prioritizing long-term scalability and reliability.
  • Collaborate with India-based resources to execute observability build-outs efficiently and with high quality.
  • Reduce issues raised by clients, providers, and print facilities through proactive monitoring and early detection.

Reporting and Continuous Improvement:

  • Measure and report on observability success metrics, including actionable alert volume and reduced issue escalations.
  • Continuously evaluate and refine observability strategies based on stakeholder feedback and evolving organizational needs.

Requirements

  • Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience).
  • Minimum of 5 years of experience in Site Reliability Engineering, DevOps, or a related role with a strong focus on observability.
  • 5+ years of hands-on experience with .NET (C#), including advanced knowledge of ASP.NET Core, Web APIs, and performance optimization.
  • Demonstrated success in designing and implementing monitoring and alerting solutions across complex IT environments.

Technical Skills:

  • Deep understanding of SRE principles and golden signals for system monitoring.
  • Proficiency with observability tools such as Prometheus, Grafana, Splunk, New Relic, or Datadog.
  • Familiarity with cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).
  • Advanced proficiency in scripting languages such as PowerShell.
  • Experience in front-end development using React.js.
  • Advanced knowledge of .NET.

Soft Skills:

  • Strong leadership and collaboration abilities, with a proven ability to align diverse teams toward common goals.
  • Excellent analytical and problem-solving skills, with a proactive approach to identifying and resolving issues.
  • Clear and effective communication skills, capable of conveying technical concepts to stakeholders at all levels.

Preferred Qualifications:

  • Experience building observability roadmaps and scaling solutions in enterprise environments.
  • Certifications in cloud or DevOps-related disciplines (e.g., AWS Certified DevOps Engineer, Kubernetes Administrator).

About the company

Zelis is modernizing the healthcare financial experience across payers, providers, and healthcare consumers. We serve more than 750 payers, including the top five national health plans, regional health plans, TPAs and millions of healthcare providers and consumers across our platform of solutions. Zelis sees across the system to identify, optimize, and solve problems holistically with technology built by healthcare experts - driving real, measurable results for clients.

At Zelis, AI is woven into the fabric of how we work. Every associate is expected - and empowered - to partner with AI to challenge the status quo, accelerate innovation, and amplify their impact. This is a place for builders with a growth mindset who act with agility, embrace change, and use modern technology to shape smarter solutions, exceptional experiences, and the future of our industry for our clients, customers, and our culture.

A Little About You

You bring a unique blend of personality and professional expertise to your work, inspiring others with your passion and dedication. Your career is a testament to your diverse experiences, community involvement, and the valuable lessons you've learned along the way. You are more than just your resume; you are a reflection of your achievements, the knowledge you've gained, and the personal interests that shape who you are.

Zelis is headquartered in the U.S., with multiple locations across the country and in Hyderabad, India. Check out our locations to learn more about our offices. All employee work locations are based on the needs of the position and are determined by the Leadership team. In-office work and activities vary based on work and team objectives in accordance with Company policies.
