Integration Reliability Engineer

CLARYO, INC.
San Francisco, United States of America
16 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$150K - $170K

Job location

San Francisco, United States of America

Tech stack

Amazon Web Services (AWS)
Azure
Cloud Computing
Software Debugging
Linux
Distributed Systems
Virtual Private Networks (VPN)
Networking Basics
Cloud Services
Prometheus
WebRTC
Cloud Platform System
Grafana
Reliability of Systems
Event Driven Architecture
RTSP
Kubernetes
Kafka
Stream Processing
Data Pipelines

Job description

We're looking for an Integration Reliability Engineer to own the reliability of our system across cloud, edge, and real-world environments. Our platform runs across distributed infrastructure, connecting cloud services, on-site compute, and live video/data pipelines inside warehouses. This role is responsible for making systems observable, diagnosable, and repeatable as we scale across deployments. You'll work closely with engineering and deployment teams to ensure the system performs reliably in production, not just in ideal conditions.

  • Own reliability of systems across cloud (Kubernetes), edge compute, and on-site deployments
  • Build and maintain monitoring, alerting, and observability systems
  • Define and improve incident response, severity levels, and on-call processes
  • Improve deployment and bring-up workflows across facilities
  • Diagnose issues across infrastructure, networking, and distributed systems
  • Partner with engineering to identify root causes and prevent recurring issues
  • Improve system visibility, debugging, and operational tooling
  • Help make deployments repeatable and scalable across sites

Requirements

Do you have experience in technical troubleshooting support?

  • 3+ years of experience in SRE, infrastructure, or distributed systems
  • Strong Linux and networking fundamentals
  • Experience operating systems in production environments
  • Experience working with networking in constrained or distributed environments (e.g., VPNs, secure tunnels, on-site networking)
  • Experience with:
    • Kubernetes and containerized systems
    • Cloud platforms (GCP, AWS, or Azure)
    • Observability tools (Prometheus, Grafana, OpenTelemetry, etc.)
  • Ability to debug issues across multiple layers of the stack (infrastructure, services, network)
  • Comfortable working in real-world, imperfect environments (not just clean cloud systems)
  • Strong ownership and ability to drive issues to resolution
  • Experience with multi-site or edge deployments
  • Experience with event-driven systems (Kafka or similar)
  • Familiarity with video or streaming systems (RTSP, WebRTC)
  • Experience working with hardware-integrated systems
  • Exposure to security/compliance frameworks (SOC2, ISO27001, etc.)
  • US citizen or permanent resident
  • Located in the San Francisco Bay Area or New York area

Benefits & conditions

  • Parental leave
  • Health insurance
  • 401(k) matching
  • Vision insurance
  • Dental insurance

At Claryo, we offer a competitive benefits package that supports your health and well-being, including top-tier medical, dental, and vision coverage, 401(k) with employer matching, equity, parental leave, and unlimited vacation.

Compensation Range: $150K - $170K
