Site Reliability Engineer
Job description
Asylon is hiring a site reliability engineer to join our Philadelphia team. You'll be responsible for the reliability, availability, and performance of systems that span cloud infrastructure, on-prem servers in airgapped customer environments, and Kubernetes clusters running on edge devices deployed with our robots in the field. You'll define and maintain SLOs, build observability into every layer of the stack, lead incident response, and drive the automation that keeps our systems running without manual intervention. This role sits at the intersection of infrastructure engineering and operations - you should be as comfortable writing code to eliminate toil as you are triaging an outage on a remote edge device.
Due to the nature of the projects worked on in this position, each applicant must be a U.S. Person as defined by 22 C.F.R. §120.62. This includes U.S. citizens, lawful permanent residents, refugees, and asylees.
Primary Duties
- Own the reliability of production systems across cloud, on-prem, and edge environments - define SLOs, track error budgets, and drive improvements
- Build and maintain observability infrastructure - monitoring, alerting, logging, and dashboards - to provide visibility into system health at every layer
- Lead incident response, conduct blameless post-mortems, and implement remediation to prevent recurrence
- Develop automation to reduce toil, improve deployment reliability, and enable self-healing infrastructure
- Build and maintain CI/CD pipelines for service deployment, testing, and infrastructure provisioning
- Manage Kubernetes clusters (K3s on edge, on-prem, and managed cloud clusters) - deployments, upgrades, and troubleshooting
- Manage infrastructure-as-code for reproducible provisioning across cloud and airgapped on-prem environments
- Collaborate with software and robotics engineers to build reliability into systems from the design phase
Requirements
- 3+ years of professional experience in SRE, DevOps, or infrastructure engineering
- Strong in a programming or scripting language such as Python, Go, or Bash for building automation and tooling
- Proficient with Kubernetes - deploying, operating, debugging, and scaling containerized workloads
- Experience building and operating observability stacks - Prometheus, Grafana, Loki, or similar monitoring, alerting, and logging tools
- Background in CI/CD pipelines for automated testing, building, and deploying services
- Proficient with Linux systems administration and troubleshooting
- Experience with infrastructure-as-code tools such as OpenTofu, Terraform, or Ansible
- Comfortable with networking fundamentals - DNS, firewalls, VPNs, and debugging connectivity issues across distributed environments
Bonus Points
- Experience with K3s or lightweight Kubernetes on edge - running services on resource-constrained hardware in the field
- Has worked in airgapped or disconnected environments where systems must operate without cloud dependencies
- Experience with on-call rotations and structured incident management processes
- Familiarity with message brokers and streaming (MQTT, NATS, Kafka, or similar) for real-time data pipelines
- Has worked with robotics or IoT systems, particularly managing fleets of remote devices
- Experience with video streaming or processing pipelines in a production environment
- Comfortable getting hands-on with hardware - you don't need to be an embedded expert, but you should be the kind of person who's built robots in a college club, tinkered with a Raspberry Pi, or isn't afraid to plug into a serial console and debug a device on a bench
- Experience with Bazel or similar build systems for managing complex, multi-language codebases
- Experience with capacity planning and performance engineering
Benefits & conditions
$118,000 - $150,000 a year - Permanent, Full-time
- Competitive salary and equity packages
- 401(k) and 401(k) matching
- Medical, dental, and vision insurance
- Life insurance
- Flexible vacation/sick time
- Paid time off
- Relocation assistance