Site Reliability Engineer I
Job description
We are looking for a Site Reliability Engineer I to help support the stability, health, and day-to-day operations of Backblaze's infrastructure. This role serves as a first line of response for customer-affecting issues and production alerts, helping drive timely incident resolution, maintain service reliability, and support operational readiness across our environments. You will work closely with TechOps, Data Center Technicians, and other cross-functional teams to troubleshoot issues, monitor system health, support deployments and migrations, and improve day-to-day operational processes through documentation and automation. The ideal candidate is technically curious, calm under pressure, eager to learn, and excited to grow in a hands-on infrastructure and reliability role.
What You'll Do:
- Act as the first point of contact for all customer-affecting issues.
- Be a key driver in resolving technical problems.
- Ensure that incident management processes are followed and that incident post-mortems are completed to capture process deviations and areas for improvement.
- Deliver consistent communication to management.
- Monitor Zabbix regularly and respond to alerts, either by taking direct action or by escalating. Acknowledge every alert, noting the direct action taken or the escalation point of contact.
- Make sure escalations are handed off successfully.
- Ensure the health of pods across all sites (define pod alerts in Zabbix).
- Work through daily filesystem checks for pods.
- Troubleshoot technical issues for DC Techs: advanced pod questions, deployment questions, migration troubleshooting, and Ansible playbook issues.
- Identify and escalate any potential network issues.
- Perform Vault pre-deployment configuration and testing.
- Start Vault migrations, monitor migration pods, and handle applicable migration pod health checks.
- Document and work on automating daily items.
- Document and provide network IPs for upcoming deployments.
- Monitor releases and updates to the server farm, and escalate issues as they arise.
- Participate in on-call rotation shifts.
- Assist fellow TechOps team members in handling tasks.
- Make recommendations for improving organizational productivity.
- Be able to work outside normal business hours (weekend shifts, holidays, and evenings) as needed.
Requirements
- Must be located in Bangalore.
- 2 - 4 years of relevant experience.
- Knowledge of Linux and system administration.
- Desire to learn and develop all necessary technical skills.
- Strong analytical thinking.
- Strong communication skills and the ability to work with different teams.
- Knowledge of network cabling, network classification, and network topology.
Benefits & conditions
- Annual Company bonus plan
- Healthcare for family, including dental and vision
- 401K
- ESPP
- Flexible vacation policy
- Maternity & paternity leave
- MacBook Pro for work plus a generous stipend to personalize your workstation
- Childcare bonus (human children only)
- Fertility treatment and support
- Learning & development program
- Commuter benefits
- A culture that supports a healthy work-life balance
To provide greater transparency to candidates, we share base pay ranges for all US-based job postings regardless of state. We set standard base pay ranges for all roles based on function, level, and country location, benchmarked against similar-stage growth companies. Final offer amounts are determined by multiple factors, including candidate location, skills, depth of work experience, and relevant licenses/credentials, and may vary from the amounts listed below.
The expected salary range for this role is $66,000 - $88,000.