Site Reliability Engineer

Omega, Inc.
4 days ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English

Job location

Tech stack

Query Performance
Automation of Tests
Database Design
Scrum
Security Software
Software Engineering
Delivery Pipeline
Indexer
Build Management
Programming Languages

Job description

Partner with the architecture and development teams on how to make applications highly available, reliable, and performant at a global scale

Collaborate with the architecture team to ensure reliability factors are accounted for in business features and enablers

Guide development teams in understanding established Service Level Objectives (SLOs) and their consequences, and in implementing appropriate Service Level Indicators (SLIs) to support those objectives

Collaborate with development team members to swarm, troubleshoot, and resolve problems

Guide ad-hoc teams to brainstorm solutions and build implementation plans based on the Root Cause Analysis of production issues

Design and build automated solutions to optimize application/service/platform uptime with minimal human intervention

Be available for an on-call rotation to participate in troubleshooting and communication efforts outside of normal business hours

Help create and implement standards and best practices, and mentor other team members to drive adoption across development teams

Perform other duties as assigned

Conform with all company policies and procedures

Requirements

Expertise in defining, implementing, and evaluating Service Level Objectives (SLOs) and Service Level Indicators (SLIs), and their associated consequences

Software development expertise in two or more high-level programming and scripting languages

Experience in evolutionary database design, query performance analysis, and indexing as a cornerstone for delivering scalable, performant products and services

Experience in designing, building, and optimizing automated pipelines with automated testing and automated security controls

Experience in performing Root Cause Analysis and Problem Management

Experience working in Agile Scrum teams with demonstrated success leading improvements (getting better/faster/happier)
