Site Reliability Engineer

IBA InfoTech Inc.
Denver, United States of America
11 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Denver, United States of America

Tech stack

Java
Amazon Web Services (AWS)
Azure
Python
Linux System Administration
Reliability Engineering
Ansible
Shell Script
Google Cloud Platform
Kubernetes
Infrastructure Automation Frameworks
Information Technology
Build Tools
Puppet
Docker

Job description

Site Reliability Engineering is pivotal to the success of this project. Our SREs ensure that the platform software and platform automation are robust, reliable, and scalable.

As a Site Reliability Engineer, you will:

  • triage and remediate production incidents
  • provide engineering-level support for issues reported by users
  • work closely with development teams to improve the observability of the system
  • aggressively automate remediations for common problems
  • build tools to facilitate rapid triage and troubleshooting
  • build tools to enable continuous monitoring of production systems
  • measure service level objectives
  • define and improve service level objectives

Requirements

  • Linux Administration
  • AWS
  • Demonstrated ability to write programs in Java-based technologies or Scala
  • Experience scripting in Python or shell
  • Experience managing a distributed production application stack in the cloud (AWS, Azure, or Google Cloud)
  • Docker and container orchestration experience
  • Hands-on experience with configuration management tools such as Ansible, Chef, or Puppet is a big plus
