Site Reliability Engineer
IBA InfoTech Inc.
Denver, United States of America
11 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Job location: Denver, United States of America
Tech stack
Java
Amazon Web Services (AWS)
Azure
Python
Linux System Administration
Reliability Engineering
Ansible
Shell Script
Google Cloud Platform
Kubernetes
Infrastructure Automation Frameworks
Information Technology
Build Tools
Puppet
Docker
Job description
Site Reliability Engineering is pivotal to the success of this project. Our SREs ensure that the platform software and platform automation are robust, reliable, and scalable.
As a Site Reliability Engineer, you will:
- triage and remediate production incidents
- provide engineering-level support for issues reported by users
- work closely with development teams to improve the observability of the system
- aggressively automate remediations for common problems
- build tools to facilitate rapid triage and troubleshooting
- build tools to enable continuous monitoring of production systems
- measure service level objectives
- define and improve service level objectives
Requirements
- Linux Administration
- AWS
- Demonstrated ability to write programs using Java-based technologies or Scala
- Experience scripting in Python or shell
- Must have experience managing a distributed production application stack in the cloud (AWS, Azure, or Google Cloud)
- Docker and Container Orchestration experience
- Hands-on experience with configuration management tools such as Ansible, Chef, or Puppet is a big plus