Site Reliability Engineer

Vsg Business Solutions

Cleveland, United States of America

31 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

$90K

Job location

Cleveland, United States of America

Tech stack

Java

.NET

Amazon Web Services (AWS)

Application Services

Cloud Computing

Continuous Integration

Disaster Recovery

Amazon DynamoDB

Python

PostgreSQL

Node.js

Reliability Engineering

Software Engineering

TypeScript

Grafana

GitLab

Bots

CloudWatch

API Gateway

Terraform

Software Version Control

Docker

Programming Languages

Job description

The candidate must be in Cleveland (CLE) for an in-person interview and for onboarding during the first week of employment. We have another contractor opening for the Site Reliability Engineer position, and the team is looking for:

Local to Cleveland is first preference
In office for the first week of employment
Bachelor's degree required
As much experience in AWS as possible
GitLab experience

Essential Accountabilities:

Develops highly complex solutions (using the available tech stack) to improve the ability to monitor application services effectively in a large-scale, complex environment. Suggests improvements to existing tools and monitoring thresholds.
Provides highly complex technical assistance and operational guidelines for business operations and application development to ensure applications run optimally in production, test, and development environments.
Ensures that supported application services are highly available, reliable, and performant through monitoring, alerting, and notification. Designs, implements, and maintains new observability tools as necessary to ensure this coverage. Implements and maintains dashboards, bots, and other automation based on current operational needs and release changes. Evaluates improvements to the dashboards, bots, and other automation.
Identifies repetitive, manual, and scalable tasks and automates them using scripting/programming languages or tools.
Identifies key operational metrics and the data necessary to create them. Implements and maintains dashboards based on current operational needs. Tests and ensures that all infrastructure components meet proper performance and capacity standards.

Requirements

Bachelor's degree and a minimum of 5 years of related work experience
AWS Certifications

Knowledge and Skill Areas:

Advanced baseline knowledge of AWS Cloud Platform technologies, infrastructure, and practices in a production environment, including CloudWatch, CloudTrail, EKS, Lambda, Canaries, DynamoDB, RDS, PostgreSQL, S3, API Gateway, Elastic Load Balancer, OpenSearch, Grafana, AWS X-Ray, SQS, and Fault Injection Service (AWS FIS).
GitLab, CDK (preferred), Terraform, Grafana, OpenSearch, Docker, and CI/CD pipelines.
Coding languages such as Python, TypeScript, Node.js, .NET, and Java; Infrastructure as Code, Configuration as Code, and Alerts and Monitoring as Code.
Familiarity with deployment patterns and version control, the ITIL framework, resiliency concepts and disaster recovery, and chaos engineering.