Site Reliability Engineer

K Anand Corporation

Detroit, United States of America

2 days ago

Role details

Contract type

Temporary contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Intermediate

Job location

Detroit, United States of America

Tech stack

JavaScript

Microsoft Windows

Agile Methodologies

Amazon Web Services (AWS)

Unix

Cloud Computing

Databases

Linux

DevOps

Java Platform Enterprise Edition (J2EE)

Spring

Python

Network Protocols

Oracle Applications

Scrum

Reliability Engineering

Ansible

Shell Script

PL/SQL

Enterprise Software Applications

Spring Boot

GitLab

Information Technology

REST

Terraform

Splunk

Dynatrace

DevSecOps

Microservices

Requirements

The Skills You Bring

Bachelor's degree in Computer Science, Engineering, or a related field preferred (or equivalent practical experience)

Strong verbal and written communication skills

4-8 years of overall experience managing an SRE or DevOps team with an observability workload

4-8 years of Agile management, owning SRE roadmaps and deliverables using Scrum/Kanban

4-8 years of delivering projects alongside a constant flow of side intake and production response workloads

Experience presenting to leadership, collaborating effectively, and communicating technical concepts to non-technical business stakeholders

5+ years of proven experience as a Site Reliability Engineer or in a similar role in a production environment

Applied AWS/cloud certification (AWS Cloud Architect, DevOps/SysOps), including experience with ASG, Fargate, Lambda, Aurora, DynamoDB, and ALB/NLB

5+ years of working experience with CI/CD pipelines (GitLab) and developing infrastructure-as-code (Terraform, Python, Ansible, etc.)

Applied experience with Linux and Windows platforms, Java EE, JavaScript, Spring, Spring Boot, REST APIs/microservices, shell scripting, Python, PL/SQL, and databases, specifically Oracle

Working knowledge of observability platforms such as Splunk and Dynatrace

Working experience designing observability for enterprise applications

Experience with system administration and DevSecOps

Development experience, along with experience working on cloud and physical servers

Understanding of SLIs, SLOs, and SLAs, and experience developing them with business, product, and engineering teams

Ability to conduct capacity planning and resource optimization to handle growing demands on our infrastructure

Other Skills & Experience Desired

Strong knowledge of Linux/Unix systems and network protocols

Familiarity with cybersecurity best practices and principles

Ability to lead triage calls, including working across multiple divisions to resolve issues