Site Reliability Engineer
Requirements
The Skills You Bring
Bachelor's degree in Computer Science, Engineering, or a related field preferred (or equivalent practical experience)
Strong verbal and written communication skills
4-8 years of overall experience managing an SRE or DevOps team with an observability workload
4-8 years of Agile management experience, owning SRE roadmaps and deliverables using Scrum/Kanban
4-8 years of delivering projects alongside a constant flow of side intake and production response workloads
Experience presenting to leadership, collaborating effectively, and communicating technical concepts to non-technical business stakeholders
5+ years of proven experience as a Site Reliability Engineer or in a similar role in a production environment
Applied AWS/Cloud certification (AWS Cloud Architect, DevOps/SysOps), including experience with ASG, Fargate, Lambda, Aurora DB, DynamoDB, and ALB/NLB
5+ years of working experience with CI/CD pipelines (GitLab) and developing infrastructure-as-code (Terraform, Python, Ansible, etc.)
Applied experience with Linux and Windows platforms, Java EE, JavaScript, Spring, Spring Boot, REST APIs/microservices, shell scripting, Python, PL/SQL, and databases, specifically Oracle
Working knowledge of observability platforms such as Splunk and Dynatrace
Working experience designing observability for enterprise applications
Solid knowledge of system administration and DevSecOps
Development experience, along with experience working with cloud and physical servers
Understanding of, and experience working with, business, product, and engineering teams to develop SLIs, SLOs, and SLAs
Ability to conduct capacity planning and resource optimization to handle growing demands on our infrastructure
Other Skills & Experience Desired
Strong knowledge of Linux/Unix systems and network protocols
Familiarity with cybersecurity best practices and principles
Ability to lead triage calls, including working across multiple divisions to resolve issues