Site Reliability Engineer (SRE) / Cloud Engineer
Job description
We are hiring a Senior Site Reliability Engineer (SRE) / Cloud Engineer to join our client's engineering and operations team on a contract basis. This is a fantastic opportunity to work on mission-critical applications, ensuring high availability, performance, and scalability across modern cloud environments. You'll play a key role in driving system reliability, observability, and performance engineering, working closely with cross-functional teams to support and enhance production systems.
o Monitor application performance and availability using APM and observability tools (e.g., Dynatrace, Splunk, Prometheus, Grafana)
o Perform full-stack production support and triage, identifying and resolving performance and stability issues
o Conduct root cause analysis (RCA) and implement preventive measures to avoid recurrence
o Define and track Service Level Objectives (SLOs), SLIs, and error budgets in collaboration with stakeholders
o Design and develop dashboards, alerts, and reports for system health and performance metrics
o Partner with engineering teams to analyze application architecture and eliminate single points of failure
o Execute performance and resilience testing (including chaos testing) to ensure system stability
o Automate operational processes using scripting (Python, Shell) and DevOps tools
o Collaborate with DevOps teams on CI/CD pipelines and infrastructure as code implementations
o Support cloud-based environments (AWS/Azure) and ensure scalability, fault tolerance, and high availability
Requirements
o 8+ years of experience in Site Reliability Engineering, DevOps, or Production Support roles
o Strong expertise in Linux/Unix environments
o Hands-on experience with monitoring and APM tools such as Dynatrace, AppDynamics, Splunk, Prometheus, or Grafana
o Experience with cloud platforms (AWS and/or Azure)
o Proficiency in CI/CD and DevOps tools (Jenkins, Terraform, Ansible, Bitbucket, etc.)
o Strong scripting skills in Python or Shell (e.g., Bash)
o Experience with Java-based applications and modern front-end frameworks (React or Angular)
o Solid understanding of databases (Oracle, SQL Server, or similar)
o Knowledge of SRE principles (SLOs, SLIs, error budgets, observability)
o Strong troubleshooting, analytical, and problem-solving skills
o Excellent communication skills and ability to work with cross-functional teams
Benefits & conditions
Hourly Rate: $65.00 - $70.00 per hour
The Company offers the following benefits for this position, subject to applicable eligibility requirements: medical insurance, dental insurance, vision insurance, 401(k) retirement plan, life insurance, long-term disability insurance, short-term disability insurance, paid parking/public transportation, and, as applicable, paid time off, paid sick and safe time, paid vacation time, paid parental leave, and paid holidays.