Site Reliability Engineer (SRE)

Right Skale
South San Francisco, United States of America
5 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior


Tech stack

Airflow
Amazon Web Services (AWS)
Azure
Cloud Computing
Continuous Integration
DevOps
GitHub
Scrum
Prometheus
Datadog
Google Cloud Platform
System Availability
Grafana
Spark
CloudFormation
Kafka
Terraform
Data Pipelines
Jenkins

Job description

  • Maintain high availability, performance, and reliability of production systems
  • Participate in on-call rotation; troubleshoot incidents and perform root cause analysis
  • Work within Agile/Scrum teams (sprint planning, stand-ups, retrospectives)
  • Build and support data pipelines (batch and/or real-time)
  • Develop and maintain CI/CD pipelines to improve deployment efficiency
  • Automate operational tasks and improve system observability

Requirements

  • 8+ years of experience in SRE, DevOps, or Production Engineering
  • Strong experience with production support and incident management
  • Hands-on experience with CI/CD tools (Jenkins, GitHub Actions, etc.)
  • Experience with data pipelines (Airflow, Spark, Kafka, etc.)
  • Familiarity with cloud platforms (AWS, Azure, or Google Cloud Platform)
  • Experience working in Agile/Scrum environments

Nice to have

  • Experience in financial services or investment management
  • Experience with monitoring/observability tools (Datadog, Prometheus, Grafana)
  • Experience with Infrastructure as Code (Terraform, CloudFormation)
