Site Reliability Engineer (SRE)
Right Skale
South San Francisco, United States of America
5 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: South San Francisco, United States of America
Tech stack
Airflow
Amazon Web Services (AWS)
Azure
Cloud Computing
Continuous Integration
DevOps
GitHub
Scrum
Prometheus
Datadog
Google Cloud Platform
System Availability
Grafana
Spark
CloudFormation
Kafka
Terraform
Data Pipelines
Jenkins
Job description
- Maintain high availability, performance, and reliability of production systems
- Participate in on-call rotation; troubleshoot incidents and perform root cause analysis
- Work within Agile/Scrum teams (sprint planning, stand-ups, retrospectives)
- Build and support data pipelines (batch and/or real-time)
- Develop and maintain CI/CD pipelines to improve deployment efficiency
- Automate operational tasks and improve system observability
Requirements
- 8+ years in SRE, DevOps, or Production Engineering
- Strong experience with production support and incident management
- Hands-on experience with CI/CD tools (Jenkins, GitHub Actions, etc.)
- Experience with data pipelines (Airflow, Spark, Kafka, etc.)
- Familiarity with cloud platforms (AWS, Azure, or Google Cloud Platform)
- Experience working in Agile/Scrum environments
Nice to have
- Experience in financial services or investment management
- Monitoring/observability tools (Datadog, Prometheus, Grafana)
- Infrastructure as Code (Terraform, CloudFormation)