Cloud Platform Engineer - AWS SRE
Lorien
Glasgow, United Kingdom
yesterday
Role details
Contract type
Temporary contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Job location
Glasgow, United Kingdom
Tech stack
Amazon Web Services (AWS)
Bash
Identity and Access Management
Python
Reliability Engineering
Datadog
Scripting (Bash/Python/Go/Ruby)
Cloud Platform System
Snowflake
Grafana
MTTR
Performance Monitor
Functional Programming
CloudWatch
Terraform
Splunk
Job description
- Lead incident triage and resolution for AWS and Snowflake services.
- Monitor alerts, dashboards, and service health.
- Perform root cause analysis and drive post-incident improvements.
- Maintain runbooks and support on-call rotations.
- Automate repetitive operational tasks to improve resilience and reduce MTTR.
Requirements
We are looking for an AWS Site Reliability Engineer (SRE) with strong incident operations experience to support and improve the reliability of cloud and data platform services across AWS and Snowflake.
The role focuses on proactive monitoring, rapid incident response, service restoration, root cause analysis, and operational automation.
The ideal candidate will have hands-on experience with AWS infrastructure, Snowflake operations, observability tooling, and on-call support in production environments.
- Strong knowledge of AWS services such as EC2, S3, IAM, VPC, Lambda, and CloudWatch.
- Experience with Snowflake administration and troubleshooting.
- Familiarity with observability tools such as CloudWatch, Datadog, Grafana, or Splunk.
- Understanding of SRE concepts including SLIs, SLOs, error budgets, and incident management.
- Scripting or automation skills in Python, Bash, or Terraform.