Site Reliability Engineer (SRE)

Valstro

Charing Cross, United Kingdom

18 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Intermediate

Job location

Remote

Charing Cross, United Kingdom

Tech stack

Amazon Web Services (AWS)

Systems Engineering

Azure

Bash

Python

Reliability Engineering

Prometheus

Software Engineering

Datadog

Data Logging

Google Cloud Platform

Cloud Platform System

Grafana

Reliability of Systems

Containerization

Kubernetes

Information Technology

Terraform

Docker

Job description

Valstro is looking for a Site Reliability Engineer (SRE) to join our team! This person will help ensure the reliability, availability, and performance of our cloud-native trading platform. The role entails building and maintaining infrastructure, automating processes, and working closely with the Development and Platform teams to ensure seamless integration and deployment of the service.

The successful candidate will serve as an essential link between the wider organization, executive leadership, and external vendors. Their responsibilities will include ensuring system reliability, building and maintaining monitoring solutions for both production and UAT systems, automating operational tasks, responding to incidents, and continuously improving systems and processes.

This is a remote position that will report to the Site Reliability Lead.

What will you be doing?

Act as a key intermediary between engineering, executive leadership, and external vendors.
Ensure the reliability, availability, and performance of our cloud-based trading solutions.
Develop and maintain monitoring solutions to track system performance and reliability.
Automate operational tasks to improve efficiency and reduce manual intervention.
Collaborate with development teams to ensure seamless integration and deployment.
Respond to incidents and troubleshoot issues to minimize downtime.
Continuously improve systems and processes to enhance reliability and performance.
Participate in on-call rotations to provide 24/7 support for critical systems.

Requirements

Do you have experience in Terraform?
Do you have a Bachelor's degree?
Do you have 3+ years' experience supporting production-level systems?

Strong experience in site reliability engineering, systems engineering, or a related field.
Proficiency in cloud-based infrastructure (e.g., AWS, Azure, or Google Cloud).
Experience with monitoring and logging tools (e.g., ELK, LGTM, Prometheus, Datadog).
Expertise in automation and scripting (e.g., Golang, Python, Bash, Terraform).
Knowledge of containerization and orchestration (e.g., Docker, Kubernetes).
Ability to effectively communicate and liaise between stakeholders, including internal teams, executive management and external vendors.
Strong troubleshooting and problem-solving skills.
Experience in establishing and enhancing reliability engineering practices and processes.
Capable of operating effectively in a dynamic organizational environment with high delivery and quality expectations.

Fintech experience is a bonus.

Technical

A recent bachelor's degree in Computer Science, Software Engineering or related field
Knowledge of site reliability engineering practices
Knowledge of observability and tooling, particularly the Grafana stack

Benefits & conditions

Valstro offers an excellent benefits package, including pension or 401(k) plans, unlimited PTO, and highly competitive compensation. Our leadership team brings a wealth of experience and deep industry knowledge, and despite being a young company, we believe we have carefully dialed in our product-market fit.
