Site Reliability Engineer

WALT Labs
Epsom, United Kingdom
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Epsom, United Kingdom

Tech stack

Bash
Monitoring of Systems
Python
Reliability Engineering
Datadog
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
Grafana
Kubernetes
Terraform
Go

Requirements

Qualifications
3-5 years of experience with Google Cloud Platform
Minimum 2 Google Cloud Professional certifications
Advanced Kubernetes knowledge and troubleshooting
Proficient in Infrastructure as Code (Terraform)
Strong scripting abilities (Python, Go, Bash)
Expert with monitoring tools (Grafana, Datadog)
Experience leading incident response
Excellent communication and mentoring skills
Proven track record of process improvement
Ability to manage multiple priorities effectively
Strong customer service orientation

Benefits
20 holiday days + bank holidays (earn 1.5 days every 3 years)
Private health insurance

About the company

Company Description
WALT Labs, a leading managed service provider, is dedicated to empowering businesses by harnessing the power of cloud technology. Our team specializes in delivering customized solutions tailored to the unique needs of our clients, driving growth and operational efficiency across industries. From supporting small businesses with seamless data migration to enabling large corporations to manage complex infrastructure projects, we provide exceptional service while staying at the forefront of cloud technology advancements.

Role Description
This is a full-time role, on-site a minimum of 3 days a week in Kings Cross, London. We are seeking a skilled Site Reliability Engineer with a strong focus on Google Cloud Platform (GCP) to join our dynamic team. In this role, you'll be responsible for maintaining cloud infrastructure, managing incidents, and ensuring seamless operations for our clients. You'll use tools like incident.io and JIRA to manage and resolve support requests efficiently.

Responsibilities
Serve as L2 on-call escalation point for complex technical issues requiring advanced troubleshooting
Lead response to critical incidents, coordinating multiple teams and ensuring effective communication
Provide expert-level support for GCP services including advanced networking, security, and architecture
Perform advanced Google Workspace administration including domain management, security policies, and integration
Use incident.io to manage escalated incidents, major incidents, and coordinate war room activities
Optimize support workflows in JIRA, creating automation rules and improving ticket routing
Monitor and tune infrastructure performance using advanced Grafana queries and custom metrics
Lead technical projects including migrations, upgrades, and new service implementations
Create comprehensive documentation including architectural diagrams, runbooks, and best practices guides
Achieve minimum 50% billable hours through complex Cloud Assist/Managed Cloud customers and consulting engagements
Mentor Cloud Support Engineers and juniors through formal and informal training sessions
Identify and implement process improvements to increase efficiency and reduce resolution time
Conduct thorough root cause analysis for recurring issues and implement permanent fixes
Present technical solutions and recommendations to customer stakeholders and management
Design and implement monitoring strategies for complex multi-cloud environments
Develop automation scripts and tools to improve team efficiency and reduce manual work
Participate in pre-sales activities providing technical expertise for solution design
Review and approve changes to production environments following change management procedures
Lead knowledge sharing sessions and technical deep-dives for the team
Coordinate with vendor support for complex issues requiring manufacturer assistance
Maintain expertise in multiple GCP services and stay current with new feature releases
Participate in business-hours escalation
