Senior DevOps & Environments Engineer

Expleo
Charing Cross, United Kingdom
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Charing Cross, United Kingdom

Tech stack

Amazon Web Services (AWS)
Azure
Bash
Configuration Management
Databases
DevOps
Disaster Recovery
Middleware
Python
Load Testing
PowerShell
Release Management
Runbook
Software Engineering
Virtualization Technology
vSphere
Datadog
Data Logging
Scripting (Bash/Python/Go/Ruby)
Containerization
Infrastructure Automation Frameworks
Hardware Infrastructure
Terraform
Dynatrace
VMware

Job description

The Senior DevOps & Environments Engineer will join a team responsible for the reliability, automation, provisioning, configuration, and continuous improvement of environments supporting the full software development lifecycle. The role is critical in ensuring that non-production environments are stable, performant, and aligned with the needs of engineering, QA, and release teams. Working across 30+ applications, this engineer will help modernise the environment landscape through Infrastructure-as-Code, improved observability, and SRE-aligned operational practices to enable faster, safer, and higher-quality delivery.

Responsibilities:

  • Design, implement, and maintain Infrastructure-as-Code (IaC) for consistent and repeatable provisioning of development and test environments, primarily using Terraform.
  • Lead technical investigations and act as the escalation point for environment-related incidents, outages, configuration issues, and service degradation across non-production platforms.
  • Collaborate closely with development, QA, and platform teams to deliver scalable, automated, and resilient environment solutions.
  • Analyse and optimise performance of non-production systems, identifying and resolving environment bottlenecks.
  • Maintain environment fidelity and integrity through controlled configuration management, patching, versioning, and rollback strategies.
  • Support release and deployment planning, ensuring environment readiness, dependency alignment, and overall stability during release cycles.
  • Implement and maintain monitoring, observability, and logging frameworks, with a strong emphasis on Dynatrace and CNCF-aligned tooling.
  • Define meaningful, proactive alerting policies that reduce noise, highlight real issues, and accelerate response times.
  • Apply SRE principles such as SLIs/SLOs, automated remediation, and continuous feedback loops to improve environment uptime and reliability.
  • Mentor junior engineers, share best practices, and contribute to knowledge bases, documentation, and process maturity.
  • Support Disaster Recovery (DR) testing, validating end-to-end system recovery, integration behaviour, and service resilience during failover scenarios.
  • Champion automation and operational excellence, reducing manual effort and increasing the team's ability to deliver environments at scale.

As a Disability Confident Committed Employer we have committed to:

  • Ensuring our recruitment process is inclusive and accessible
  • Communicating and promoting vacancies
  • Offering an interview to disabled people who meet the minimum criteria for the job
  • Anticipating and providing reasonable adjustments as required
  • Supporting any existing employee who acquires a disability or long-term health condition, enabling them to stay in work
  • Carrying out at least one activity that will make a difference for disabled people

Requirements

  • Strong knowledge of VMware, vSphere, virtualisation platforms, and on-premise infrastructure management.
  • Expertise in Terraform and experience defining an organisation-wide IaC strategy.
  • Proficient in scripting and automation (Python, Bash, PowerShell).
  • Strong communication, documentation, and collaborative problem-solving skills.

Experience:

  • Hands-on experience with on-premise infrastructure, virtualisation, containerisation, and exposure to cloud platforms such as AWS or Azure.
  • Understanding of performance engineering, including load testing frameworks and performance analysis.
  • Experience supporting QA, development, and release management teams with reliable, well-controlled non-prod environments.
  • Ability to troubleshoot complex multi-layered issues across infrastructure, networks, applications, middleware, and databases.
  • Familiarity with SRE principles and modern operational practices such as postmortems, runbooks, SLIs/SLOs, error budgets, and automated recovery patterns.
  • Experience with APM and observability tooling, ideally Dynatrace, including metrics, traces, dashboards, and alerting configuration.

Benefits & conditions

  • Collaborative working environment - we stand shoulder to shoulder with our clients and our peers through good times and challenges
  • We empower passionate, technology-loving professionals to expand their skills and take part in inspiring projects
  • Expleo Academy - enables you to acquire and develop the right skills by delivering a suite of accredited training courses
  • Competitive company benefits
  • Always working as one team, our people are not afraid to think big and challenge the status quo
