Senior DevOps Engineer

Circle Cloud Communications Ltd

Southampton, United Kingdom

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

£85K

Job location

Southampton, United Kingdom

Tech stack

Proxmox

VoIP

Databases

Continuous Integration

CouchDB

Data Centers

Software Debugging

DevOps

Disaster Recovery

Monitoring of Systems

PostgreSQL

Linux System Administration

MongoDB

MySQL

Operational Databases

Migration Manager

Ansible

Runbook

Server Virtualization

Toolchain

Virtualization Technology

Ceph

Data Storage Management

Computer Network Operations

Docker Swarm

Delivery Pipeline

Containerization

Kubernetes

Storage Technologies

Bare Metal

Kamailio

Vertica

Terraform

Multiplatform

Docker

Job description

The Senior DevOps Engineer owns everything related to server infrastructure, virtualisation, containerisation, storage architecture, and the DevOps toolchain. This is a deeply technical, hands-on role with broad autonomy to define how our systems are architected, built, and operated - with clear accountability to deliver results.

You will work alongside the Infrastructure Manager to align your work with operational needs and business priorities. You are expected to produce structured proposals and business cases for significant changes, present them to the Infrastructure Manager for approval, and lead the end-to-end implementation.

About the Environment

On-prem, bare-metal compute infrastructure across four sites
Proxmox VE for virtualisation with Ceph and NFS-based storage
Docker Swarm for container orchestration - Kubernetes migration under evaluation
Production database platforms: MongoDB, CouchDB, PostgreSQL, MySQL, ClickHouse
Business-critical telecoms services with stringent uptime requirements
Minimal public cloud usage - this is a bare-metal, on-prem environment

Virtualisation & Server Infrastructure

Own and operate the Proxmox VE cluster estate across all data centre sites
Manage and maintain bare-metal server lifecycle: provisioning, patching, hardware fault management, and decommissioning
Maintain performance, resilience, and capacity across physical and virtual server infrastructure
Manage VM templates, snapshots, resource allocation, and cluster health monitoring

Storage Architecture & Management

We have recently migrated from Ceph to NFS. The Senior DevOps Engineer will be responsible for owning the ongoing evaluation of our storage strategy, including:

Assessing whether the current NFS-based approach is fit for purpose at scale
Evaluating a potential return to Ceph, and producing a formal architecture proposal covering design, resource requirements, risk analysis, and implementation plan
Presenting the proposal to the Infrastructure Manager with a clear recommendation
Leading the full implementation of whichever approach is agreed, including installation, configuration, and ongoing management
Maintaining backup integrity, replication, and recovery procedures for all storage systems

Container Orchestration & DevOps

Own and manage the Docker Swarm estate, including all running services and deployment workflows
Lead the evaluation of a potential migration from Docker Swarm to Kubernetes
Produce a detailed business case for the migration, covering architecture design, resource implications, migration strategy, risk, and a phased rollout plan
Present the business case to the Infrastructure Manager, who will escalate to the board for sign-off
Lead the full implementation of the approved migration, including tooling setup, service migration, and handover documentation
Own all Docker instances across the estate, including configuration, monitoring, and lifecycle management

Database Operations

Manage production database platforms: MongoDB, CouchDB, PostgreSQL, MySQL, and ClickHouse
Ensure replication, resilience, backup integrity, and tested recovery procedures are in place for all database systems
Advise on database architecture and contribute to capacity planning

SIP Infrastructure & VoIP Platforms

We operate a multi-platform SIP estate spanning class 4 and class 5 switching, session border control, and hosted PBX infrastructure. The Senior DevOps Engineer is responsible for the operational maintenance and configuration of these platforms, working closely with the UC Engineering team on debugging, tracing, and stability.

Operate, maintain, and configure the Kamailio session border controller (SBC) estate
Administer and maintain FreeSWITCH-based infrastructure
Support and maintain Asterisk-based platforms, including PBXware
Operate and maintain the Sipwise C5 class 5 switch and the Yeti class 4 switch
Perform SIP debugging and tracing to diagnose and resolve call flow, signalling, and media issues
Work collaboratively with the UC Engineering team to ensure stability, performance, and continuity of VoIP platforms
Support capacity planning, upgrades, and configuration changes across the SIP estate

Monitoring & Observability

Own the monitoring platform stack across infrastructure and services
Ensure alerting is effective, actionable, and covers all critical systems
Maintain and improve observability tooling, dashboards, and incident detection capability

Architecture & Proposals

Act as the technical authority on DevOps and infrastructure architecture decisions
Produce structured proposals and business cases for significant changes, including rationale, design, risk, and implementation plan
Collaborate with the Infrastructure Manager and other teams to align infrastructure with business priorities
Advise on how systems should be architected and proactively identify areas for improvement

Cross-Functional Collaboration

Work closely with the Infrastructure Manager to ensure DevOps and network operations are aligned
Collaborate with development and service teams to support deployments and service operations
Support the wider infrastructure team on DevOps-adjacent tasks and knowledge sharing

Performance KPIs

Infrastructure uptime and availability for virtualisation, storage, and compute platforms
Quality and timeliness of architecture proposals and business cases
Change success rate: reduction in failed or rolled-back changes in the DevOps estate
Backup integrity: all systems covered, tested, and recovery procedures validated
Monitoring coverage: all critical systems instrumented, with effective alerting in place
Documentation maturity: runbooks, diagrams, and SOPs maintained and current

Requirements

Hands-on Proxmox VE cluster operations in production environments
NFS storage management and administration
Ceph storage - architecture, deployment, and operations
Docker and Docker Swarm in production environments
Database operations across MongoDB, CouchDB, PostgreSQL, MySQL, and ClickHouse - including replication, backup, and recovery
Linux server administration at scale (bare metal and virtual)
Monitoring and observability tooling
Strong documentation discipline and ability to produce clear technical proposals
Experience working in a structured, production-critical environment

Desirable Experience

Kubernetes - design, implementation, and production operations
Experience migrating workloads from Docker Swarm or similar to Kubernetes
Familiarity with telecoms or VoIP infrastructure environments
CI/CD pipeline design and management
Infrastructure-as-code tooling (Terraform, Ansible, or similar); we use Ansible today and are beginning to adopt Terraform

Working Style

Technically deep, detail-driven, and reliable in live production environments
Proactive: you identify problems before they become incidents and propose solutions
Structured thinker who can translate technical complexity into clear recommendations
Collaborative - you work effectively alongside the Infrastructure Manager and wider team
Ownership mindset: you take full responsibility for your domain and follow through
Growth-oriented: you actively develop your skills and stay current with the technology landscape

About the company

Circle Cloud Communications Ltd is a telecommunications provider operating carrier-grade, on-prem infrastructure. We run our own compute estate across four interconnected sites, with a production environment built on bare metal, Proxmox virtualisation, container orchestration via Docker Swarm, and a range of managed database platforms.