Data Resilience Technology Specialist
Job description
The Data Resilience team is a new chapter within the Chief Data and Analytics Office (CDAO). It is responsible for defining and embedding new Strategies, Operating Models and Control Frameworks to protect the Bank's critical data services that our customers, colleagues and the market rely upon.
The aim of the Data Resilience team is to protect our customers, colleagues and markets by ensuring we comply with the spirit of the regulatory requirements for operational resilience established by the Bank of England, FCA and PRA.
In this role you will be instrumental in safeguarding critical data services for customers, colleagues, and markets, delivering the intent of PRA SS1/21, FCA PS21/3, and DORA within the CDAO Data Resilience team.
You will:
- Design, implement, and maintain infrastructure resilience solutions across on-prem and cloud environments
- Define and deliver hosting and recovery strategies for critical platforms and services
- Evaluate and recommend architectures based on RTO, RPO, cost, and performance
- Develop and automate recovery runbooks for infrastructure and application services
- Execute and orchestrate restore workflows, integrity checks, and cutover processes
- Lead and run disaster recovery drills with Network, Infrastructure, Platform, and Data teams
- Capture findings and drive remediation to strengthen resilience posture
- Protect core platforms: databases, messaging queues, batch schedulers, file/object stores
- Standardise and optimise backup and snapshot strategies across multiple technologies
- Integrate resilience processes with ServiceNow (Incident, Change, Problem)
- Maintain CMDB accuracy for infrastructure, backup tools, storage, and network components
- Track and report resilience metrics: backup success, recovery success, RPO, MTTR
- Present performance and risk reduction to senior leadership
- Ensure compliance with resilience standards and regulatory requirements
- Feed outcomes into Data Resilience Assessments
Requirements
- Hands-on delivery experience of infrastructure solutions for resilience and recovery
- Cloud and on-prem expertise: compute, storage, networking, segmentation, connectivity
- Proven disaster recovery experience: automated restore, rebuild, and cutover
- Backup and recovery for databases, messaging queues, batch jobs, and large-scale data stores
- Disaster recovery drills: end-to-end execution, audit trails, RTO/RPO compliance
- Strong ITIL knowledge: Major Incident, Problem, and Change leadership
- ServiceNow experience: CMDB modelling, Discovery, Flow Designer, workflow automation
- Security-first mindset: encryption, key rotation, secrets management, dual control
- Automation skills: PowerShell, Python, Terraform, or equivalent
- Stakeholder engagement across CISO, Platforms, Networks, DBAs, and Risk teams
- Regulatory awareness: PRA SS1/21, FCA PS21/3, DORA; Important Business Services (IBS) and impact tolerances in practice