IT Service Resilience Manager
Role details
Job description
We have an exciting opportunity for an IT Service Resilience Manager to join our IT team, based in A&O Shearman's Belfast office.

Information Technology team - Belfast

Accountable for translating business continuity and availability requirements into technical enterprise architecture and operational disaster recovery requirements, and for owning the programme of resilience and recovery testing across applications, SaaS and third-party providers.

What you will do

The successful candidate should have strong hands-on technical skills in cloud, infrastructure and application deployments, along with the ability to translate business continuity and availability requirements into technical enterprise architecture and operational disaster recovery requirements. They will own the programme of resilience and recovery testing across applications, SaaS, third-party providers and internal teams.

Key stakeholders:
- I&O Support teams & InfoSec
- Business Continuity Management
- Regional IT Support team
- Technical Delivery and Project & Programme Delivery
- Software vendors and Managed Service Providers

Responsibilities:

Ownership, leadership and execution in the following areas:
- Develop and maintain dependency maps that capture application, middleware, cloud services, data flows and third-party dependencies to identify single points of failure and inform resilience design.
- Lead the DR implementation and testing programme: design automated DR processes where feasible and schedule regular tests.
- Own operational runbooks, monitoring and incident playbooks aligned to graceful degradation modes; ensure monitoring and SRE/operations practices are aligned to expected degradation behaviours.
- Coordinate crisis response governance and periodic scenario exercises with crisis response teams, define activation criteria, maintain war-room procedures and ensure lessons learned feed back into architecture and DR plans.
- Run supplier resilience assessments for critical SaaS/third parties using a posture assessment approach; escalate remediation, negotiate contractual improvements or recommend contingencies/alternative sourcing.
- Collaborate with enterprise architecture, security/CISO, application owners, BC/operational leads and procurement to embed resilience standards across the lifecycle.
- Manage and test application/service tiers to business-agreed RTO/RPO and reliability design targets.
- Define and ensure adherence to DR/Resilience programme metrics: frequency of tests, % successful automated DR runs, and closure rate for remediation actions identified through testing.
- Manage vendor performance and contractual compliance against agreed operational SLAs, with vendor contingency plans validated via tests.
- Identify and assess IT resilience risks related to system outages, cyber threats and third-party dependencies.
Requirements
- 10+ years in technology resilience, disaster recovery, or IT operations, with 5+ years in leadership positions managing cross-functional teams.
- Deep hands-on knowledge of a range of IT environments, SaaS, cloud infrastructure (AWS & Azure), and security tools.
- Required expertise in ISO 22301, NIST and ITIL.
- Certifications (preferred): Certified Business Continuity Professional (CBCP), CISSP, CISM, or DRI International certifications.
- Experience of communicating with senior stakeholders and translating complex technical solutions into simple language.
- Exposure to working in both Agile and Waterfall delivery methodologies.

Personal
- Ability to anticipate risks and shift from reactive disaster recovery to proactive service resilience, focusing on "prevention by design".
- Skilled at navigating changing technology environments (e.g. cloud, DevOps) and leading transformation.
- Strong stakeholder engagement and influencing skills to work with EA, Platform Owners, I&O, InfoSec, Business Continuity and Procurement.
- Proven ability to manage crisis situations, make quick, informed decisions during incidents, and maintain confidence (strategic optimism) within teams.
- Excellent customer-facing skills with a good grasp of key drivers and requirements within the business.
- Understanding of how technology resilience directly impacts business operations, continuity, and profitability.