Incident Problem Manager

Adecco

Charing Cross, United Kingdom

3 days ago

Role details

Contract type

Temporary to permanent

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Compensation

£ 156K

Job location

Charing Cross, United Kingdom

Tech stack

Agile Methodologies

Databases

DevOps

MTTR

ServiceNow

Job description

We are seeking an experienced and governance-focused Incident and Problem Manager to oversee the effective management of IT incidents and problems across the organisation's technology landscape.

In this critical role, you will ensure that incidents, including major incidents, are resolved promptly to minimise business disruption and that underlying problems are identified, analysed, and addressed to prevent recurrence.

You will provide strategic and operational oversight of incident and problem management processes, ensuring robust governance and compliance with regulatory and operational resilience frameworks, including DORA.

You will also drive continuous improvement initiatives, strengthen operational resilience, and safeguard critical business services by embedding best practices and governance standards across the technology estate.

Lead the end-to-end management of incidents, including major incidents, to ensure rapid restoration of services and minimal business disruption.
Collaborate on major incident bridges, coordinating cross-functional teams to drive timely resolution and maintain clear, consistent stakeholder communication during high-impact events.
Ensure escalation protocols and communication plans are executed effectively during major incidents to keep senior leadership, regulators, and impacted business units informed in real time.
Oversee incident trend analysis and reporting to senior leadership and regulators to identify systemic issues, improve response strategies, and support compliance obligations.
Ensure incident processes align with DORA requirements, including impact classification, response timelines, and regulatory reporting, to maintain operational resilience.
Own the problem management lifecycle from identification through resolution and closure to eliminate root causes and prevent recurrence of incidents.
Drive structured root cause analysis (RCA) using methodologies such as 5 Whys or Kepner-Tregoe to ensure accurate diagnosis and effective long-term solutions.
Maintain and govern the Known Error Database (KEDB) to provide documented workarounds and enable faster incident resolution.
Collaborate with engineering and product teams to implement permanent fixes that improve service reliability and reduce operational risk.
Embed DORA-aligned practices into incident and problem management processes, including ICT risk classification and critical service mapping, to strengthen resilience.
Support scenario testing and resilience assessments for critical business services to validate preparedness and compliance with regulatory standards.
Contribute to regulatory reporting and audit readiness for operational resilience and ICT incident handling to ensure transparency and adherence to governance requirements.
Partner with Risk, Compliance, and Business Continuity teams to align incident and problem management with broader resilience objectives.
Mentor and guide junior analysts and managers within the service management function to build capability and maintain high standards of performance.
Drive automation and tooling enhancements for incident/problem detection and resolution to improve efficiency and reduce mean time to restore (MTTR).
Provide insights and recommendations that improve service reliability and reduce operational risk, in support of continuous improvement and strategic objectives.
Lead service reviews and post-incident/post-problem retrospectives with accountable owners to capture lessons learned and implement process improvements.

Requirements

Extensive experience in Incident and Problem Management within financial services or other regulated industries.
Proven track record of managing major incidents, conducting root cause analysis (RCA), and implementing permanent fixes.
Strong knowledge and practical application of ITIL principles (v4 preferred).
Demonstrated experience working with DORA compliance, operational resilience frameworks, and regulatory obligations.
Familiarity with ITSM platforms (e.g., ServiceNow) and monitoring tools.
Ability to operate under pressure and manage complex, high-impact situations.
Excellent stakeholder management, communication, and leadership skills.
Strong analytical and problem-solving capabilities.
Experience with cloud and hybrid infrastructure environments.
Understanding of DevOps and Agile delivery models.
Ability to drive continuous improvement and embed best practices across ITSM processes.

About the company

Pontoon is an employment consultancy. We put expertise, energy, and enthusiasm into improving everyone's chance of being part of the workplace. We respect and appreciate people of all ethnicities, generations, religious beliefs, sexual orientations, gender identities, and more. We do this by showcasing their talents, skills, and unique experience in an inclusive environment that helps them thrive. As part of our standard hiring process to manage risk, please note background screening checks will be conducted on all hires before commencing employment.