Incident and Problem Manager

Elexon Ltd
Charing Cross, United Kingdom
9 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
£72K

Job location

Charing Cross, United Kingdom

Tech stack

Databases
IBM Service Management Framework
ServiceNow

Job description

The Service Management team at Elexon is responsible for overseeing the full lifecycle of IT services, from planning and design to transition, operation, and continuous improvement, adhering to ITIL best practices. To manage its multi-supplier service delivery landscape, Elexon employs a Service Integration and Management (SIAM) operating model.

In this framework, the Service Management team plays a pivotal role, ensuring the coordination, integration, and governance of multiple service providers. The team is accountable for delivering cohesive, seamless IT services to both Elexon's business and the wider electricity supply industry. By implementing structured ITIL-based service management practices, the team ensures clear accountability, maintains service quality, and drives high performance across all service providers.

Job Purpose: The Incident and Problem Manager is responsible for leading the incident and problem management process across internal teams, third-party partners and market participants within the SIAM operating model.

Additionally, this role collaborates with stakeholders to manage major incidents, ensuring effective leadership, coordination, and inclusive decision-making. The focus is on timely resolution of business-impacting issues, minimising recurrence, and enhancing service availability through a structured approach.

Participation in an on-call support rota during extended service hours will be required.

Incident and Problem Management: Accountable for the development and management of the Incident and Problem Management process:

  • Establish and maintain an effective incident and problem management framework, ensuring alignment with IT industry best practices and company policies.
  • Analyse incident and problem trends, identifying opportunities for proactive risk reduction.
  • Conduct regular reviews and follow-ups with vendors and third parties to hold them accountable for resolving issues and closing them out in a timely manner.
  • Proactively identify potential problems through trend analysis, incident reviews, and system performance monitoring.
  • Maintain a comprehensive problem management database, documenting all identified problems, their root causes, and resolution steps.
  • Work on strategies to prevent the recurrence of known problems and define and track KPIs to measure problem management success.
  • Ensure adherence to SLAs (Service Level Agreements) for incident & problem response and resolution.
  • Manage and maintain the Known Error Database (KEDB) for tracking known issues and workarounds.
  • Develop and maintain knowledge base articles to support proactive problem resolution.

Case Management:

  • Ensure smooth coordination between incident, problem, and case management teams.
  • Maintain clear documentation of incidents and problems within the case management system.

Major Incident Management:

  • Oversee the Major Incident Management process, including promotion from Incident to Major Incident; lead and coordinate the response, triage, and drive resolution across internal teams, external participants and third-party partners, ensuring expedited responses, effective communication and minimal business disruption.
  • Drive the major incident bridge by involving all relevant resolver groups, external participants and third-party partners, coordinating with the respective SMEs for speedy resolution.
  • Facilitate root cause analysis (RCA) and Major Incident Reviews (MIRs), ensuring corrective actions and preventive measures are implemented and Incident Reports are issued within agreed SLAs.
  • Collaborate with teams to identify trends and patterns leading to major incidents and implement preventive measures.
  • Manage communication during major incidents, providing regular updates to key internal and external stakeholders.
  • Conduct a thorough analysis and work with key stakeholders to prepare the Major Incident Report (MIR) for every Major Incident within agreed SLAs.
  • Ensure that all the resolution procedures are updated in the knowledge base.

Stakeholder Engagement: Act as primary liaison with stakeholders across IT, business and third-party vendors to ensure alignment, clear communication, and collaboration throughout the incident and problem management process.

Reporting: Produce regular management information reports covering incident and problem management, including SLA performance, trends, and areas for improvement.

Compliance and Governance: Responsible for defining and enforcing adherence to incident management policies, processes and industry best practice across the SIAM framework.

Service Improvement: Collaborate with cross-functional teams, vendors, third parties and market participants to analyse problems and incidents and develop strategies for continuous improvement.

Requirements

  • 5+ years' experience of working to an ITIL-based Service Management framework, with a focus on best practice within Incident Management and Problem Management.
  • Confident in taking ownership and making on-the-spot decisions in high-pressure situations.
  • Ability to be on call during extended service hours
  • Ability to analyse data, trends, patterns, and correlations to identify potential problems.
  • Broad industry knowledge and experience in Incident and Problem Management, with the ability to define a process from zero to full maturity within a multi-disciplined team.
  • Excellent oral and written communication skills when presenting complex information to technical and non-technical audiences.
  • Problem-Solving: Strong analytical skills to assess risks, troubleshoot issues, and implement corrective actions.
  • Proficiency in using ITSM tools (e.g. ServiceNow) and collaboration platforms.

Developmental / Desirable:

  • Understanding of the UK electricity market and key regulations.
  • Knowledge of cybersecurity and risk management in energy IT services.
  • Familiarity with energy market regulations and Ofgem standards.

About the company

We believe a diverse and inclusive culture allows innovation and creativity to flourish. We are committed to continuously improving our culture for our colleagues and stakeholders. Through our hugely successful Diversity Forum, Mental Health First Aid network and regular programme of activities and events, we celebrate difference and recognise the value of employee wellbeing, a theme that comes through consistently in our annual employee surveys. Likewise, as a community, we like to support each other, and we all agree Elexon is a great place to work, with a great workspace too!
