Platform Site Reliability Engineer
Job description
We are currently seeking an experienced professional to join our team in the role of Platform Site Reliability Engineer.
You'll support the systematic application of engineering approaches to ensure that service resilience, sustainability and recovery time objectives are met for all software solutions, through the operational stability, integrity and availability of products and services.
You'll ensure the identification and resolution of all incidents, and develop and deploy engineering changes to infrastructure capable of meeting anticipated load, performance, availability, security and resilience requirements.
You'll support the delivery of infrastructure changes with a focus on automating build, testing and deployment processes across all environments to minimise variation and ensure predictable, high-quality code and data.
You'll be proficient in working in an agile way, with a strong affinity for continuous delivery and DevSecOps, and able to make small, low-risk, high-cadence changes. Typically, you will release to production daily or multiple times per day.
- Simplify and improve technology architecture by understanding owned component dependencies and driving incremental simplification.
- Proactively manage technical debt, taking initiative to identify and fix issues before they're assigned.
- Support service performance and operational resilience, including monitoring, incident resolution, vendor management, risk management, and development efficiency.
- Contribute to platform strategy and demand planning, focusing on re-usability and solution insights.
- Use evidence-led engineering by running proof of concepts, tests, and external research to validate approaches.
- Own systems with an ITSO mindset, demonstrating empathy for what it takes to run and support production services end-to-end.
- Provide subject matter expertise across teams, supporting delivery and enabling others through guidance and collaboration.
- Lead incident reviews and continuity activities, including PIR leadership, taking ownership of infrastructure, and leading role swaps/DR drives with improvements from lessons learned.
Requirements
- Fundamentals-based problem-solving skills; you drive decisions by function, with a first-principles mindset
- Experience programming in at least one of the following languages: Java, Python, Go
- Demonstrate a bias to action and avoid analysis paralysis; drive work to the finish line with high quality and on time
- Solid experience in cloud technologies such as AWS, Google Cloud or Alibaba Cloud
- Experience supporting services across virtualised and/or containerised environments, particularly those employing Kubernetes or Docker for workload management
- You are ego-less when searching for the best ideas; you contribute effectively outside of your specialty; you think about solving problems from the standpoint of the best outcome for the team
- Systematic problem-solving approach, coupled with excellent communication skills and a sense of ownership and drive
- Can debug and optimise code, and automate routine tasks (i.e., toil reduction)
Benefits & conditions
As an HSBC employee in the UK, you will have access to tailored professional development opportunities and a competitive pay and benefits package. This includes private healthcare for all UK-based employees, enhanced maternity and adoption pay and support when you return to work, and a contributory pension scheme with a generous employer contribution.