Senior Engineer, Reliability Engineering - London Stock Exchange Group
Job description
You will be a member of a successful, global, multi-skilled team providing a standard server platform for application servers that serve the world's largest professional trading community. The successful candidate will be responsible for supporting our existing virtual infrastructure, as well as for project work developing new solutions. Maintain the OS and application stack across a wide variety of Linux, Windows and VMware systems. Collaborate with Architects on analysis and design for project work. Technically lead larger or more technically sophisticated projects. Deliver high-quality project and BAU work to the agreed timescales. Investigate incidents and problem reports raised on multiple environments, from Development through to Production servers. Provide systems management integration expertise to the development teams so that all application servers are reliably deployed and monitored. Ensure sound configuration management practices are followed for deployments, and improve them where deficiencies are identified. Mentor other team members to increase their skills. Participate in periodic out-of-hours work to support Production maintenance and updates (typically once or twice per month, usually on Fridays or Saturdays). Ensure work is documented accurately and to a high standard. Identify and implement improvements to ways of working, including opportunities for automation. Maintain and improve the tools used by the Infrastructure teams to increase efficiency.
Requirements
Experience with Red Hat Enterprise Linux (or derivatives) is necessary, as is a solid grasp of HPE and Dell hardware. Hands-on experience of VMware technologies, specifically vSphere 7 and 8, is desirable for the role. Experience with VMware vSAN, vCloud Suite, vRealize Automation and Aria Operations products would be useful. A good understanding of IP networking within a Local Area Network (LAN) environment is important, and an awareness of routing and Wide Area Networks (WAN) is highly desired. Knowledge and skills in low-latency-focused environments, with an understanding of analytical tools, are an advantage but not critical. Experience working on critical infrastructure, where quality, resilience and performance are essential to maintaining the reputation of the product, is desirable. Scripting skills, including Bash, Perl or Python, would be an advantage. The ability to develop automated installation packages is a bonus, but a willingness to learn new tools is just as meaningful. Successful candidates should have excellent verbal communication and written documentation skills. They should work well in a distributed team and be flexible about working with people in other time zones. Attention to detail is expected; quality and service stability are key. Candidates should have the ability to quickly understand the wider impacts of changes and technical issues, or the confidence to admit when help is needed.