Lead Cloud Engineer
Job description
Do you have a passion for technology and DevOps practices? Do you want to make a difference by developing and managing cloud infrastructure used by some of the biggest companies in the world and by a critical player in market infrastructure? We are looking for a skilled Lead Cloud Engineer with VMware, AWS, Azure, and Google experience to join the LCH SA Infrastructure & Cloud Team, which provides technical resources to Post Trade Business Services (a division of LSEG). This role is vital in designing, deploying, and maintaining resilient, scalable, and highly available cloud infrastructure. Collaboration with development teams, operations, and partners will be crucial in ensuring efficient performance and availability of our critical environments.
Key responsibilities and accountabilities
- Implements, tunes, and conducts ongoing administration of the infrastructure layer, including proposing changes, better uses, and improvements to application systems.
- Improves the whole lifecycle of services from inception and design, through deployment, operation, and refinement.
- Leads support for services before they launch through activities such as system design consulting, developing features of the infrastructure platforms and frameworks, capacity planning, and launch reviews.
- Provides mentorship to other team members on handling availability and performance of critical services, on building automation to prevent problem recurrence, and on building automated responses for non-exceptional service conditions.
- Maintains services once they are live by measuring and monitoring availability, latency, and overall system health.
- Leads sustainable and efficient incident response.
- Scales systems sustainably through mechanisms like automation, and evolves systems by advocating for changes that improve reliability and velocity.
- Writes highly optimized and accurate code for LCH SA and LSEG products and solutions.
- Proactively continues to build and apply relevant domain knowledge that may relate to workflows, data pipelines, business policies, configurations and constraints.
- Supports essential processes while ensuring high quality standards are met.
What you'll be doing:
- Design, deploy, and maintain highly available and scalable infrastructure solutions to support our critical applications and services. This will incorporate security and compliance standards and requires an understanding of a broad range of IaaS and PaaS services (compute, storage, database, security services).
- Build operational tools for deployment, monitoring, and analysis of critical infrastructure. Develop and improve automation tools, scripts, and frameworks to streamline administration tasks, improve efficiency, and reduce manual effort. Build and maintain monitoring solutions to proactively identify and resolve performance issues.
- Automate deployment, configuration, and maintenance processes using infrastructure-as-code (IaC) tools and technologies.
- Run the platform: keep the infrastructure platform in operational condition.
- Collaborate with infrastructure teams to estimate resource requirements, plan for future growth, and ensure infrastructure scalability to meet evolving business needs.
- Design and implement robust backup and recovery strategies for various services, ensuring integrity and quick recovery in case of failures or disasters.
- Participate in incident management activities, including root cause analysis, mitigation, and resolution of related incidents.
- Embed into cloud projects and on-call rotations to keep your skills sharp and stay close to the operational workflows and issues.
- Work on systems: edge cases, failure modes, behaviors, specific implementations.
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience),
- Experience with VMware technology (at least 2 years),
- Experience (at least 5 years) as a Cloud Engineer, SRE, or in a similar role, with a focus on managing and maintaining cloud infrastructure in regulated and/or highly available environments,
- Expertise in the AWS, Azure, and Google cloud platforms (at least 3 years),
- Expertise (at least 3 years) in scripting languages (Python, Bash) and in automation/configuration management tools (Ansible, Puppet, Chef),
- Expertise in containerization technologies (Docker, Kubernetes) is advantageous,
- Experience in Linux/Unix systems (at least 5 years),
- Experience operating in a regulated environment with specific security constraints (at least 2 years),
- Experience in systems monitoring and capacity management (at least 5 years),
- Adheres to applicable risk control frameworks, policies and procedures.