Senior Lead Software Engineering - AI/ML Engineer

JPMorgan Chase & Co.
Charing Cross, United Kingdom
6 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Charing Cross, United Kingdom

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Data Systems
Disaster Recovery
Python
Machine Learning
Reliability Engineering
Prometheus
Software Engineering
Datadog
Snowflake
Grafana
PySpark
Kubernetes
Data Management
Splunk
Dynatrace
Databricks

Job description

As a Site Reliability Engineer in the AI/ML Data Platforms team, you will play a key role in building scalable and resilient data solutions. You will engage in root cause analysis, production changes, and operational improvements, while supporting budgetary and staffing decisions. You will mentor team members and partner with colleagues across the organization to drive strategic change. Your contributions will help shape a collaborative, innovative, and high-performing team culture.

  • Demonstrate expertise in application development and support across technologies such as Databricks, Snowflake, AWS, and Kubernetes
  • Coordinate incident management coverage to ensure effective resolution of application issues
  • Collaborate with cross-functional teams to perform root cause analysis and implement production changes
  • Develop and support AI/ML solutions for troubleshooting and incident resolution
  • Mentor and guide team members to foster growth and drive strategic change
  • Build and maintain scalable, resilient, and market-leading data solutions
  • Support budgetary and staffing considerations to optimize team performance
  • Engage in operational stability and disaster recovery planning
  • Implement automation tools to reduce toil and improve efficiency
  • Ensure compliance with risk controls and company-wide standards
  • Build meaningful relationships across teams to achieve common goals

Requirements

  • Proficient in site reliability culture and principles, with experience implementing site reliability within applications or platforms
  • Skilled in running production incident calls and managing incident resolution
  • Experienced in observability, including white-box and black-box monitoring, service level objective (SLO) alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk
  • Strong understanding of SLIs, SLOs, SLAs, and error budgets
  • Proficient in Python or PySpark for AI/ML modeling
  • Able to reduce toil by building automation tools for repeated tasks
  • Hands-on experience in system design, resiliency, testing, operational stability, and disaster recovery
  • Awareness of risk controls and compliance with departmental and company-wide standards
  • Collaborative team player with the ability to build meaningful relationships

Preferred Qualifications, Capabilities, and Skills:

  • Experience in an SRE or production support role with AWS Cloud, Databricks, Snowflake, or similar technologies
  • AWS and Databricks certifications
  • Advanced knowledge of AI/ML troubleshooting and incident resolution
  • Familiarity with budgetary and staffing optimization
  • Experience mentoring and guiding team members
  • Strong communication and interpersonal skills
  • Demonstrated ability to drive strategic change across teams

Benefits & conditions

We offer a competitive total rewards package, including a base salary determined by role, experience, skill set and location. Those in eligible roles may receive commission-based pay and/or discretionary incentive compensation, paid in the form of cash and/or forfeitable equity, awarded in recognition of individual achievements and contributions. We also offer a range of benefits and programs to meet employee needs, based on eligibility. These benefits include comprehensive health care coverage, on-site health and wellness centers, a retirement savings plan, backup childcare, tuition reimbursement, mental health support, financial coaching and more. Additional details about total compensation and benefits will be provided during the hiring process.

About the company

Join us to shape the future of AI/ML data platforms, where your expertise will help create resilient and market-leading solutions. You will have the opportunity to collaborate with innovators across our global network, driving strategic change and mentoring others. We value your skills in solving complex challenges and fostering a culture of reliability and growth. At JPMorganChase, your impact will reach far beyond your team, opening doors to career advancement and meaningful relationships.

JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.

J.P. Morgan's Commercial & Investment Bank is a global leader across banking, markets, securities services and payments. Corporations, governments and institutions throughout the world entrust us with their business in more than 100 countries. The Commercial & Investment Bank provides strategic advice, raises capital, manages risk and extends liquidity in markets around the world.
