Lead PySpark Engineer - Data, SAS, AWS

Randstad UK
Charing Cross, United Kingdom
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
£99K

Job location

Charing Cross, United Kingdom

Tech stack

Amazon Web Services (AWS)
Data analysis
Unit Testing
Big Data
Code Review
Information Engineering
ETL
Data Mart
Data Warehousing
Software Debugging
DevOps
Distributed Data Store
Identity and Access Management
Python
Modular Design
Performance Tuning
SAS (Software)
Data Streaming
Systems Integration
Data Logging
Data Processing
Macros
Spark
Git
PySpark
SAS DI Studio
Data Pipelines

Job description

As a Lead PySpark Engineer, you will design, develop, and maintain complex data processing solutions using PySpark on AWS. You will work hands-on with code, modernising legacy data workflows and supporting large-scale SAS-to-PySpark migrations. The role requires strong engineering discipline, a deep understanding of data, and the ability to deliver production-ready data pipelines in a financial services environment.

Requirements

PySpark & Data Engineering

  • At least 5 years of hands-on PySpark experience.
  • Experience with SAS-to-PySpark migrations.
  • Proven ability to write production-ready PySpark code.
  • Strong understanding of data and data warehousing concepts, including ETL/ELT, data models, dimensions and facts, data marts, and slowly changing dimensions (SCDs); a minimal SCD sketch follows this list.
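
For illustration, here is a minimal sketch of a Type 2 slowly changing dimension update in PySpark. The table, columns, and dates (dim, customer_id, address) are hypothetical, not details of the role:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()

    # Current dimension: one open row per customer (end_date IS NULL).
    dim = spark.createDataFrame(
        [(1, "10 Old St", "2023-01-01", None)],
        "customer_id INT, address STRING, start_date STRING, end_date STRING",
    )

    # Incoming batch with a changed address for customer 1.
    incoming = spark.createDataFrame(
        [(1, "22 New Rd", "2024-06-01")],
        "customer_id INT, address STRING, load_date STRING",
    )

    # Open dimension rows whose tracked attribute changed.
    changed = (
        dim.filter(F.col("end_date").isNull())
        .alias("d")
        .join(incoming.alias("i"), "customer_id")
        .filter(F.col("d.address") != F.col("i.address"))
    )

    # Close the old version at the incoming load date...
    closed = changed.select(
        "customer_id",
        F.col("d.address").alias("address"),
        F.col("d.start_date").alias("start_date"),
        F.col("i.load_date").alias("end_date"),
    )

    # ...and open a new version from the incoming row.
    opened = changed.select(
        "customer_id",
        F.col("i.address").alias("address"),
        F.col("i.load_date").alias("start_date"),
        F.lit(None).cast("string").alias("end_date"),
    )

    # Untouched rows + closed old versions + new open versions.
    unchanged = dim.join(changed.select("customer_id"), "customer_id", "left_anti")
    result = unchanged.unionByName(closed).unionByName(opened)
    result.show()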

Spark Performance & Optimisation

  • Strong knowledge of Spark execution concepts, including partitioning, optimisation, and performance tuning (a short sketch follows this list).
  • Experience troubleshooting and improving distributed data processing pipelines.
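
As a concrete illustration of two common tuning levers, partitioning and join strategy, here is a small sketch; the dataset names and sizes are made up:

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

    # A large fact-like DataFrame and a small lookup table.
    facts = spark.range(0, 1_000_000).withColumn("region_id", F.col("id") % 10)
    regions = spark.createDataFrame(
        [(i, f"region-{i}") for i in range(10)], ["region_id", "region_name"]
    )

    # Hint a broadcast join so the small table is shipped to executors,
    # avoiding a shuffle of the large side.
    joined = facts.join(broadcast(regions), "region_id")

    # Repartition by the key used downstream to control shuffle layout,
    # then inspect the physical plan to confirm the broadcast took effect.
    joined = joined.repartition(8, "region_id")
    joined.explain()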

Python & Engineering Quality

  • Strong Python coding skills, with the ability to refactor, optimise, and stabilise existing codebases.
  • Experience implementing parameterisation, configuration, logging, exception handling, and modular design (sketched below).
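
A minimal sketch of what those practices can look like in a PySpark job is below; the config keys, paths, and function names are assumptions for illustration:

    import logging
    from pyspark.sql import SparkSession, DataFrame, functions as F

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("example_job")

    # Parameterised config rather than hard-coded paths (values are placeholders).
    CONFIG = {"input_path": "/tmp/in.parquet", "output_path": "/tmp/out.parquet"}


    def clean_amounts(df: DataFrame) -> DataFrame:
        """Pure, testable transform: drop rows with non-positive amounts."""
        return df.filter(F.col("amount") > 0)


    def run(config: dict) -> None:
        spark = SparkSession.builder.appName("example_job").getOrCreate()
        try:
            df = spark.read.parquet(config["input_path"])
            log.info("Read %d rows from %s", df.count(), config["input_path"])
            clean_amounts(df).write.mode("overwrite").parquet(config["output_path"])
            log.info("Wrote output to %s", config["output_path"])
        except Exception:
            log.exception("Job failed; re-raising for the scheduler to handle")
            raise


    if __name__ == "__main__":
        run(CONFIG)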

SAS & Legacy Analytics

  • Strong foundation in SAS (Base SAS, SAS Macros, SAS DI Studio).
  • Experience understanding, debugging, and modernising legacy SAS code (see the sketch after this list).
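
By way of example, a typical modernisation step turns a parameterised SAS macro into an ordinary Python function. The original macro (%filter_year) and its columns are hypothetical:

    from pyspark.sql import SparkSession, DataFrame, functions as F

    spark = SparkSession.builder.appName("sas-migration-sketch").getOrCreate()

    sales = spark.createDataFrame(
        [(2022, 120.0), (2023, 250.0), (2023, 80.0)], ["year", "amount"]
    )


    def filter_year(df: DataFrame, year: int, high_value: float = 100.0) -> DataFrame:
        """PySpark equivalent of a SAS macro like
        %filter_year(in=sales, out=sales_2023, year=2023):
        a WHERE year = &year filter plus an IF/THEN-derived flag."""
        return df.filter(F.col("year") == year).withColumn(
            "high_value", (F.col("amount") > high_value).cast("int")
        )


    filter_year(sales, 2023).show()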

Data Engineering & Testing

  • Ability to understand end-to-end data flows, integrations, orchestration, and change data capture (CDC).
  • Experience writing and executing data and ETL test cases.
  • Ability to build unit tests, run comparative tests, and validate data pipelines (a pytest sketch follows this list).
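
A short sketch of unit and comparative testing with pytest follows; the transform under test (clean_amounts) is an illustrative assumption, not a named project function:

    import pytest
    from pyspark.sql import SparkSession, DataFrame, functions as F


    @pytest.fixture(scope="session")
    def spark():
        return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


    def clean_amounts(df: DataFrame) -> DataFrame:
        return df.filter(F.col("amount") > 0)


    def test_clean_amounts_drops_non_positive(spark):
        df = spark.createDataFrame([(1, 10.0), (2, -5.0), (3, 0.0)], ["id", "amount"])
        result = clean_amounts(df)

        # Comparative check: collected output matches the expected rows exactly.
        expected = [(1, 10.0)]
        assert sorted(result.select("id", "amount").collect()) == expected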

Engineering Practices

  • Proficiency in Git-based workflows, branching strategies, pull requests, and code reviews.
  • Ability to document code, data flows, and technical decisions clearly.
  • Exposure to CI/CD pipelines for data engineering workloads.

AWS & Platform Skills

  • Strong understanding of core AWS services, including S3, EMR/Glue, Workflows, Athena, and IAM.
  • Experience building and operating data pipelines on AWS (a minimal sketch follows this list).
  • Experience with big data processing on cloud platforms.
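
For illustration, a minimal PySpark step on AWS, reading raw CSV from S3 and writing partitioned Parquet, is sketched below; the bucket and paths are placeholders, and on EMR or Glue the S3 connector and IAM role come from the platform:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("s3-pipeline-sketch").getOrCreate()

    raw = spark.read.option("header", "true").csv("s3://example-bucket/raw/orders/")

    curated = raw.withColumn("order_date", F.to_date("order_date"))

    # Partition by date so Athena and downstream Spark jobs can prune scans.
    (
        curated.write.mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-bucket/curated/orders/")
    )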

Desirable Skills

  • Experience in banking or financial services.
  • Experience working on SAS modernisation or cloud migration programmes.
  • Familiarity with DevOps practices and tools.
  • Experience working in Agile/Scrum delivery environments.

Apply for this position