Senior Data Engineer
Job description
Senior Data Engineer - PySpark, GCP
We're recruiting for a Senior Data Engineer with strong PySpark skills to join a growing team building out a modern lakehouse platform in GCP.
This is a hands-on role for someone who can take ownership, work independently, work iteratively and deliver quickly.
Key skills and experience:
- Strong experience as a Data Engineer in cloud environments (ideally GCP, but not essential)
- Hands-on experience with PySpark and SQL
- Experience building data lake/lakehouse platforms
- Strong understanding of distributed data processing (Spark)
- Experience with Airflow or similar orchestration tools
- Ability to design and deliver pipelines end-to-end, independently
Nice to have:
- Strong GCP experience
- Terraform/Infrastructure-as-Code
- Experience with data quality tools (Great Expectations, Soda)
- Exposure to data catalog/governance tools
- Experience working towards or setting up AI and agentic workflows
What you'll be doing:
- Building and optimising PySpark data pipelines (see the illustrative sketch after this list)
- Orchestrating workflows using Airflow
- Designing scalable lakehouse architectures
- Implementing data quality and validation
- Collaborating with analysts to deliver usable datasets
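For illustration only, here is a minimal sketch of the kind of pipeline work described above: reading raw data, applying a simple transformation, running a basic validation check before publishing, and writing a curated dataset for analysts. All paths, column names and thresholds are hypothetical and are not taken from the client's platform.

from pyspark.sql import SparkSession, functions as F

# Hypothetical locations for illustration only - not the client's actual buckets.
RAW_PATH = "gs://example-bucket/raw/orders/"
CURATED_PATH = "gs://example-bucket/curated/orders/"

spark = SparkSession.builder.appName("orders_pipeline").getOrCreate()

# Ingest raw data and apply a simple, idempotent transformation.
orders = (
    spark.read.parquet(RAW_PATH)
    .withColumn("order_date", F.to_date("order_timestamp"))
    .dropDuplicates(["order_id"])
)

# Lightweight validation before publishing: fail fast on null business keys.
null_keys = orders.filter(F.col("order_id").isNull()).count()
if null_keys > 0:
    raise ValueError(f"Found {null_keys} rows with a null order_id")

# Write the curated dataset, partitioned by date for downstream consumers.
orders.write.mode("overwrite").partitionBy("order_date").parquet(CURATED_PATH)

In practice a job like this would typically be triggered from an Airflow DAG and the validation step replaced with a dedicated data quality tool such as Great Expectations or Soda.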
This role will be largely remote, with just 1 day per week required in Central London. It will run for an initial 3 months with a likelihood of extension and will be paid at £500 - £550 per day, outside IR35.
For more information on this excellent client and role, please respond with an up-to-date CV via the links provided.