PySpark Developer
Job description
We are looking for an experienced PySpark Developer with strong Microsoft Fabric and Azure engineering skills to join a major transformation programme within the financial-markets domain. This role is fully hands-on, focused on building and optimising large-scale data pipelines, dataflows, semantic models, and lakehouse components.
Key Responsibilities
- Design, build and optimise Spark-based data pipelines for batch and streaming workloads (see the illustrative sketch after this list)
- Develop Fabric dataflows, pipelines, and semantic models
- Implement complex transformations, joins, aggregations and performance tuning
- Build and optimise Delta Lake / Delta tables
- Develop secure data solutions including role-based access, data masking and compliance controls
- Implement data validation, cleansing, profiling and documentation
- Work closely with analysts and stakeholders to translate requirements into scalable technical solutions
- Troubleshoot and improve reliability, latency and workload performance
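As an illustration of the day-to-day pipeline work described above, here is a minimal PySpark sketch of a batch aggregation written to a Delta table. It is indicative only: all table, column and path names (raw_trades, trade_ts, symbol, price, qty, daily_trade_summary) are hypothetical placeholders, not part of the programme's actual data model.

    # Minimal batch pipeline sketch: read raw records, aggregate, write a Delta table.
    # All names and paths below are hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("daily-trade-aggregation").getOrCreate()

    # Read raw trade records from a lakehouse landing zone (placeholder path)
    trades = spark.read.format("delta").load("Tables/raw_trades")

    # Typical transformations: filter, derive a date column, aggregate per day and symbol
    daily_summary = (
        trades
        .filter(F.col("price") > 0)
        .withColumn("trade_date", F.to_date("trade_ts"))
        .groupBy("trade_date", "symbol")
        .agg(
            F.sum(F.col("price") * F.col("qty")).alias("notional"),
            F.count("*").alias("trade_count"),
        )
    )

    # Write the result as a partitioned Delta table for downstream consumption
    (
        daily_summary.write.format("delta")
        .mode("overwrite")
        .partitionBy("trade_date")
        .save("Tables/daily_trade_summary")
    )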
Requirements
- Strong hands-on experience with PySpark, Spark SQL, Spark Streaming, DataFrames
- Microsoft Fabric (Fabric Spark jobs, dataflows, pipelines, semantic models)
- Azure: ADLS, cloud data engineering, notebooks
- Python programming; Java exposure beneficial
- Delta Lake / Delta table optimisation experience (see the maintenance sketch at the end of this posting)
- Git / GitLab, CI/CD pipelines, DevOps practices
- Strong troubleshooting and problem-solving ability
- Experience with lakehouse architectures, ETL workflows, and distributed computing
- Familiarity with time-series, market data, transactional data or risk metrics
Nice to Have
- Power BI dataset preparation
- OneLake, Azure Data Lake, Kubernetes, Docker
- Knowledge of financial regulations (GDPR, SOX)
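For the Delta Lake / Delta table optimisation requirement above, the following is a minimal maintenance sketch, assuming a Spark runtime where the Delta Lake Python API (delta-spark 2.x or later, as in Fabric Spark) is available; the table path is a hypothetical placeholder carried over from the earlier sketch.

    # Minimal Delta table maintenance sketch; the path is a hypothetical placeholder.
    from pyspark.sql import SparkSession
    from delta.tables import DeltaTable

    spark = SparkSession.builder.getOrCreate()

    table = DeltaTable.forPath(spark, "Tables/daily_trade_summary")

    # Compact the many small files produced by frequent incremental writes
    table.optimize().executeCompaction()

    # Remove data files no longer referenced by the table (default retention applies)
    table.vacuum()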