SC Cleared Python Data Engineer - Azure & PySpark
Montash Limited
2 days ago
Role details
Contract type: Temporary contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Compensation: £104K
Job location
Tech stack
Unit Testing
Azure
Cloud Computing
Cloud Computing Security
Continuous Integration
Data Governance
Dependency Injection
DevOps
Document Management Systems
Distributed Data Store
Python
Data Processing
Test Driven Development
Data Lake
PySpark
Data Pipelines
Serverless Computing
Docker
Databricks
Job description
We are seeking an experienced Python Data Engineer to support the design, development, and optimisation of Azure-based data pipelines. The focus of this role is to deliver scalable, test-driven, and configuration-driven data processing solutions using Python, PySpark, Delta Lake, and containerised workloads. This opportunity sits within a fast-paced engineering environment, working closely with cloud, DevOps, and data science teams. Client details remain confidential.
Responsibilities
- Develop and maintain ingestion, transformation, and validation pipelines using Python and PySpark
- Implement unit and BDD testing with Behave, including mocking, patching, and dependency management
- Design and manage Delta Lake tables, ensuring ACID compliance, schema evolution, and incremental loading
- Build and maintain containerised applications using Docker for development and deployment
- Develop configuration-driven, modular, and reusable engineering solutions
- Integrate Azure services including Azure Functions, Key Vault, and Blob Storage
- Collaborate with cloud architects, data scientists, and DevOps teams on CI/CD processes and environment configuration
- Tune and troubleshoot PySpark jobs for performance in production workloads
- Maintain documentation and follow best practices in cloud security and data governance
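To give a concrete sense of the "configuration-driven, modular, and reusable" engineering style the responsibilities above describe, here is a minimal, library-free sketch. All names (`run_pipeline`, `TRANSFORMS`, the step functions) are illustrative assumptions, not part of the role's actual codebase; in practice the rows would be PySpark DataFrames rather than dicts.

```python
from typing import Callable, Iterable

# Hypothetical sketch of a configuration-driven pipeline: the config
# selects which transformations run and with what arguments, so
# behaviour changes without code changes.

Row = dict

def drop_nulls(rows: Iterable[Row], column: str) -> list[Row]:
    """Keep only rows where `column` is present and non-null."""
    return [r for r in rows if r.get(column) is not None]

def rename_column(rows: Iterable[Row], old: str, new: str) -> list[Row]:
    """Rename key `old` to `new` in every row."""
    return [{(new if k == old else k): v for k, v in r.items()} for r in rows]

# Registry maps config names to callables -- a simple form of
# dependency injection: steps are swappable and individually testable.
TRANSFORMS: dict[str, Callable] = {
    "drop_nulls": drop_nulls,
    "rename_column": rename_column,
}

def run_pipeline(rows: list[Row], config: list[dict]) -> list[Row]:
    """Apply each configured step in order."""
    for step in config:
        fn = TRANSFORMS[step["name"]]
        rows = fn(rows, **step.get("args", {}))
    return rows

config = [
    {"name": "drop_nulls", "args": {"column": "id"}},
    {"name": "rename_column", "args": {"old": "id", "new": "user_id"}},
]
data = [{"id": 1, "v": "a"}, {"id": None, "v": "b"}]
result = run_pipeline(data, config)
# result == [{"user_id": 1, "v": "a"}]
```

Because each step is a plain function keyed by name, new transformations can be added to the registry and exercised in isolation with unit tests, which is what makes the pattern suit the test-driven approach this role calls for.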
Requirements
- Strong Python programming skills with test-driven development
- Experience writing BDD scenarios and unit tests using Behave or similar tools
- Skilled in mocking, patching, and dependency injection for Python tests
- Proficiency in PySpark and distributed data processing
- Hands-on experience with Delta Lake (transactional guarantees, schema evolution, optimisation)
- Experience with Docker for development and deployment
- Familiarity with Azure Functions, Key Vault, Blob Storage or Data Lake Storage Gen2
- Experience working with configuration-driven systems
- Exposure to CI/CD tools (Azure DevOps or similar)
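As an illustration of the mocking, patching, and dependency-injection testing skills listed above, the following is a minimal sketch using the standard library's `unittest.mock`. The function under test (`load_user`) and its client interface are hypothetical examples, not taken from the role.

```python
from unittest import mock

# Hypothetical code under test: a loader that depends on an injected
# client object rather than constructing one itself, so tests can
# substitute a mock and avoid any real I/O.
def load_user(client, user_id):
    """Fetch a user record via the injected client and normalise it."""
    raw = client.get(f"/users/{user_id}")
    return {"id": user_id, "name": raw["name"].strip()}

# A Mock stands in for the real client; return_value scripts its reply.
fake_client = mock.Mock()
fake_client.get.return_value = {"name": "  Ada  "}

user = load_user(fake_client, 42)

# Assert both the result and how the dependency was used.
assert user == {"id": 42, "name": "Ada"}
fake_client.get.assert_called_once_with("/users/42")
```

The same idea extends to `mock.patch` for replacing module-level dependencies in place, and to Behave step implementations, where mocked collaborators keep BDD scenarios fast and deterministic.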
Preferred Qualifications
- Experience working with Databricks or Synapse
- Knowledge of data governance, security, and best practices in the Azure ecosystem
- Strong communication and collaboration skills, ideally within distributed teams