SC Cleared Python Data Engineer - Azure & PySpark
Montash Limited
2 days ago
Role details
Contract type: Temporary contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Compensation: £104K
Job location
Tech stack
Unit Testing
Azure
Cloud Computing
Cloud Computing Security
Continuous Integration
Data Governance
Dependency Injection
DevOps
Document Management Systems
Distributed Data Store
Python
Data Processing
Test Driven Development
Data Lake
PySpark
Data Pipelines
Serverless Computing
Docker
Databricks
Job description
We are seeking an experienced Python Data Engineer to support the design, development, and optimisation of Azure-based data pipelines. The focus of this role is to deliver scalable, test-driven, and configuration-driven data processing solutions using Python, PySpark, Delta Lake, and containerised workloads. This opportunity sits within a fast-paced engineering environment, working closely with cloud, DevOps, and data science teams. Client details remain confidential.
Responsibilities
- Develop and maintain ingestion, transformation, and validation pipelines using Python and PySpark
- Implement unit and BDD testing with Behave, including mocking, patching, and dependency management
- Design and manage Delta Lake tables, ensuring ACID compliance, schema evolution, and incremental loading
- Build and maintain containerised applications using Docker for development and deployment
- Develop configuration-driven, modular, and reusable engineering solutions
- Integrate Azure services including Azure Functions, Key Vault, and Blob Storage
- Collaborate with cloud architects, data scientists, and DevOps teams on CI/CD processes and environment configuration
- Tune and troubleshoot PySpark jobs for performance in production workloads
- Maintain documentation and follow best practices in cloud security and data governance
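To give a concrete sense of the "configuration-driven, modular, and reusable" engineering style the responsibilities above describe, here is a minimal, library-free sketch. All names (`run_pipeline`, `TRANSFORMS`, the step functions) are illustrative assumptions, not part of the role's actual codebase; in practice the rows would be PySpark DataFrames rather than dicts.

```python
from typing import Callable, Iterable

# Hypothetical sketch of a configuration-driven pipeline: the config
# selects which transformations run and with what arguments, so
# behaviour changes without code changes.

Row = dict

def drop_nulls(rows: Iterable[Row], column: str) -> list[Row]:
    """Keep only rows where `column` is present and non-null."""
    return [r for r in rows if r.get(column) is not None]

def rename_column(rows: Iterable[Row], old: str, new: str) -> list[Row]:
    """Rename key `old` to `new` in every row."""
    return [{(new if k == old else k): v for k, v in r.items()} for r in rows]

# Registry maps config names to callables -- a simple form of
# dependency injection: steps are swappable and individually testable.
TRANSFORMS: dict[str, Callable] = {
    "drop_nulls": drop_nulls,
    "rename_column": rename_column,
}

def run_pipeline(rows: list[Row], config: list[dict]) -> list[Row]:
    """Apply each configured step in order."""
    for step in config:
        fn = TRANSFORMS[step["name"]]
        rows = fn(rows, **step.get("args", {}))
    return rows

config = [
    {"name": "drop_nulls", "args": {"column": "id"}},
    {"name": "rename_column", "args": {"old": "id", "new": "user_id"}},
]
data = [{"id": 1, "v": "a"}, {"id": None, "v": "b"}]
result = run_pipeline(data, config)
# result == [{"user_id": 1, "v": "a"}]
```

Because each step is a plain function keyed by name, new transformations can be added to the registry and exercised in isolation with unit tests, which is what makes the pattern suit the test-driven approach this role calls for.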
Requirements
- Strong Python programming skills with test-driven development
- Experience writing BDD scenarios and unit tests using Behave or similar tools
- Skilled in mocking, patching, and dependency injection for Python tests
- Proficiency in PySpark and distributed data processing
- Hands-on experience with Delta Lake (transactional guarantees, schema evolution, optimisation)
- Experience with Docker for development and deployment
- Familiarity with Azure Functions, Key Vault, Blob Storage or Data Lake Storage Gen2
- Experience working with configuration-driven systems
- Exposure to CI/CD tools (Azure DevOps or similar)
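As an illustration of the mocking, patching, and dependency-injection testing skills listed above, the following is a minimal sketch using the standard library's `unittest.mock`. The function under test (`load_user`) and its client interface are hypothetical examples, not taken from the role.

```python
from unittest import mock

# Hypothetical code under test: a loader that depends on an injected
# client object rather than constructing one itself, so tests can
# substitute a mock and avoid any real I/O.
def load_user(client, user_id):
    """Fetch a user record via the injected client and normalise it."""
    raw = client.get(f"/users/{user_id}")
    return {"id": user_id, "name": raw["name"].strip()}

# A Mock stands in for the real client; return_value scripts its reply.
fake_client = mock.Mock()
fake_client.get.return_value = {"name": "  Ada  "}

user = load_user(fake_client, 42)

# Assert both the result and how the dependency was used.
assert user == {"id": 42, "name": "Ada"}
fake_client.get.assert_called_once_with("/users/42")
```

The same idea extends to `mock.patch` for replacing module-level dependencies in place, and to Behave step implementations, where mocked collaborators keep BDD scenarios fast and deterministic.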
Preferred Qualifications
- Experience working with Databricks or Synapse
- Knowledge of data governance, security, and best practices in the Azure ecosystem
- Strong communication and collaboration skills, ideally within distributed teams