Developer

Roleyou

Bramley, United Kingdom

1 month ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Bramley, United Kingdom

Tech stack

API

Azure

Code Generation

Software Quality

Data Validation

Data Governance

ETL

Data Masking

Data Systems

Distributed Computing Environment

Hive

Infrastructure as a Service (IaaS)

JSON

Jinja (Template Engine)

Python

NoSQL

Raw Data

Reference Data

Mockito

Kusto Query Language

Data Streaming

YAML

Enterprise Data Management

Parquet

Datadog

Data Processing

Spark

Indexer

Gitlab

Microsoft Fabric

Pytest

Data Lake

PySpark

Gitlab-ci

Integration Tests

Cosmos DB

Data Pipelines

Artifactory

Job description

The Role

You will be part of a specialist engineering team responsible for designing, building, and optimising end-to-end financial instrument mastering pipelines. These pipelines span ingestion, normalisation, bi-temporal processing, and publication into enterprise data platforms.

You will work closely with data architects, domain experts, and QC engineers to deliver scalable, reliable, and high-performance data solutions across Azure and Microsoft Fabric ecosystems.

Key Responsibilities

Build and maintain PySpark-based data pipelines for financial instrument mastering across multiple data sources
Design and implement bi-temporal data processing models (system time + valid time) including Slice, Resolve, Coalesce, and Diff logic
Develop optimised Azure Cosmos DB data models, including partitioning, indexing, change feed processing, and point-read optimisation
Integrate external APIs for entity resolution and matching services (PermID / IAAS) with robust retry and batching mechanisms
Design publication pipelines to convert bi-temporal data into uni-temporal outputs and publish via Microsoft Fabric / Parquet-based lakehouse architectures
Implement data quality frameworks using Great Expectations to ensure accuracy and compliance
Build robust unit and integration tests using PyTest for PySpark and Cosmos DB components
Support and maintain CI/CD pipelines (GitLab CI) including Python packaging, Artifactory deployment, and ARM-based infrastructure provisioning
Work with YAML-driven configuration for mastering rules, schemas, and environment setup
Monitor and troubleshoot production pipelines using Eventstream telemetry, KQL, and DataDog observability tools
Deliver scalable transformation logic, optimised aggregations, and high-performance data processing workflows
Implement data governance controls including data masking, role-based access, and compliance policies
Continuously tune and optimise workloads for performance, cost efficiency, and reliability

Requirements

Required Skills & Experience

Strong experience in Python and PySpark (Spark SQL, DataFrame API, Structured Streaming)
Hands-on experience building large-scale ETL / streaming data pipelines
Experience working with Azure Cosmos DB (NoSQL) including data modelling and performance tuning
Strong knowledge of Azure Data Lake Storage (ADLS / OneLake / ABFS)
Experience implementing bi-temporal or SCD Type 2 data models
Strong understanding of data quality frameworks (e.g., Great Expectations)
Experience with CI/CD pipelines (GitLab / Azure DevOps) and automated deployments
Strong testing discipline using PyTest, mocking, and integration testing approaches
Experience working with YAML/JSON configuration and infrastructure-as-code (ARM templates)
Strong understanding of distributed data processing and Spark-based architectures
Experience working with financial or time-series datasets (market data, reference data, risk data preferred)
Strong communication skills and ability to work with cross-functional stakeholders

Desirable Experience

Microsoft Fabric (Notebooks, Eventstream, Lakehouses, Spark Job Definitions)
Financial instrument/reference data (ISIN, CUSIP, LEI, PermID)
Entity resolution / matching systems and enrichment APIs
Delta Lake and Change Data Feed (CDF)
Cosmos DB performance optimisation (RU tuning, bulk operations, concurrency)
Jinja2 templating or code generation approaches
SonarQube or similar code quality tooling
Monorepo development with modern Python packaging tools (uv / Hatchling)
Power BI / semantic modelling experience
Knowledge of financial compliance standards (GDPR, SOX)

Technology Stack

Python 3.11+, PySpark 3.5, Spark SQL
Azure Cosmos DB, ADLS, OneLake, Delta Lake, Parquet
Microsoft Fabric (Eventstream, Notebooks, Lakehouse)
Great Expectations, LSEG Data Validation frameworks
GitLab CI/CD, JFrog Artifactory, ARM Templates
DataDog, Eventstream, KQL monitoring
Azure Key Vault, Azure CLI, Fabric APIs

Why Join

Work on a global financial markets transformation programme
Hands-on with next-generation Azure + Fabric data platforms
Exposure to bi-temporal modelling and financial instrument mastering systems
High-impact engineering role with modern cloud and streaming architecture
Opportunity to work with leading domain and technical experts in a regulated environment
