Data Engineer

Randstad
Malvern, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$ 132K

Job location

Malvern, United States of America

Tech stack

Audit Trail
Databases
Data Validation
ETL
Data Retention
Relational Databases
Database Queries
Python
Operational Databases
Migration Manager
SQL Databases
Data Layers
Database Migration
InfluxDB
Data Pipelines

Job description

We need a data engineer who has designed and operated production data pipelines - not just written queries, but owned the storage layer, migration strategy, and transformation pipelines that other components depend on. The work involves migrating a live system's data layer to a production-grade, domain-appropriate store, designing transformation pipelines that normalize disparate sources into clean schemas, and building data quality instrumentation that catches failures before they surface as bad output.

Python proficiency is required. The transformation pipelines are Python, and this engineer must contribute to them at a production quality level.

What We Need:

  • Schema design

Has designed relational database schemas from scratch - tables, column types, normalization strategy, primary and foreign keys, and upfront indexing decisions. Has also inherited a schema, audited it, and documented what is actually used versus what is vestigial.

  • Live database migration

Has migrated a production database without taking the system offline. Can describe a dual-write migration, how to validate data consistency at each stage, what the rollback plan looks like, and what failure modes they have actually encountered - not just read about.

Requirements

  • Production Python

3-5 years writing Python at a production quality level - engineered pipelines with structured error handling, retry logic, observability, and lifecycle management. Not data science notebooks. Must read and extend existing Python components confidently at the same quality standard as the rest of the team.

  • SQL fluency

Can write complex queries with joins, window functions, CTEs, and subqueries. Can read a query plan, identify why a query is slow, design the right index for a specific access pattern, and reason about cost under volume.

  • Transformation pipeline design

Has built ETL or ELT pipelines that take heterogeneous data sources, validate and normalize them, and load them into a target schema. Understands schema contracts: what happens when an upstream source changes shape, how you detect it, and how you prevent silent data corruption downstream.

  • Data quality instrumentation

Has added data quality checks to a production pipeline - completeness rates, freshness metrics, schema conformance validation - and configured alerting when those metrics degrade. Has experienced a silent data quality failure in production and can describe how they eventually detected it.

Strong preference for: experience with time-series databases (TimescaleDB, InfluxDB, or equivalent), and experience working in a regulated environment where data retention and audit trail requirements are non-negotiable engineering constraints.
