Data Engineer
Job description
We see a Data Engineer as a software engineer who specialises in distributed data systems. You'll join the Data Engineering team, whose prime responsibility is the development and operation of the Data Collection Hub, a platform that ingests data from many sources, processes and enriches it, and distributes it to multiple downstream systems.

- Engineer distributed ingestion services that reliably pull data from diverse sources, handle messy real-world edge cases, and deliver clean, well-structured outputs to multiple downstream products.
- Build high-throughput processing components (batch and/or near-real-time) with a focus on performance, scalability, and predictable cost, using strong profiling and measurement practices.
- Design and evolve data contracts (schemas, validation rules, versioning, backward compatibility) so downstream teams can build with confidence.
- Own production quality: write maintainable code, strong unit/integration tests, and add the observability you need (metrics/logs/tracing) to diagnose issues quickly.
- Improve platform reliability by hardening pipelines against partial failures, retries, rate limits, data drift, and infrastructure issues, then codify those learnings into better tooling and guardrails.
- Contribute to CI/CD and developer experience: faster builds, better test signal, safer releases, and automated operational checks.
- Participate in design reviews, code reviews, incident retrospectives, and iterative delivery, making pragmatic trade-offs and documenting them clearly.
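To give a flavour of the reliability work described above: hardening an ingestion pipeline against transient source failures and rate limits often comes down to patterns like bounded retries with jittered exponential backoff. A minimal sketch in Python (all names here are illustrative, not part of the actual platform):

```python
import random
import time


def fetch_with_retries(fetch, max_attempts=5, base_delay=0.5, max_delay=30.0):
    """Call `fetch` with bounded retries and jittered exponential backoff.

    Illustrative only: `fetch` stands in for any flaky source call
    (an HTTP pull, a queue read, etc.) that raises on transient failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff with full jitter, so many workers
            # retrying at once don't hammer the source in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            time.sleep(random.uniform(0, delay))
```

In practice this logic usually lives in shared tooling (or an orchestrator's retry settings) rather than being hand-rolled per pipeline, which is exactly the "codify learnings into guardrails" part of the role.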
Technology Stack
- Languages: Predominantly Python and Node.js
- Distributed/data platforms: HDFS, HBase, Spark, plus increasing use of Kubernetes and cloud services
- Storage/search: MongoDB, OpenSearch
- Orchestration: Airflow, Dagster, NiFi
- Tooling: GitHub, GitHub Actions, Rundeck, Jira, Confluence
- Deployment/config: Ansible (physical), Terraform / Argo CD / Helm (Kubernetes)
- Development environment: MacBook (typical)
Requirements
We're looking for someone with 2+ years of industry experience building and operating production software who enjoys working across data pipelines, distributed systems, and operational reliability.

Essential:
- 2+ years building and operating production software systems
- Fluency in at least one programming language (Python or Node.js a plus)
- Experience debugging moderately complex systems and improving reliability/performance
- Strong fundamentals: data structures, testing, version control, Linux basics
Nice to have:
- Spark/PySpark experience
- Hadoop ecosystem exposure (HDFS/HBase)
- Workflow orchestration (Airflow/Dagster/NiFi)
- Search/indexing (OpenSearch, MongoDB)
- Kubernetes and infrastructure-as-code
- Degree in Computer Science or another numerate discipline
Benefits & conditions
- Competitive salary, depending on experience
- 25 days annual leave + your birthday off, in addition to bank holidays, rising to 30 days after 5 years of service
- Remote working
- Private family healthcare
- 35-hour working week
- Employee Assistance Programme
- Company contributions to your pension
- Pension salary sacrifice
- Enhanced maternity/paternity pay
- The latest tech, including a top-of-the-range MacBook Pro