Data Engineer (Kafka & Hadoop) - Inside IR35 - Hybrid
Tenth Revolution Group
Sheffield, United Kingdom
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: £124K
Job location: Sheffield, United Kingdom
Tech stack
Batch Processing
Computer Programming
Data Transformation
Serialization
Distributed Systems
Document-Oriented Databases
Protocol Buffers
Hadoop
JSON
Python
Performance Tuning
Data Streaming
Systems Integration
Parquet
Data Logging
Feature Engineering
Spark
Data Lineage
Avro
Real Time Data
Kafka
Splunk
Data Pipelines
Job description
- Design and develop Kafka-based streaming applications using Kafka Streams and/or ksqlDB in Scala and Python for data transformation, enrichment, and routing.
- Build and maintain end-to-end streaming pipelines, including producers, stream processors, and consumers, with robust handling for data quality, idempotency, and dead-letter queue (DLQ) patterns.
- Define and manage topics, schemas, and data contracts using Avro, Protobuf, or JSON, ensuring backward and forward compatibility.
- Develop batch and streaming interoperability using Spark/Structured Streaming for aggregation, feature engineering, and storage in columnar formats such as Parquet and ORC.
- Integrate processed data into analytics and observability platforms (e.g., Splunk) to enable dashboards, alerting, and proactive insights.
- Build automated validation, replay, and backfill mechanisms to ensure pipeline reliability and adherence to SLAs.
- Implement observability for streaming systems, including metrics, distributed tracing, and structured logging; optimize for performance and cost.
- Collaborate with platform and infrastructure teams responsible for Kafka cluster administration, while owning application-layer streaming logic.
- Ensure security and compliance across data pipelines, including authentication/authorization, encryption in transit and at rest, and secure secret management.
- Document data flows, schemas, architecture decisions, and operational runbooks for streaming services.
Requirements
We are seeking a highly skilled Senior Streaming Data Engineer to design, build, and operate scalable, real-time data streaming applications. You will play a critical role in developing Kafka-based pipelines, ensuring high data quality, reliability, and performance, while enabling downstream analytics and observability use cases.
- Kafka Development: Strong experience with Kafka Streams and/or ksqlDB, including producer/consumer patterns, partitioning strategies, serialization, and delivery semantics (exactly-once, at-least-once).
- Programming: Proficiency in Scala and/or Python, with experience in testing frameworks and CI/CD pipelines for streaming applications.
- Schema Management: Hands-on experience with schema formats (Avro, Protobuf, JSON), schema registries, and compatibility strategies.
- Stream & Batch Processing: Expertise in Apache Spark (including Structured Streaming), file formats (Parquet/ORC), and performance optimization techniques (partitioning, bucketing).
- Data Reliability: Strong understanding of idempotent processing, DLQs, replay/backfill strategies, data lineage, and SLA-driven design.
- Observability: Experience implementing monitoring, logging, and tracing for distributed systems and integrating with alerting/dashboards.
- Security & Compliance: Knowledge of authentication/authorization mechanisms, TLS/SASL, and secrets management best practices.
- Collaboration & Communication: Ability to work closely with platform and infrastructure teams, with strong documentation and communication skills.
About the company
Tenth Revolution Group is the go-to recruiter for Data & AI roles in the UK, offering more opportunities across the country than any other recruitment agency. We're the proud sponsor and supporter of SQLBits, the Power Platform World Tour, and the London Fabric User Group. We are the global leaders in Data & AI recruitment.