Data Engineer (Kafka & Hadoop) - Inside IR35 - Hybrid
Tenth Revolution Group
Sheffield, United Kingdom
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: £124K
Job location: Sheffield, United Kingdom
Tech stack
Batch Processing
Computer Programming
Data Transformation
Serialization
Distributed Systems
Document-Oriented Databases
Protocol Buffers
Hadoop
JSON
Python
Performance Tuning
Data Streaming
Systems Integration
Parquet
Data Logging
Feature Engineering
Spark
Data Lineage
Avro
Real Time Data
Kafka
Splunk
Data Pipelines
Job description
- Design and develop Kafka-based streaming applications using Kafka Streams and/or ksqlDB in Scala and Python for data transformation, enrichment, and routing.
- Build and maintain end-to-end streaming pipelines, including producers, stream processors, and consumers, with robust handling for data quality, idempotency, and dead-letter queue (DLQ) patterns.
- Define and manage topics, schemas, and data contracts using Avro, Protobuf, or JSON, ensuring backward and forward compatibility.
- Develop batch and streaming interoperability using Spark/Structured Streaming for aggregation, feature engineering, and storage in columnar formats such as Parquet and ORC.
- Integrate processed data into analytics and observability platforms (e.g., Splunk) to enable dashboards, alerting, and proactive insights.
- Build automated validation, replay, and backfill mechanisms to ensure pipeline reliability and adherence to SLAs.
- Implement observability for streaming systems, including metrics, distributed tracing, and structured logging; optimize for performance and cost.
- Collaborate with platform and infrastructure teams responsible for Kafka cluster administration, while owning application-layer streaming logic.
- Ensure security and compliance across data pipelines, including authentication/authorization, encryption in transit and at rest, and secure secret management.
- Document data flows, schemas, architecture decisions, and operational runbooks for streaming services.
Requirements
We are seeking a highly skilled Senior Streaming Data Engineer to design, build, and operate scalable, real-time data streaming applications. You will play a critical role in developing Kafka-based pipelines, ensuring high data quality, reliability, and performance, while enabling downstream analytics and observability use cases.
- Kafka Development: Strong experience with Kafka Streams and/or ksqlDB, including producer/consumer patterns, partitioning strategies, serialization, and delivery semantics (exactly-once, at-least-once).
- Programming: Proficiency in Scala and/or Python, with experience in testing frameworks and CI/CD pipelines for streaming applications.
- Schema Management: Hands-on experience with schema formats (Avro, Protobuf, JSON), schema registries, and compatibility strategies.
- Stream & Batch Processing: Expertise in Apache Spark (including Structured Streaming), file formats (Parquet/ORC), and performance optimization techniques (partitioning, bucketing).
- Data Reliability: Strong understanding of idempotent processing, DLQs, replay/backfill strategies, data lineage, and SLA-driven design.
- Observability: Experience implementing monitoring, logging, and tracing for distributed systems and integrating with alerting/dashboards.
- Security & Compliance: Knowledge of authentication/authorization mechanisms, TLS/SASL, and secrets management best practices.
- Collaboration & Communication: Ability to work closely with platform and infrastructure teams, with strong documentation and communication skills.
About the company
Tenth Revolution Group is the go-to recruiter for Data & AI roles in the UK, offering more opportunities across the country than any other recruitment agency. We're the proud sponsor and supporter of SQLBits, the Power Platform World Tour, and the London Fabric User Group. We are the global leaders in Data & AI recruitment.