Senior Data Engineer

Hack The Box
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote

Tech stack

Artificial Intelligence
Airflow
Data analysis
BigTable
Google BigQuery
Software as a Service
Cloud Computing
Cloud Storage
Continuous Integration
ETL
Dimensional Modeling
Data Flow Control
GitHub
Python
Machine Learning
Online Analytical Processing
Standard SQL
Software Engineering
SQL Databases
Data Streaming
Workflow Management Systems
Google Cloud Platform
Feature Engineering
Snowflake
Spark
Build Management
Debezium
Kubernetes
Apache Flink
Production Code
Kafka
Spark Streaming
Machine Learning Operations
Vertica
REST
Apache Beam
Docker

Job description

Your day-to-day will include designing ELT/ETL processes on BigQuery and ClickHouse, building real-time pipelines on Pub/Sub and Kafka with Dataflow (and where it fits, Flink/Spark), orchestrating workflows with Airflow, and ensuring data is properly cleaned, modelled, and served for analytics, ML training, and online inference. You'll partner with ML engineers on feature pipelines, monitoring data drift, and keeping models well-fed and retrained as needed. You'll consume and build REST APIs, integrate with third-party SaaS sources, and treat infrastructure as code.

You will be part of the Data, Analytics & AI team, collaborating closely with Infrastructure, Software Engineering, Product, and ML/AI engineers. We're in the middle of a GCP-native modernisation - migrating away from Snowflake toward BigQuery, Bigtable, Pub/Sub, and Dataflow - so we're looking for someone who's opinionated about clean architecture, allergic to over-engineering, and comfortable owning systems end-to-end. If retiring a legacy warehouse and standing up its replacement sounds like a good time, you'll fit right in.

Technology tools & weapons you'll be using:

  • Cloud & warehouse: GCP, BigQuery, Bigtable, Cloud Storage

  • Streaming & messaging: Pub/Sub, Kafka

  • Processing: Dataflow (Apache Beam), with Flink/Spark where appropriate

  • Orchestration: Airflow (Cloud Composer)

  • Analytical store: ClickHouse

  • Languages: Python, SQL

  • Modelling & quality: dbt, data quality gates

  • Containers & CI/CD: Docker, Kubernetes, GitHub Actions / equivalent

  • Legacy (being retired): Snowflake

The adventures that await you after becoming Senior Data Engineer at Hack The Box:

  • Design and build batch and streaming pipelines on Dataflow, Pub/Sub, and Kafka feeding BigQuery, Bigtable, and ClickHouse

  • Help drive the migration off Snowflake onto our GCP-native stack - and retire shadow pipelines along the way

  • Own the orchestration layer in Airflow, including SLAs, retries, and data quality gates

  • Model data for analytics and for ML - including feature pipelines that serve both training and low-latency online inference

  • Partner with ML engineers on feature stores, drift monitoring, and retraining workflows

  • Capture requirements from stakeholders and translate them into pragmatic, well-scoped data products

  • Continuously improve data quality, reliability, observability, and cost efficiency

  • Identify new data sources worth acquiring and integrate them cleanly

  • You'll have the exhilarating opportunity to contribute to a product that is highly appreciated by users and the cybersecurity community at large

  • You'll experience a highly supportive and caring environment, fostering growth, flexibility, and autonomy

  • You'll embark on an exciting journey of continuous learning and problem-solving, leveling up as our organization grows

  • Most importantly, you'll have a blast at HTB because fun is an essential ingredient in our recipe for success! Just wait until you see our global meet-ups!

Our benefits package is designed to provide strong support to our team, but it may vary depending on location and type of employment (e.g., UK, Greece, or engagement through an Employer of Record).

The Quest of Becoming Hack The Box's Senior Data Engineer:

  • Level 1: To complete level one's objective, submit your application.

  • Level 2: Meet the Talent Acquisition team. Level's objective: highlight your past achievements, ambitions, and values.

  • Level 3: Meet the hiring team. Level's objective: connect with the hiring team and share with them your achievements.

  • Level 4: Complete 2 assignments that align with day-to-day job-related tasks and responsibilities. Part of the assignment is discussing it with the hiring team in a debriefing session, in order to walk the team through your thinking process.

  • Level 5: Congratulations! Not many reach this level. Level's objective: have a constructive, final conversation with senior leadership to explore the role and your future at HTB.

  • Level 6: You've officially received an offer from HTB! To complete the last level and the Quest, all you need to do is accept the offer. Quest complete. Congratulations, you're officially one of us! Your next quest: complete the onboarding.

Requirements

  • Strong data modelling and warehouse architecture skills (dimensional modelling, event-driven, lakehouse patterns)
  • Hands-on experience with GCP data services - BigQuery is a must; Pub/Sub, Dataflow, Bigtable, Cloud Composer are strong pluses
  • Production experience with streaming pipelines on Dataflow/Beam, Flink, or Spark Structured Streaming, ingesting from Kafka and/or Pub/Sub
  • Solid SQL and strong Python - you write production-quality code, not just notebooks
  • Experience with ClickHouse or another columnar OLAP engine in production
  • Workflow orchestration experience with Airflow (or Prefect/Dagster)
  • Comfortable with dbt or equivalent transformation frameworks
  • Experience migrating off legacy warehouses (Snowflake, Redshift, Synapse) onto cloud-native stacks is a plus
  • Working knowledge of ML in production - feature engineering, feature stores, model deployment, drift monitoring, retraining
  • Docker & Kubernetes experience
  • CI/CD mindset, infrastructure-as-code sensibility, and a bias for simple, observable systems
  • Bonus: CDC tooling (Datastream, Debezium), Vertex AI / Feature Store

Benefits & conditions

  • Private health care
  • Paid paternity leave
  • 25 annual leave days
  • Free lunch & snacks at the office
  • 120€ Ticket Restaurant by Edenred
  • Dedicated budget for training and professional development, participation in conferences
  • Full access to the Hack The Box lab offerings, so you can learn how to hack
  • State-of-the-art equipment (Mac, iPhone, and mobile plan)
  • Flexible WFH (Hybrid Model) - Fully Remote is also an option if you're not an Attica resident

About the company

Hack The Box is the Cyber Performance Center with the mission to provide a human-first platform to create and maintain high-performing cybersecurity individuals and organizations. Hack The Box is the only platform that unites upskilling, workforce development, and the human focus in the cybersecurity industry, and it's trusted by organizations worldwide for driving their teams to peak performance. Offering an all-in-one environment for continuous growth, assessment, and recruitment, Hack The Box provides solutions for all cybersecurity domains. Launched in 2017, Hack The Box brings together the largest global cybersecurity community of more than 3 million platform members. Rapidly growing its international footprint and reach, Hack The Box is headquartered in the UK, with additional offices in the US, Australia, and Greece.

Exciting News:

  • Get the most important updates on HTB's latest year!
  • We are super proud to share that all three of HTB's entities across the UK, US, and Greece have been Certified as a Great Place to Work (Oct 2023-Oct 2024).
  • Furthermore, in 2024 HTB's Greek entity was listed by the Great Place to Work Institute as the #2 Best Workplace in Greece and #10 Best Workplace in Europe (among Small & Medium Workplaces).
  • Take a sneak peek at what it is like to be part of HTB and our 2023 Global Retreat.

Get more insights about our HTB culture and employee experience by visiting the "about us" section of our site, our career site, and Glassdoor.

At Hack The Box, we are committed to fostering a diverse, inclusive, and equitable workplace. We believe that diversity enriches our performance, services, and the communities we serve. As such, we ensure that all job applications are considered solely based on merit, skills, and qualifications. We do not discriminate on grounds of race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. We are dedicated to providing a fair and respectful work environment that reflects our values.
