Data Engineer (NL26)
Role details
Job location
Tech stack
Job description
BBD is looking for a skilled Data Engineer to design, build and maintain scalable data pipelines and architectures. You will play a pivotal role in enabling data-driven decision-making by ensuring our data infrastructure is robust, secure and efficient. You will work with modern tools and cloud platforms (AWS, Azure, Databricks) to transform raw data into actionable insights, supporting both traditional analytics and emerging AI/ML workloads.
Responsibilities
- Pipeline development: Design, build and maintain efficient, reliable and scalable ETL/ELT pipelines using Python, SQL, and Spark (a minimal sketch follows this list)
- Architecture & modelling: Implement modern data architectures (e.g., Data Lakehouse, Medallion Architecture) and data models to support business reporting and advanced analytics
- Cloud infrastructure: Manage and optimise cloud-based data infrastructure on AWS and Azure, ensuring cost-effectiveness and performance
- Data governance: Implement data governance, security and quality standards (e.g., using Great Expectations, Unity Catalog) to ensure data integrity and compliance
- Collaboration: Work closely with Data Scientists, AI Engineers and Business Analysts to understand data requirements and deliver high-quality datasets
- MLOps support: Collaborate on MLOps practices, supporting model deployment and monitoring through robust data foundations
- Continuous improvement: Monitor pipeline performance, troubleshoot issues, and drive automation using CI/CD practices
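To give a concrete flavour of the pipeline development responsibility above, here is a minimal sketch of a PySpark batch job. The bucket paths, column names and table layout are illustrative assumptions, not BBD systems.

```python
from pyspark.sql import SparkSession, functions as F

# Illustrative locations only; a real pipeline would read these from configuration.
RAW_PATH = "s3://example-bucket/raw/orders/"
CURATED_PATH = "s3://example-bucket/curated/orders/"

spark = SparkSession.builder.appName("orders-batch-etl").getOrCreate()

# Extract: raw JSON files landed by an upstream ingestion process.
raw = spark.read.json(RAW_PATH)

# Transform: deduplicate, enforce types and drop records without a key.
cleaned = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .filter(F.col("order_id").isNotNull())
)

# Load: write partitioned Parquet for downstream analytics and BI.
cleaned.write.mode("overwrite").partitionBy("order_date").parquet(CURATED_PATH)
```

In practice a job like this would be parameterised, tested and scheduled rather than run ad hoc, which is where the orchestration and CI/CD requirements below come in.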
Requirements
- A minimum of 5 years of professional experience, including at least 2 years with Databricks
- Programming & scripting: Strong proficiency in Python for data manipulation and scripting. Experience with Scala or Java is a plus
- Big Data processing: Extensive experience with Apache Spark (PySpark) for batch and streaming data processing
- Workflow orchestration: Proficiency with Apache Airflow or similar tools (e.g., Prefect, Dagster, Azure Data Factory) for scheduling and managing complex workflows (a minimal orchestration sketch follows this list)
- Data warehousing: Proficiency in modern cloud data warehouses such as Snowflake, including designing, modelling and optimising analytical data structures to support reporting, BI and downstream analytics
- Expert SQL skills for analysis and transformation
- Deep understanding of Big Data file formats (Parquet, Avro, Delta Lake)
- Experience designing Data Lakes and implementing patterns like the Medallion Architecture (Bronze/Silver/Gold layers)
- Streaming: Experience with real-time data processing using Kafka or similar streaming platforms
- DevOps & CI/CD:
- Proficiency with Git for version control
- Experience implementing CI/CD pipelines for data infrastructure (e.g., GitHub Actions, GitLab CI, Azure DevOps)
- Familiarity with data quality frameworks like Great Expectations or Soda
- Understanding of data governance principles, security, and lineage
- Reporting & visualisation: Experience serving data to BI tools like Power BI, Tableau, or Looker
- AI/ML familiarity: Exposure to Generative AI concepts (LLMs, RAG, Vector Search) and how data engineering supports them
- Storage: Deep knowledge of Amazon S3 for data lake storage, including lifecycle policies and security configurations
- ETL & orchestration: Hands-on experience with AWS Glue (Crawlers, Jobs, Workflows, Data Catalog) for serverless data integration
- Governance: Experience with AWS Lake Formation for centrally managing security and access controls
- Streaming: Proficiency with Amazon Kinesis (Data Streams, Firehose) for collecting and processing real-time data
- Core services: Solid understanding of core AWS services (IAM, Lambda, EC2, CloudWatch) relevant to data engineering
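As referenced in the workflow orchestration item above, the sketch below shows how such a pipeline might be scheduled with Apache Airflow, stepping through the Bronze/Silver/Gold layers of a Medallion Architecture. The DAG id, script paths and use of spark-submit are assumptions for illustration (Airflow 2.4+ syntax).

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical daily DAG; the job scripts referenced here are placeholders.
with DAG(
    dag_id="orders_medallion_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    bronze = BashOperator(
        task_id="ingest_bronze",
        bash_command="spark-submit /opt/jobs/ingest_bronze.py",
    )
    silver = BashOperator(
        task_id="refine_silver",
        bash_command="spark-submit /opt/jobs/refine_silver.py",
    )
    gold = BashOperator(
        task_id="aggregate_gold",
        bash_command="spark-submit /opt/jobs/aggregate_gold.py",
    )

    # Bronze -> Silver -> Gold, matching the Medallion flow listed above.
    bronze >> silver >> gold
```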
Other
- Storage: Deep knowledge of Azure Data Lake Storage (ADLS) Gen2 and Blob Storage
- ETL & orchestration: Experience with Azure Data Factory (ADF) or Azure Synapse Analytics pipelines for data integration and orchestration
- Governance: Familiarity with Microsoft Purview for unified data governance and Microsoft Entra ID (formerly Azure AD) for access management
- Streaming: Proficiency with Azure Event Hubs or Azure Stream Analytics for real-time data ingestion
- Core Services: Understanding of core Azure services (Resource Groups, VNets, Azure Monitor) relevant to data solutions
- Platform management: Experience managing Databricks Workspaces, clusters, and compute resources
- Governance: Proficiency with Unity Catalog for centralised access control, auditing, and data lineage
- Development:
- Building and orchestrating Databricks Jobs and Delta Live Tables (DLT) pipelines
- Deep knowledge of Delta Lake features (time travel, schema enforcement, optimisation)
- AI & ML integration:
- Experience with MLflow for experiment tracking and model registry (see the sketch after this list)
- Exposure to Mosaic AI features (Model Serving, Vector Search, AI Gateway) and managing LLM workloads on Databricks
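The sketch below shows the basic MLflow tracking pattern referenced in the AI & ML integration items above. The experiment name, dataset and model are purely illustrative; on Databricks, runs would typically be tracked against a workspace experiment and models registered via Unity Catalog.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical experiment; on Databricks this would live under a workspace path.
mlflow.set_experiment("demo-orders-churn")

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    params = {"n_estimators": 100, "max_depth": 5}
    model = RandomForestClassifier(**params, random_state=42).fit(X_train, y_train)

    # Log parameters and metrics so runs are comparable in the MLflow UI.
    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))

    # Log the fitted model; registering it in the model registry is a separate, optional step.
    mlflow.sklearn.log_model(model, "model")
```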
Required certifications
- AWS: AWS Certified Solutions Architect - Associate
- Microsoft: Microsoft Certified: Azure Solutions Architect Expert
- Databricks: (details omitted)
Internal candidate profile
We are open to training internal candidates who demonstrate strong engineering fundamentals and a passion for data. Ideal internal candidates might currently be in the following roles:
- Python Back-end Engineer: Strong coding skills (Python) and experience with APIs / back-end systems, looking to specialise in big data processing and distributed systems
- DevOps Engineer: Coding background with strong infrastructure-as-code and CI/CD skills, interested in applying those practices specifically to data pipelines and MLOps