Data Engineer
Job description
interface for manual uploads and pipeline monitoring, ensuring seamless operations, high data integrity, and informed decision-making.
What you do
Implement data pipelines in line with established engineering standards - pipeline design patterns, naming conventions, and modelling approaches (e.g. Data Vault). Apply architecture decisions made at Lead level and translate them into working, maintainable solutions.
Build and maintain end-to-end data pipelines across ingestion, transformation, and loading. In SimpliFi terms, this means developing and owning flows across the Raw Data Vault, Business Data Vault, UJT, and CORE layers - working hands-on with tools such as Databricks, Azure Synapse and IDMC.
Take responsibility for data quality within your pipeline scope - implementing checks, identifying issues early, and resolving them before they propagate downstream. Escalate systemic or cross-domain issues to the Lead.
Review code from junior and mid-level engineers, share best practices and contribute to raising team capability. Act as a day-to-day technical reference for less experienced colleagues.
Work alongside Data Modellers, MDM/RDM specialists, and IT component leads. Translate technical constraints and findings into clear inputs for planning and design discussions.
Ensure pipelines are built in line with data governance requirements - lineage, classification, and access controls. Support gate processes (e.g. DAB/CCP) by providing accurate technical evidence and documentation.
Actively participate in sprint planning and delivery cycles - providing effort estimates, flagging technical dependencies, and keeping work moving within your delivery lane.
Requirements
What you bring
Hands-on expertise building and operating large-scale pipelines - batch and streaming. Proficiency in tools like Apache Spark, Databricks and cloud-native equivalents. Experience with ETL/ELT patterns at enterprise scale.
Practical experience with modern lakehouse/warehouse platforms - Databricks, Azure Synapse, or equivalents. Understanding of partitioning, clustering and query optimisation.
Deep working knowledge of at least one major cloud platform (Azure) - compute, storage, networking, managed services. In enterprise contexts like SimpliFi, Azure is typically dominant (ADLS, ADF, Azure Databricks).
Solid grasp of modelling paradigms - Medallion, dimensional (Kimball) and Data Vault 2.0. Ability to review and contribute to models, not just implement them.
Knowledge of how to implement metadata capture, data lineage (e.g. via IDMC or Purview) and access controls. Experience navigating governance frameworks and gate processes.
Strong SQL (complex transformations, performance tuning), Python (pipeline logic, data quality scripts) and familiarity with PySpark for distributed workloads.
Experience implementing DQ rules, profiling and monitoring - ideally with tools like DQX or Great Expectations.
Understanding of master and reference data flows - how golden records are created, maintained and consumed.
Experience with platforms like SAP MDG or Informatica MDM is a strong plus.
CI/CD for data pipelines (GitHub Actions), version control discipline and infrastructure-as-code basics (Terraform).
Able to read and contribute to architecture documents and understand layers, zones and data flow patterns across a platform. Can engage meaningfully with architects without needing hand-holding.
Familiarity with machine learning is a strong advantage.
What we offer
We offer a hybrid work model that recognizes the value of striking a balance between in-person collaboration and remote working, including up to 25 days per year working from abroad.
We believe in rewarding performance and our compensation and benefits package includes a company bonus scheme, pension, employee shares program and multiple employee discounts (detai