Azure & Databricks Data Engineer for Anonymized Data Marts
Overview

We are looking for a Data Engineer in Barcelona to design, build, deploy, and maintain ETL/ELT pipelines and anonymized data marts on a cloud-based data platform focused on Azure, Databricks, Lakehouse architecture, data marts, and data anonymization.

Responsibilities

- Design, build, deploy, and maintain ETL/ELT pipelines in a cloud environment using Azure and Databricks.
- Develop anonymized data marts across multiple source applications on the corporate data platform.
- Work with Azure Data Lake Storage, Azure Data Factory, Databricks, Apache Spark, Delta Lake, and PySpark.
- Build robust and scalable data solutions following Lakehouse and Medallion Architecture principles.
- Optimise Spark jobs, Databricks clusters, workflows, and jobs.
- Apply strong data modelling practices to design clean, reusable, and analytics-ready datasets.
- Implement and validate data anonymization processes according to quality and compliance standards.
- Produce complete technical documentation covering data mart structure, anonymization methodology, and data management guidelines.
- Prepare implementation reports with evidence demonstrating the correct functionality of anonymization processes.
- Troubleshoot production issues and ensure reliable data pipeline performance.
- Collaborate with other Data Engineers, analysts, and business stakeholders.
- Contribute to development best practices, version control, CI/CD, clean code, and maintainable documentation.
- Contribute to Lakehouse Architecture and Medallion Architecture best practices.
- Build data solutions with a strong focus on anonymization, compliance, documentation, and data quality.
- Develop robust, scalable, and production-ready data pipelines.
- Design and deliver anonymized data marts used across multiple source applications.

Qualifications

- At least 4 years of experience as a Data Engineer.
- Proven experience developing, deploying, and maintaining ETL/ELT pipelines in cloud environments.
- Hands-on experience with Azure data products, especially Azure Data Lake Storage and Azure Data Factory.
- Strong knowledge of Databricks, including Apache Spark, Databricks workflows/jobs, cluster optimisation, and job optimisation.
- Experience working with Delta Lake.
- Knowledge of Medallion Architecture.
- Proven experience with data anonymization.
- Strong programming skills with Python and PySpark.
- Solid understanding of development best practices, Git, version control, and CI/CD.
- Experience troubleshooting production issues and building robust data solutions.
- Strong focus on data quality, clean code, maintainability, and documentation.
- Good level of English and Spanish.
- Proactive mindset, ownership, teamwork, and strong problem-solving skills.
- Experience working within a Lakehouse Architecture.
- Knowledge of Scala.
- Experience with other cloud platforms such as GCP or AWS.
- Experience building ETLs for third-party APIs.
- Familiarity with data governance.
- Experience creating technical documentation for data platforms, anonymization protocols, or compliance-related processes.
- Leadership skills to help motivate and align team efforts.

Work Arrangement

- Hybrid model - 2 days onsite per week.
- Flexible working hours.

Benefits

- Continuous learning opportunities.
- Private health insurance and benefits package.
- Wellhub: fitness, wellness, and mental health support.
- Football and paddle tennis teams sponsored by Capitole.
- Team-building activities, global events, and strong tech communities.

Information Security Notice

The employee will have access to confidential information related to Capitole and the assigned project. Compliance with internal security and information protection policies is mandatory.