Data Integration Engineer

Boehringer Ingelheim España, S.A.
Sant Cugat del Vallès, Spain
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Sant Cugat del Vallès, Spain

Tech stack

Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Apache HTTP Server
Big Data
Cloud Computing
Code Review
Data as a Service
Data Architecture
Information Engineering
Data Governance
Data Infrastructure
Data Integration
ETL
Data Systems
Data Visualization
Data Warehousing
DevOps
Document Management Systems
Python
Machine Learning
NoSQL
Performance Tuning
Power BI
SQL Databases
Data Streaming
Tableau
Parquet
Data Processing
Scripting (Bash/Python/Go/Ruby)
Data Storage Technologies
Snowflake
Spark
CloudFormation
Containerization
Data Lake
Kubernetes
Information Technology
Data Analytics
Kafka
Data Delivery
Data Pipelines
Docker
Jenkins
Databricks

Job description

We are seeking a skilled and motivated Data Engineer to join the IT RDM CI Data Excellence team. In this role, you will play a pivotal part in enhancing our data infrastructure, optimizing data flows, and ensuring the availability and quality of strategically critical data assets from both internal and external providers.

You will enable fast and reliable data delivery to the right environments, supporting cutting-edge use cases in Artificial Intelligence (AI) and Machine Learning (ML). Collaboration is key: you'll work closely with researchers, data scientists, and analysts to build a consistent and scalable data ecosystem across multiple analytics and AI initiatives.

Tasks and responsibilities

  • Design, develop, and maintain scalable data pipelines and ETL processes to support data integration and analytics.
  • Collaborate with data architects, modelers, and IT team members to help define and evolve the overall cloud-based data architecture strategy, including data warehousing, data lakes, streaming analytics, and data governance frameworks.
  • Collaborate with integration engineers, analysts, and other business stakeholders to understand data requirements and deliver solutions.
  • Optimize and manage data storage solutions and data integrations (e.g., S3, Snowflake, dbt, Snaplogic) ensuring data quality, integrity, security, and accessibility.
  • Leverage Databricks for scalable data processing, analytics, and advanced transformations.
  • Implement data quality and validation processes to ensure data accuracy and reliability.
  • Develop and maintain documentation for data processes, architecture, and workflows.
  • Participate in code reviews and contribute to best practices for data engineering.
  • Monitor and troubleshoot data pipeline performance and resolve issues promptly.
  • Consulting and Analysis: Meet with designated stakeholders to understand and analyze their processes and needs, and determine requirements in order to propose solutions or improvements.
  • Technology Evaluation: Stay updated with the latest trends in data engineering, cloud technologies, and big data platforms.
  • Expert Communities: Engage actively in internal expert groups to exchange knowledge, mentor junior colleagues, and contribute to improving inefficient processes.
  • Cloud-Based Data Solutions: Utilize AWS cloud services (e.g., S3, Lambda, Step Functions, KMS, …) to support data engineering workflows, and develop infrastructure as code for data pipelines using tools such as Jenkins and AWS CloudFormation.
  • Performance Optimization: Monitor and optimize data pipelines for performance, scalability, and cost efficiency and troubleshoot and resolve data-related issues in a timely manner.

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

  • Proficiency with the Apache ecosystem (Parquet, Iceberg, Spark, Kafka, Airflow).
  • Strong hands-on experience with AWS data services (Kinesis, Glue, AppFlow, Lambda, S3).
  • Demonstrated experience with Snowflake and dbt (dbt Labs) for building and modeling data pipelines.
  • Strong analytical skills working with unstructured datasets.
  • Experience with relational SQL and NoSQL databases, preferably Snowflake and/or Databricks.
  • Familiarity with data pipeline and workflow orchestration tools.
  • Strong project management and organizational skills.
  • Excellent English written and verbal communication skills.
  • Snaplogic knowledge is a plus.
Preferred skills:
  • Proficiency in scripting languages such as Python or Scala.
  • Familiarity with data visualization tools (e.g., Tableau, Power BI, QuickSight).
  • AWS Cloud Practitioner, Architecture, Big Data or Data Analytics certification.
  • Nice-to-have qualifications: AWS Certified Big Data or AWS Certified Solutions Architect certification; experience with Databricks and Snowflake; knowledge of containerization technologies such as Docker and Kubernetes; and experience with CI/CD pipelines and DevOps best practices.

Benefits & conditions

We are continuously working to design the best experience for you. Here are some examples of how we will take care of you:

  • Flexible working conditions
  • Life and accident insurance
  • Health insurance at a competitive price
  • Investment in your learning and development
  • Gym membership discounts

About the company

At Boehringer Ingelheim, we believe that Data & AI have the power to transform healthcare and improve the lives of millions of patients and animals. As a key member of the IT Research Development and Medicine - Computational Innovation team, you will join a passionate group where you will meet and collaborate with like-minded people dedicated to fostering a strong data and AI culture, delivering key transformation initiatives, and shaping the future of data-driven decision-making across our global organization. Your work will empower our researchers to achieve breakthrough therapies for our patients.

Apply for this position