Lead Data Scientist

Xebia
Taramundi, Spain
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Taramundi, Spain

Tech stack

Google BigQuery
Cloud Computing
Data Governance
Data Systems
Data Flow Control
Python
Scientific Computing
SQL Databases
Data Processing
PyTorch
Kubernetes
Data Management
Docker

Job description

As a Data Scientist at Xebia, you will work closely with engineering, product, and data teams to deliver scalable and robust data solutions to our clients. Your key responsibilities will include designing, building, and maintaining data platforms and pipelines, and mentoring new engineers.

  • Develop, maintain, and optimize data science models using Python and modern ML frameworks.
  • Build and support Next-Generation Sequencing (NGS) data processing pipelines.
  • Analyze large-scale biological datasets, applying statistical and computational techniques.
  • Implement cloud-based data processing workflows using GCP services such as GCS, Cloud Run, GKE, BigQuery, and Dataflow.
  • Design scalable data workflows using Nextflow and other orchestration tools.
  • Collaborate with bioinformatics specialists and engineering teams to translate scientific requirements into technical solutions.
  • Support model evaluation, validation, and deployment in cloud environments.
  • Ensure best practices around reproducibility, documentation, and data quality.

Requirements

  • Strong proficiency in Python, including pydantic, PyTorch, and pandas.
  • Hands-on experience designing or maintaining NGS pipelines.
  • Experience with GCP (GCS, Cloud Run, GKE, BigQuery, Dataflow).
  • Solid understanding of SQL and data querying.
  • Practical experience using Nextflow for workflow definition and orchestration.
  • Background in bioinformatics, including handling genomic data and biological datasets.
  • Ability to work collaboratively with cross-functional stakeholders.
  • Excellent communication skills in English (written and spoken).

Nice to have

  • Experience operationalizing ML models in cloud or containerized environments.
  • Familiarity with Docker or Kubernetes.
  • Understanding of data governance, reproducible research, and scientific computing standards.
