Principal Vector Data Engineer

Johnson & Johnson, S.A.
Municipality of Madrid, Spain
12 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

Municipality of Madrid, Spain

Tech stack

Artificial Intelligence
Software Documentation
Databases
Data Governance
Data Integration
Data Structures
Python
TensorFlow
Signal Processing
Software Requirements Analysis
PyTorch
Indexer
Information Technology
HuggingFace
Data Pipelines
GxP

Job description

The Principal Vector Data Engineer is a technical and strategic leader operating at the intersection of AI, digital health, and therapeutic R&D. This role leads the development of multimodal vector embedding pipelines and foundation model architectures supporting longitudinal data integration, disease progression modeling, and digital biomarker discovery across Neuroscience, Oncology, and Immunology. The successful candidate guides enterprise-scale vectorization efforts while ensuring compliance with clinical, regulatory, and GxP data standards.

Key Responsibilities - Technical Leadership

  • Lead the design, development, and optimization of vector embedding models for diverse biomedical modalities including clinical, regulatory, imaging (MRI, PET), and digital health data.
  • Architect scalable, compliant embedding pipelines using modern vector database technologies (FAISS, Pinecone, Weaviate, Milvus, Chroma, etc.).
  • Establish robust quality-control frameworks for mobile-captured images and convert pixel-level data into high-fidelity vector representations.
  • Drive the adaptation of state-of-the-art academic methods into production-ready, GxP-aware foundation models.
  • Oversee multimodal data integration efforts to enable semantic search, retrieval-augmented analysis, and clinical insight generation.

Key Responsibilities - Cross-Functional & Regulatory Leadership

  • Collaborate with data scientists, clinicians, engineering teams, and regulatory/QA partners to ensure models and data pipelines align with GxP, clinical governance, and documentation standards.
  • Contribute to digital biomarker discovery and predictive modeling for neurodegenerative, neuropsychiatric, oncologic, and immunologic conditions.
  • Mentor junior engineers and contribute to technical roadmap planning, architectural reviews, and AI strategy development.

  • Enterprise biomedical data transformed into vectorized, interoperable assets powering scientific AI and semantic intelligence.
  • Improved data governance, lineage, and GxP alignment across foundation models and vector pipelines.
  • Accelerated discovery of digital biomarkers and predictive patterns across therapeutic areas.
  • Scalable vector infrastructure enabling next-generation clinical and translational AI research.

Requirements

  • MS/PhD in Computer Science, Electrical Engineering, Biomedical Engineering, or related discipline.
  • 3+ years of experience in multimodal ML, vector representation learning, biomedical signal processing, or large-scale embedding systems.
  • Expertise in Python, PyTorch/TensorFlow, Hugging Face, and multimodal embedding architectures (CLIP, MedCLIP, BioBERT, TimeSformer, etc.).
  • Hands-on experience with vector indexing/search systems (FAISS, Pinecone, Weaviate, Milvus, Qdrant, Chroma).
  • Familiarity with sentence-transformers, LangChain, or LlamaIndex for semantic search and RAG workflows.
  • Understanding of clinical trial data structures, longitudinal monitoring, GxP system requirements, and compliant data lifecycle management.

About the company

Johnson & Johnson believes health is everything. Our strength in healthcare innovation empowers us to build a world where complex diseases are prevented, treated, and cured, where treatments are smarter and less invasive, and solutions are personal. Through our expertise in Innovative Medicine and MedTech, we are uniquely positioned to innovate across the full spectrum of healthcare solutions today to deliver the breakthroughs of tomorrow and profoundly impact health for humanity.
