Scientific Knowledge Engineer, Ontology & Data Modeling

Vacancy Description

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

€ 80K

Job location

Barcelona

Tech stack

Artificial Intelligence

Amazon Web Services (AWS)

Azure

Bioinformatics

Information Engineering

Data Governance

Serialization

Graph Database

JSON

Python

Knowledge Management

Linked Data

Neo4j

Web Ontology Language

Open Source Technology

Cloud Services

Search Technologies

Semantic Web

SPARQL

Talend

Scripting (Bash/Python/Go/Ruby)

Google Cloud Platform

Large Language Models

Information Technology

Programming Languages

Job description

This role is responsible for maximizing the lifetime value of our data assets and bringing purpose to data. The engineer acts as a translator, turning highly technical information from domain experts into an appropriate data model, complete with a substantial ontology and vocabulary, that can be used to structure and index the data effectively. Specifically, the engineer works with Product managers and R&D subject matter experts to define the language of science (data models, ontology, standards, etc.) and build it into data products, acting as the voice of the "Knowledge base" and of the interoperability and value of the asset.

Key Responsibilities

Define the schemas, ontologies, and data models of scientific information required to create value-adding data products.
Own quality control, through validation and verification, of the mapping specifications (e.g., models, schemas, controlled vocabularies) to be industrialized by data engineering and maintained in platform-provisioned tooling.
Work with Product managers and engineers to convert business needs into defined, deliverable business requirements that enable the integration of large-scale biology data to predict, model, and stabilize therapeutically relevant protein complex and antigen conformations for drug and vaccine discovery.
Collaborate with external groups to align data standards with industry and academic ontologies, ensuring that data standards are defined with usage and analytics in mind.
Provide bespoke subject-matter expertise on R&D data, translating deep science into data that yields actionable insights.
Contribute to and maintain documentation of data standards, ontology decisions, and mapping rationale to support organizational knowledge transfer and auditability.

Requirements

Master's degree in Bioinformatics, Biomedical Science, Biomedical Engineering, Molecular Biology, or Computer Science (with a life science application focus).
6+ years of relevant work experience.
Specific experience contributing to Knowledge Graph development efforts, including entity modeling, relationship design, and schema governance.
Hands-on experience with open-source ontology tools and languages: Protégé, SPARQL, OWL, SKOS, SHACL, RML, RDF/Turtle.
Working knowledge of major life sciences ontologies: Gene Ontology (GO), OBO Foundry ontologies (CL, UBERON, HPO, MONDO, CHEBI, EFO, CLO), MeSH, SNOMED CT, UMLS.
Familiarity with linked data principles and semantic web technologies.
Experience with industry-standard tools for building data serialization protocols (e.g., JSON Schema, LinkML).
Proficiency in at least one programming language - preferably Python - for scripting vocabulary mappings, building data models, automating QC, and prototyping pipelines.
Experience with data governance and data quality tooling (e.g., Ataccama, Informatica, Talend, OpenRefine, Great Expectations, dbt).
Experience supporting LLM integration or AI-readiness workflows - including metadata enrichment, entity linking, embedding pipelines, or retrieval-augmented generation (RAG) architectures.
Understanding of vector databases and their role in semantic search and knowledge retrieval (e.g., Weaviate, Chroma).
Familiarity with cloud data platforms and infrastructure relevant to large-scale biological data (e.g., AWS, GCP, Azure).
Familiarity with graph database technologies (e.g., Neo4j, Amazon Neptune, Stardog, GraphDB, TigerGraph).

About the company

Xebia is seeking a Scientific Knowledge Engineer in Barcelona to maximize the value of data assets. This role entails defining schemas, ensuring quality control of data models, and collaborating with external groups. Candidates should have a Master's degree in Bioinformatics or a related field, with over 6 years of experience, specializing in Knowledge Graphs, and possess strong skills in ontology tools and programming languages like Python. Join Xebia in this innovative role.
