Ontology / Knowledge Graph Engineer (Life Sciences)
Job description
- Define schemas and ontologies for scientific information.
- Validate mapping specifications to ensure industrialization.
- Convert business needs into defined deliverable requirements.
Skills
- Knowledge Graph development
- Entity modeling
- Schema governance
- Programming (Python)
- Open-source ontology tools
- Data governance
- Semantic web technologies
Responsibilities
- Define schemas, ontologies, and data models for scientific information needed for value-adding data products, including quality control and mapping specifications to be industrialized by data engineering.
- Validate and verify mapping specifications to ensure they are industrialized by data engineering and maintained in platform tooling.
- Convert business needs into defined deliverable requirements to enable integration of large-scale biology data for drug and vaccine discovery.
- Collaborate with external groups to align data standards with industry and academic ontologies, keeping the focus on downstream usage and analytics.
- Provide subject-matter expertise to translate deep science into data for actionable insights.
- Maintain documentation of data standards, ontology decisions, and mapping rationale for knowledge transfer and auditability.
Requirements
The ideal candidate will have a Master's degree in a relevant field, with over 6 years of experience in Knowledge Graph development and strong programming skills in Python. Familiarity with life-science ontologies and open-source tools is essential.
- Master's degree in Bioinformatics, Biomedical Science, Biomedical Engineering, Molecular Biology, or Computer Science (with a life-science focus).
- 6+ years of relevant work experience.
- Experience in Knowledge Graph development, entity modeling, relationship design, and schema governance.
- Hands-on experience with open-source ontology tools and languages: Protégé, SPARQL, OWL, SKOS, SHACL, RML, RDF/Turtle.
- Knowledge of major life-science ontologies: Gene Ontology, OBO Foundry ontologies (CL, UBERON, HPO, MONDO, ChEBI, EFO, CLO), MeSH, SNOMED CT, UMLS.
- Familiarity with linked data principles and semantic web technologies.
- Experience with industry-standard data schema and modeling languages (JSON Schema, LinkML).
- Proficiency in at least one programming language, preferably Python, for scripting vocabulary mappings, building data models, automating QC, and prototyping pipelines.
Preferred Qualifications
- Experience with data governance and data quality tooling (e.g., Ataccama, Informatica, Talend, OpenRefine, Great Expectations, dbt).
- Experience supporting LLM integration or AI-readiness workflows (metadata enrichment, entity linking, embedding pipelines, RAG).
- Understanding of vector databases for semantic search (Weaviate, Chroma).
- Familiarity with cloud data platforms (AWS, GCP, Azure) and graph database technologies (Neo4j, Amazon Neptune, Stardog, GraphDB, TigerGraph).