Gen AI Engineer
Job description
Knowledge Graph Engineering
- Design, build and maintain large-scale property graphs and RDF triplestores (Neo4j, Amazon Neptune, Stardog, or equivalent).
- Develop and govern ontologies, taxonomies, and entity-relationship schemas that reflect real-world domain semantics.
- Implement graph ingestion pipelines that extract, transform, and link entities from structured, semi-structured, and unstructured data.
- Optimise graph traversal queries (Cypher, SPARQL, Gremlin) for sub-second response at production scale.
- Train and deploy graph neural networks (GNNs) for node classification, link prediction, and subgraph retrieval.
- Maintain model retraining workflows triggered by graph drift or coverage degradation.
Agentic AI Systems
- Architect and implement autonomous agents that plan multi-step reasoning chains over knowledge graph data using LLMs (GPT-4o, Claude, Gemini, or open-source equivalents).
- Build graph-aware Retrieval-Augmented Generation (RAG) pipelines that blend structured graph context with unstructured document retrieval.
- Design tool-use and function-calling layers so agents can query live data sources - web search, REST/GraphQL APIs, relational databases - to extend or verify graph knowledge.
- Implement agent memory, reflection, and self-correction loops to improve reliability over multi-hop tasks.
Context Enrichment & Data Fusion
- Integrate web scraping, news feeds, and open-source intelligence (OSINT) sources to keep the knowledge graph current.
- Build entity resolution and deduplication components that merge data from heterogeneous sources into a consistent graph.
- Develop confidence-scoring and provenance-tracking mechanisms so downstream consumers understand the reliability of any piece of context.
MLOps & Production Readiness
- Package agents as scalable microservices and instrument them with observability tooling (tracing, latency, token cost).
- Collaborate with platform engineers to deploy workloads on cloud-native infrastructure (AWS / Google Cloud Platform / Azure).
- Maintain evaluation harnesses that measure agent accuracy, hallucination rate, and graph coverage over time.
Requirements
- 7-10 years of professional software engineering experience, with strong Python (or Java / Kotlin) proficiency.
- 2+ years of hands-on production experience with at least one major graph database - Neo4j, Amazon Neptune, TigerGraph, or comparable.
- Demonstrated knowledge of at least one graph query language (Cypher, SPARQL, or Gremlin) at production query complexity.
- Direct experience building LLM-powered agents or pipelines using frameworks such as LangChain, LangGraph, LlamaIndex, CrewAI, AutoGen, or Semantic Kernel.
- Solid understanding of RAG architectures: chunking strategies, vector stores (Pinecone, Weaviate, pgvector), hybrid retrieval, and re-ranking.
- Familiarity with prompt engineering, few-shot learning, and LLM evaluation techniques.
- Experience integrating external data sources via APIs, web scraping (Playwright / Scrapy), or streaming pipelines (Kafka / Kinesis).
- Working knowledge of containerisation (Docker, Kubernetes) and CI/CD pipelines.
- Familiarity with at least one graph export format: GraphML, RDF/OWL, or JSON-LD.
- Experience integrating GNN-derived features into vector stores or RAG pipelines.
- Advanced degree (MS / PhD) in Computer Science, Information Science, Computational Linguistics, or a related field.
- Experience in intelligence, defence, or trade-craft environments - working with OSINT, link analysis, entity disambiguation, or signals intelligence data.
- Understanding of access-control models for sensitive graph data (need-to-know, compartmentalisation, provenance labelling).
- Familiarity with knowledge representation standards such as OWL, SHACL, RDF-star, JSON-LD, and W3C PROV.
- Experience with fine-tuning or instruction-tuning open-source LLMs (Llama, Mistral, Falcon) for domain-specific tasks.
- Background in network-analysis algorithms: centrality, community detection, path-finding, anomaly detection on graphs.
- Contributions to open-source graph or GenAI projects; published research or technical blog presence.
- Active or adjudicatable security clearance (Secret or above) - strongly preferred for trade-craft assignments.