Gen AI Engineer

Skywaves MP LLC

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Remote

Tech stack

Java

API

Amazon Web Services (AWS)

Artificial Neural Networks

Azure

Signals Intelligence

Computational Linguistics

Data Deduplication

Data Fusion

Web Scraping

Relational Databases

Document Retrieval

Entity Relationship Models

Graph Database

Information Sciences

JSON

Python

Link Analysis

Neo4j

Open Source Technology

Open Source Intelligence

Software Engineering

SPARQL

Systems Integration

Unstructured Data

Datadog

Google Cloud Platform

Cloud Platform System

Data Ingestion

Large Language Models

Multi-Agent Systems

Prompt Engineering

Generative AI

Kotlin

Knowledge Representation

Containerization

Kubernetes

Information Technology

Low Latency

Playwright

Kafka

Search Engines

GraphQL

Machine Learning Operations

Virtual Agents

Docker

Microservices

Job description

Knowledge Graph Engineering

Design, build and maintain large-scale property graphs and RDF triplestores (Neo4j, Amazon Neptune, Stardog, or equivalent).
Develop and govern ontologies, taxonomies, and entity-relationship schemas that reflect real-world domain semantics.
Implement graph ingestion pipelines that extract, transform, and link entities from structured, semi-structured, and unstructured data.
Optimise graph traversal queries (Cypher, SPARQL, Gremlin) for sub-second response at production scale.
Train and deploy graph neural networks (GNNs) for node classification, link prediction, and subgraph retrieval.
Maintain model retraining workflows triggered by graph drift or coverage degradation.

Agentic AI Systems

Architect and implement autonomous agents that plan multi-step reasoning chains over knowledge graph data using LLMs (GPT-4o, Claude, Gemini, or open-source equivalents).
Build graph-aware Retrieval-Augmented Generation (RAG) pipelines that blend structured graph context with unstructured document retrieval.
Design tool-use and function-calling layers so agents can query live data sources - web search, REST/GraphQL APIs, relational databases - to extend or verify graph knowledge.
Implement agent memory, reflection, and self-correction loops to improve reliability over multi-hop tasks.

Context Enrichment & Data Fusion

Integrate web scraping, news feeds, and open-source intelligence (OSINT) sources to keep the knowledge graph current.
Build entity resolution and deduplication components that merge data from heterogeneous sources into a consistent graph.
Develop confidence-scoring and provenance-tracking mechanisms so downstream consumers understand the reliability of any piece of context.

MLOps & Production Readiness

Package agents as scalable microservices; instrument them with observability tooling (tracing, latency, token cost).
Collaborate with platform engineers to deploy workloads on cloud-native infrastructure (AWS / Google Cloud Platform / Azure).
Maintain evaluation harnesses that measure agent accuracy, hallucination rate, and graph coverage over time.

Requirements

7-10 years of professional software engineering with strong Python (or Java / Kotlin) proficiency.
2+ years of hands-on production experience with at least one major graph database - Neo4j, Amazon Neptune, TigerGraph, or comparable.
Demonstrated knowledge of graph query languages such as Cypher, SPARQL, or Gremlin at production query complexity.
Direct experience building LLM-powered agents or pipelines using frameworks such as LangChain, LangGraph, LlamaIndex, CrewAI, AutoGen, or Semantic Kernel.
Solid understanding of RAG architectures: chunking strategies, vector stores (Pinecone, Weaviate, pgvector), hybrid retrieval, and re-ranking.
Familiarity with prompt engineering, few-shot learning, and LLM evaluation techniques.
Experience integrating external data sources via APIs, web scraping (Playwright / Scrapy), or streaming pipelines (Kafka / Kinesis).
Working knowledge of containerisation (Docker, Kubernetes) and CI/CD pipelines.
Familiarity with graph export formats - at least one of GraphML, RDF/OWL, or JSON-LD.
Experience integrating GNN-derived features into vector stores or RAG pipelines.
Advanced degree (MS / PhD) in Computer Science, Information Science, Computational Linguistics, or a related field.
Experience in intelligence, defence, or trade-craft environments - working with OSINT, link analysis, entity disambiguation, or signals intelligence data.
Understanding of access-control models for sensitive graph data (need-to-know, compartmentalisation, provenance labelling).
Familiarity with knowledge representation standards such as OWL, SHACL, RDF-star, JSON-LD, and W3C PROV.
Experience with fine-tuning or instruction-tuning open-source LLMs (Llama, Mistral, Falcon) for domain-specific tasks.
Background in network-analysis algorithms: centrality, community detection, path-finding, anomaly detection on graphs.
Contributions to open-source graph or GenAI projects; published research or technical blog presence.
Active or adjudicatable security clearance (Secret or above) - strongly preferred for trade-craft assignments.