Machine Learning Engineer

Net2Source
Reading, United States of America
1 month ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English

Job location

Reading, United States of America

Tech stack

A/B testing
Artificial Intelligence
Amazon Web Services (AWS)
Databases
Continuous Integration
Data Governance
DevOps
Amazon DynamoDB
Identity and Access Management
Machine Learning
Large Language Models
Grafana
Multi-Agent Systems
Caching
Generative AI
Git
Containerization
Kubernetes
Information Technology
Machine Learning Operations
Functional Programming
Dataiku
CloudWatch
API Gateway
Docker

Job description

  • Design multi-agent architectures: define agent roles (planner, researcher, retriever, executor, reviewer), toolboxes, handoffs, memory strategy (short/long-term), and supervisor policies for safe collaboration.
  • Build high-quality RAG: implement ingestion, chunking, embeddings, indexing, and retrieval with evaluation (precision/recall, groundedness, hallucination checks), guardrails, and citations.
  • Productionize on AWS: leverage services like Bedrock (Agents/Knowledge Bases/Flows), Lambda, API Gateway, S3, DynamoDB, OpenSearch/Vector DB, Step Functions, and CloudWatch for tracing and alerts.
  • MLOps/LLMOps: automate CI/CD (GitOps), containerization (Docker/Kubernetes), infra-as-code, secrets/IAM, blue-green deployments and rollbacks, and data/feature pipelines.
  • Observability & evaluation: instrument telemetry (traces, token/cost, latency), build dashboards (Grafana/CloudWatch), add human-in-the-loop review, A/B testing, and continuous offline/online evals.
  • Operate reliably at scale: implement caching, rate-limit management, queueing, idempotency, and backoff; proactively detect drift and degradation.
  • Collaborate & communicate: partner with infra/DevOps/data/architecture teams; document designs, SLIs/SLOs, and runbooks; present status and insights to technical and non-technical stakeholders.

Requirements

  • Bachelor's degree in Computer Science, Data Science, Engineering, or a related field, or equivalent experience.
  • Proven experience building agentic systems (single or multi-agent) and RAG pipelines in production.
  • Strong cloud background for AI/ML workloads; familiarity with Bedrock or equivalent LLM platforms.
  • Solid CI/CD and containerization skills (Git, Docker, Kubernetes) and infra-as-code fundamentals.
  • Knowledge of data governance and model accountability throughout the MLOps/LLMOps lifecycle.
  • Excellent communication, collaboration, and problem-solving skills; ability to work independently and within cross-functional teams.
  • Passion for Generative AI and the impact of agent-based solutions across industries.

Preferred / Good to Have

  • Experience with AWS Bedrock Agents/Knowledge Bases/Flows, OpenSearch (or other vector databases), Step Functions, Lambda, API Gateway, DynamoDB, S3.
  • Dataiku platform exposure (Govern, approvals, artifacts, MLOps deployment flows); SageMaker for custom model hosting.
  • Familiarity with agent frameworks (e.g., LangGraph, crewAI, Semantic Kernel, AutoGen) and evaluation frameworks (guardrails, groundedness, hallucination checks).
  • Dataiku certifications (nice to have): ML Practitioner, Advanced Designer, MLOps Practitioner.
