SDET - Test Automation AI (Agentic & LLM Systems) - Perm - Circa £85K - 2 days a week in Glasgow
Job description
We are seeking an SDET - Test Automation AI (Agentic & LLM Systems) to define and implement assurance approaches for AI-enabled systems. The role focuses on ensuring that AI solutions are reliable, robust, explainable, secure, and fit for purpose across their full lifecycle.
The AI Assurance Engineer will assure both:
- Probabilistic components (data, models, and AI outputs)
- Deterministic components (software, integrations, and infrastructure)
and will embed assurance into automated, end-to-end delivery pipelines.
This role requires a strong understanding of how to assure AI systems holistically, rather than deep specialism in a single discipline.
AI Assurance Strategy
- Define and implement an AI assurance approach aligned to business risk and regulatory expectations.
- Provide assurance coverage across the full AI system lifecycle (design, build, deploy, operate).
- Work with engineering, data, and product teams to embed quality and risk controls early.
Probabilistic Component Assurance
- Design validation approaches for:
  - Data quality and bias
  - Model and prompt behaviour
  - Output accuracy, relevance, and consistency
- Implement evaluation methods for:
  - Drift and instability
  - Hallucination and error patterns
- Support human-in-the-loop review where required.
Deterministic Component Assurance
- Assure non-AI system elements including:
  - Application logic and workflows
  - APIs and integrations
  - Security and access controls
- Design and execute:
  - Functional testing
  - Non-functional testing (performance, resilience, scalability)
  - Security and data protection validation
Automation & E2E Assurance
- Design and build automated assurance for AI systems.
- Integrate assurance into CI/CD and deployment pipelines.
- Implement regression and quality gates across data, models, and orchestration workflows.
- Maintain an end-to-end assurance pipeline from input data through to system outputs.
Operational AI & Observability
- Support monitoring and observability for AI-enabled systems in production.
- Analyse operational signals such as:
  - Latency and failures
  - Behaviour changes
  - Performance degradation
- Contribute to incident analysis and continuous improvement of AI services.
Governance, Risk & Reporting
- Define and track AI quality and risk metrics (accuracy, robustness, explainability).
- Support compliance with:
  - Data protection and privacy requirements
  - Responsible AI principles
- Produce clear assurance evidence for technical and non-technical stakeholders.
Requirements
Do you have experience in Python?
Do you have a Master's degree?
Core
- Strong software engineering background (Python or similar).
- Experience building automated test or validation frameworks.
- Experience working with complex distributed or cloud-based systems.
AI & Probabilistic Systems
- Understanding of:
  - Data quality and bias
  - Model behaviour and non-deterministic outputs
  - Prompt-based or agent-based systems
- Experience validating correctness, consistency, and relevance of AI outputs.
Deterministic Systems & Non-Functional Testing
- Experience testing:
  - APIs and workflows
  - Cloud services
- Knowledge of:
  - Performance testing
  - Security testing
  - Resilience and failure handling
Operational AI (MLOps / AIOps Awareness)
- Familiarity with:
  - Model lifecycle management
  - CI/CD for AI systems
  - Monitoring and drift detection
- Understanding of production risks associated with AI systems.