Lead Software Engineer - AI Quality & Test Automation
Job description
As Lead Software Engineer - AI Quality & Test Automation, you own the strategy and execution of a modern, AI-accelerated test automation program. You build and run the platforms, frameworks, and CI quality gates that keep releases reliable at high velocity - using AI to scale test creation, maintenance, and triage across teams. You don't retrofit testing onto AI workflows; you engineer quality into them from the ground up, ensuring every stage of an AI-assisted pipeline is observable, trustworthy, and continuously improving.
- Own the automation platform: define and build scalable frameworks, standards, and governance (services, APIs, UI) that enable teams to produce robust, consistent coverage
- Set direction for test automation tooling and AI-assisted techniques that accelerate test design, authoring, maintenance, and triage
- Lead by example - write automation, set the quality bar, mentor engineers, and drive adoption of the practices you establish
- Own CI automation health: build and maintain test environments, pipelines, and tooling (including containers/CI), reduce flakiness, and shorten feedback loops
- Be a highly visible advocate for quality - lead cross-team discussions, drive alignment and decisions, and escalate risks and blockers immediately
AI-Native Test Engineering
- Lead the adoption of LLM-powered test generation: from natural language requirement ingestion to executable, maintainable test output
- Build and maintain a self-healing test infrastructure layer - leveraging AI to detect broken selectors, drifted APIs, or changed behaviors and propose or apply fixes autonomously
- Define prompt engineering standards, context injection patterns, and RAG architectures that ground test generation in real codebase context
- Implement guardrails to ensure AI-generated test output is verified, traceable, and safe to ship - including versioning, ownership attribution, and confidence scoring
- Own automated coverage and risk reporting (unit/integration/e2e) and use it to drive targeted gap closure and release readiness
Quality Gates & CI/CD
- Lead risk-based test strategy with product and engineering - define acceptance criteria and quality gates that support high delivery velocity without sacrificing customer-impacting quality
- Design adaptive quality gates for AI-accelerated CI/CD pipelines - gates that reason about risk, not just pass/fail thresholds
- Build risk-scoring models that adjust gate strictness based on change scope, code origin (human vs. AI-generated), historical failure patterns, and deployment context
- Architect the observability layer for automated pipelines: surface signals that indicate poor quality decisions in real time
- Establish rollback and circuit-breaker patterns for autonomous deployments triggered by quality signal degradation
AI Model & Agent Validation
- Build behavioral testing frameworks for validating AI agents and LLM-powered features in production - testing non-deterministic outputs with statistical rigor
- Design evaluation benchmarks for internal AI tooling: measuring task completion accuracy, hallucination rates, and decision quality over time
- Define adversarial and edge-case testing methodologies for AI features: prompt injection resistance, boundary condition handling, and graceful degradation
- Partner with ML platform and data science teams to establish quality acceptance criteria for every model and agent promoted to production
Requirements
- 10+ years of software engineering experience with a strong emphasis on test automation (unit, integration, end-to-end) and quality engineering practices
- Strong programming skills in one or more languages: Python, TypeScript, Java, or C#
- Platform mindset - energized by building infrastructure and frameworks that enable product teams to move faster
- Experience building automation for APIs and distributed systems; UI automation experience is a plus
- Experience with CI/CD, test reporting/observability, and maintaining reliable pipelines (e.g., GitHub Actions, Jenkins, Azure DevOps)
- Proven ability to apply AI/LLM tools to scale automation (e.g., generate/refine tests, expand edge cases, refactor brittle suites, accelerate failure triage) while implementing guardrails so AI output is verified, traceable, and safe to ship
- Hands-on understanding of agentic AI patterns: tool use, multi-agent orchestration, planning loops, and human-in-the-loop design as applied to quality workflows
- Familiarity with LLM failure modes relevant to quality: hallucination, context loss, sycophancy, and over-confident assertions
- Excellent debugging and root-cause analysis skills across code, data, and infrastructure
- Strong communication skills; able to translate quality risks into clear tradeoffs and action plans
- Demonstrated technical leadership across teams - driving standards, roadmap execution, and stakeholder alignment
- Self-directed, comfortable with ambiguity, and biased toward action
Desired Skills & Experience
- Modern test frameworks (e.g., pytest, JUnit, NUnit, Playwright, Cypress) and API testing (REST, gRPC)
- Containerization and environments: Docker; Kubernetes a plus
- Relational databases and SQL; ability to validate data pipelines and analytics outputs
- Experience with Elasticsearch or similar text-retrieval data stores
- Performance/load testing (e.g., JMeter, k6, Locust) and profiling/observability
- Cloud experience (AWS/Azure/Google Cloud Platform) and Infrastructure-as-Code (e.g., Terraform)
- Experience building AI-enabled automation workflows (e.g., Claude/OpenAI APIs, prompt patterns for test generation, repo-aware RAG, scripts/services that turn AI output into runnable tests)
- Familiarity with agent frameworks (LangChain, LlamaIndex, AutoGen, or equivalents) and their tradeoffs in production quality pipelines
- Experience designing evaluation harnesses for non-deterministic AI systems - statistical confidence, behavioral consistency, and regression detection
- Agile software development experience (Scrum / XP)
Education
Bachelor's degree in Computer Science or a related field, or equivalent practical experience.