Senior Software Development Engineer In Test, ML/AI
Job description
We are hiring a Senior Software Development Engineer in Test specializing in ML/AI quality, including automation for model evaluation, LLM-assisted test generation, and validation of AI-powered workflows. You will guide quality strategy, develop automation frameworks, and manage implementation for complex, cross-functional, machine learning-powered products and services. This role involves more than test development: you will lead quality initiatives from start to finish, keeping teams synchronized, dependencies managed, and releases delivered with confidence. You will work closely with ML, engineering, product, and infrastructure teams, shaping how quality is built, monitored, and expanded while leading embedded QE efforts within projects.
- Define and execute quality strategies, test plans, and automation coverage for ML-powered services and platform components.
- Use LLMs and other AI-assisted techniques to generate, expand, and maintain high-value test cases for ML-powered workflows.
- Design scenario-based test suites for AI features, including adversarial prompts, edge cases, ambiguous inputs, and underrepresented user scenarios.
- Lead QE efforts for cross-functional projects, driving risk assessment, dependency management, and release readiness.
- Design, develop, and maintain scalable automation frameworks for backend services, APIs, and ML inference systems using Python and/or Java.
- Build automated validation for ML and LLM outputs, including ranking behavior, score distributions, prompt/response quality, hallucination indicators, and probabilistic model evaluation.
- Debug test failures, service anomalies, model inconsistencies, and AI behavior regressions to identify root causes and drive resolution.
- Perform functional, integration, regression, API, end-to-end, performance, and reliability testing for distributed systems.
- Improve automation reliability, reduce flakiness, and optimize execution efficiency.
- Partner with engineering and ML teams to integrate automated testing into CI/CD pipelines and release workflows.
- Collaborate across teams to establish scalable quality standards, tooling, and guidelines.
Requirements
- Bachelor's degree in Computer Science or equivalent.
- 5+ years of experience as an SDET or QE engineer focused on backend and distributed systems.
- Experience using LLMs to generate, transform, and prioritize test cases for AI-powered experiences.
- Experience with AI evaluation tooling, prompt evaluation frameworks, model monitoring, or human-in-the-loop review workflows.
- Strong experience testing RESTful APIs, microservices, and distributed architectures.
- Proficiency in Python, Java, JavaScript, or similar languages for automation development.
- Hands-on experience with automation frameworks such as pytest, JUnit, Selenium, Playwright, Cypress, or Appium.
- Experience with CI/CD systems and test pipelines (Jenkins, GitHub Actions, etc.).
- Experience with cloud and container technologies (AWS, GCP, Kubernetes, Docker).
- Familiarity with databases, monitoring, and observability tools.
- Strong understanding of SDLC, Agile methodologies, and release processes.
- Excellent problem-solving, debugging, and communication skills.

- Experience validating ML outputs using statistical analysis or scenario-based testing approaches.
- Familiarity with ML infrastructure, data pipelines, or model-serving platforms (Seldon, KServe, Ray Serve, etc.).
- Prior work in content moderation ML, security, fraud detection, or adversarial ML.
- Experience testing high-scale, low-latency online services.
- Experience with Databricks or similar ML platform tooling.
- Familiarity with Node.js, React, or modern frontend technologies.
- Experience testing mobile, console, or other non-PC platforms.
What sets you apart
- Strong combination of automation engineering and delivery ownership.
- Ability to drive quality across complex cross-functional initiatives.
- Practical understanding of how to test non-deterministic AI systems and separate model variance from quality regressions.
- Proven risk management and dependency coordination skills.
- Ability to influence engineering teams and promote quality guidelines.
- Passion for scalable, reliable, and maintainable automation systems.