ML/AI Software Engineer
Job description
We are seeking a Senior ML/AI Engineer to lead the design, optimization, and deployment of large-scale automation efforts, using LLMs and generative AI where appropriate. This role sits at the intersection of machine learning, backend engineering, and cloud-scale infrastructure, with a focus on building intelligent systems that power our teams.
The ideal candidate has hands-on experience building scalable, automated systems. You combine backend engineering expertise with applied AI knowledge, using technologies like LangChain, vector databases, and LLM APIs alongside microservices, Kubernetes, Terraform, and CI/CD pipelines to create resilient, intelligent systems. You'll partner closely with DevOps, data science, and data engineering to deploy scalable, reliable, and cost-efficient automation solutions that increase efficiency and accelerate innovation across the organization.
Responsibilities
Automate Workflows: Architect, build, test, and monitor AWS-based workflows to solve critical business problems.
Microservices and APIs: Develop microservices for ML-driven applications using Python or Java, ensuring scalability and resilience.
Service Availability: Guarantee high levels of service availability through participation in an on-call rotation, following best practices for disaster recovery and business continuity.
Automated Deployment: Ensure all work is deployed in an automated, repeatable fashion, optimizing infrastructure for cost and efficiency.
Requirements
Educational Background: Bachelor's degree with 6+ years of experience in machine learning, backend engineering, or AI platform development.
Coding Proficiency: Demonstrated experience with Java and Python.
Cloud Competency: Proficiency with common AWS services, or their equivalents, such as EKS/ECS, Kinesis, Lambda, DynamoDB, SNS, and SQS.
Systems Monitoring and Analytics: Knowledge of systems monitoring, alerting, and analytics using tools such as Datadog, Splunk, New Relic, or AWS CloudTrail.
Communication Skills: Demonstrated success in cross-functional collaboration.
LLM Expertise: Proven experience developing, evaluating, and orchestrating agentic workflows.
MLOps and Distributed Systems: Hands-on experience with distributed systems and MLOps tooling such as Kubernetes, Docker, MLflow, Airflow, Terraform, and CI/CD.
Preferred Skills
Data Streaming and Orchestration: Familiarity with tools such as Kafka, Flink, Spark, dbt, and Airflow.
Multi-Modal LLM Systems: Experience with multi-modal LLM systems (text + image embeddings).
AI Observability and Evaluation: Exposure to AI observability, prompt evaluation frameworks, and safety alignment tools.
Vector Database Knowledge: Understanding of vector databases and retrieval workflows.
Benefits & conditions
- Training Provided
- Regular team and company events
- Free drinks, fruit or food
- Flexible working
- Free Gym or Gym Subsidy
- Private Medical/Dental healthcare
- Bonus/Reward Scheme
- Cycle to work scheme
- Game Jams