AI Platform Engineer
Go Arrow
11 days ago
Role details
Contract type
Temporary contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Compensation
£116K
Job location
Tech stack
Artificial Intelligence
Amazon Web Services (AWS)
Azure
Information Engineering
Data Security
DevOps
Machine Learning
Management of Software Versions
Pulumi
Graphics Processing Unit (GPU)
Google Cloud Platform
Data Ingestion
CloudFormation
Machine Learning Operations
Terraform
Data Pipelines
Job description
We are looking for a skilled AI Platform Engineer to design, build, and scale the infrastructure that powers our machine learning and AI capabilities. You will be responsible for developing the core platforms and services that enable data scientists, ML engineers, and researchers to efficiently train, deploy, and monitor AI models at scale.
- Platform Design & Development
  - Architect and build scalable AI/ML platforms for model training, experimentation, and deployment.
  - Develop reusable tools and frameworks that streamline the ML lifecycle (data ingestion, training, validation, deployment, monitoring).
  - Implement self-service capabilities for data scientists and ML engineers to run experiments and deploy models efficiently.
- Infrastructure & Automation
  - Build and maintain cloud-native infrastructure (AWS, GCP, Azure) optimized for AI workloads.
  - Automate infrastructure provisioning using Infrastructure-as-Code tools (Terraform, CloudFormation, Pulumi).
  - Develop CI/CD pipelines for ML and AI applications, ensuring reproducibility and compliance.
- Model Deployment & Serving
  - Design robust systems for model versioning, packaging, and serving (using tools like MLflow, Kubeflow, Seldon, or KServe).
  - Optimize model inference performance and scalability using GPUs/TPUs and distributed frameworks.
  - Integrate model monitoring and feedback loops for continuous improvement.
- Data & Experimentation Management
  - Support efficient data access and lineage tracking across teams and environments.
  - Implement experiment tracking systems for hyperparameter tuning and results analysis.
  - Work with Data Engineering teams to ensure reliable data pipelines and governance.
- Collaboration & Governance
  - Partner with Data Scientists, ML Engineers, DevOps, and Security teams to ensure seamless workflows and compliance.
  - Establish platform governance policies, including access control, resource quotas, and auditability.
  - Champion best practices for MLOps, scalability, and cost optimization.
Requirements
Do you have experience in Terraform?