Founding AI Platform Engineer (MLOps / Backend)

Gamingtec
9 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Junior

Job location

Remote

Tech stack

APIs
AI
SaaS
CI/CD
Debugging
Version Control
Large Language Models (LLMs)
Backend
MLOps

Job description

  • Build and maintain the infrastructure and tooling used to train, evaluate, deploy, and monitor ML models and GenAI services;
  • Own production services, APIs, and pipelines that power recommendations, agent workflows, and customer-facing integrations;
  • Improve CI/CD, testing, release workflows, rollback processes, and environment management;
  • Establish observability across service health, model behaviour, agent quality, latency, cost, and failure modes;
  • Build reproducibility and lifecycle practices for models, prompts, datasets, configurations, and releases;
  • Support experimentation and measurement infrastructure so product and ML changes can be evaluated cleanly;
  • Improve reliability, scalability, security, performance, and cost efficiency across the stack;
  • Troubleshoot production issues end-to-end and turn recurring pain points into durable engineering improvements;
  • Help define the platform and engineering standards the company will rely on as it grows.

What Success Looks Like in the First 6 Months:

  • Shipping a model or GenAI change to production becomes faster, safer, and less manual;
  • Core services and AI workflows are observable and easier to debug;
  • The platform supports more usage with better reliability and lower operational friction;
  • Engineers spend less time fighting infrastructure and deployment issues and more time shipping product;
  • You become the person who spots platform, reliability, and scaling risks early and addresses them before they become problems.

And this is how our interview process goes:

  • A 30-minute interview with a member of our HR team to get to know you and your experience;
  • A 1-hour technical interview;
  • A final interview to gauge your fit with our culture and working style.

Requirements

  • Strong software engineering background with experience building and operating production systems;
  • Experience with backend services, cloud infrastructure, CI/CD, testing, observability, and automation;
  • Strong Python skills and comfort working across services, tooling, infrastructure, and operational workflows;
  • Good judgment about reliability, performance, maintainability, and cost tradeoffs;
  • Ability to collaborate closely with ML and product teams and move ambiguous work to completion;
  • High ownership, attention to detail, and a bias toward simplifying and strengthening systems.

What would be an advantage:

  • Experience with MLOps workflows for model training, evaluation, deployment, and monitoring;
  • Experience serving ML models or LLM applications in production;
  • Experience with experimentation platforms, event pipelines, analytics instrumentation, or feature delivery platforms;
  • Experience with agent evaluation, prompt versioning, retrieval/search infrastructure, or vector-backed systems;
  • Experience supporting customer-facing APIs or SaaS platform infrastructure.

Apply for this position