Senior Cloud Architect, Delivery (GenAI)

DoiT

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

£ 83K

Job location

Remote

Tech stack

Agile Methodologies

Artificial Intelligence

Airflow

Amazon Web Services (AWS)

JIRA

Google BigQuery

Software as a Service

Cloud Computing

Cloud Engineering

Continuous Integration

DevOps

Distributed Computing Environment

Distributed Systems

Identity and Access Management

Machine Learning

Network Segmentation

Open Source Technology

Performance Tuning

Cloud Services

TensorFlow

Systems Integration

Data Logging

Google Cloud Platform

Cloud Platform System

PyTorch

Delivery Pipeline

Large Language Models

Prompt Engineering

State Machines

Model Validation

Multi-Cloud

Generative AI

AWS Lambda

Containerization

AI Platforms

Information Technology

HuggingFace

Data Analytics

Machine Learning Operations

Cloud Optimization

CloudWatch

API Gateway

Terraform

Data Pipelines

Job description

Lead the design and implementation of production-grade ML and Generative AI solutions on AWS (with awareness of multi-cloud environments).
Act as a hands-on expert and trusted advisor for customers running AI/ML workloads at scale, from initial discovery through deployment and optimization.
Translate complex business problems into cloud architectures that are secure, reliable, cost-efficient, and observable.
Help evolve how DoiT uses AI/ML internally and with customers by turning one-off solutions into reusable patterns and "gravel roads" that influence the product roadmap.
You will focus more on install base health, product adoption, proactive engagements, and account-team work.

Core - Deep Cloud Expertise

Be the trusted cloud engineer customers lean on for high-impact technical optimization work across cost, reliability, security, and performance.

Design and help implement solutions that:

improve cost efficiency (rightsizing, reservations/commitments, storage optimization, etc.)
increase reliability and resilience (HA/DR architectures, SLO/SLA-aware designs)
strengthen security posture (IAM, network segmentation, data protection, least-privilege)
reduce operational toil (automation, self-service, guardrails, policy enforcement)
Plan and deliver structured engagements such as Cloud Optimization Sessions, cost/efficiency/performance workshops, security posture or reliability reviews, and architecture deep dives / "well-architected" style assessments.
Respond to Expert Inquiry / support requests that require deep cloud engineering expertise, ensuring high-quality, well-explained resolutions.
Bring domain depth in:
ML / GenAI - deploying and operating ML/GenAI workloads (training and inference), GPU utilization, scaling, and cost control; MLOps and integrating workloads with monitoring, logging, and FinOps; safe and efficient use of managed AI services.

Builder - Product Feedback & Contribution

Turn one-off field work into reusable assets that improve both customer outcomes and the product itself.

Convert one-off customer solutions into Gravel Roads - reusable patterns such as playbooks, Terraform modules, CloudFlow templates, cloud diagrams, Composer Recipes -> DCI Insights, and internal/external documentation.
Provide structured feedback to the DoiT Product and Engineering teams on:
product gaps and friction points discovered in real-world usage
new opportunities for automation and workload lenses within DCI
telemetry and tracking that would make future FDE work more efficient

Contribute directly to DCI where appropriate - from feature requests and feedback, to contributing code, to owning specific DCI features end-to-end.
Build agent skills, scripts, and internal tooling that codify your expertise and scale it across the team.
Contribute to internal enablement: share learnings via documentation, demos, office hours, or training sessions for other FDEs and Customer Success team members.

Account Team - Embedded Execution

Operate as an embedded technical partner inside the account team.

Work in the account team model alongside Customer Success Managers (CSMs) and Account Managers (AMs) to deliver impactful outcomes.
Own the technical depth lane: technical deployment & integration, automation & platform adoption, signal-based proactive engagement, and most importantly, repeatable Cloud Optimization solutions.
Partner with customers' engineers, architects, and FinOps teams to translate vague pain points into concrete technical optimization plans - and help them ship changes that stick and create continuous value.
Co-deliver complex or multi-domain engagements with peer FDEs (for example, infra + data + ML/GenAI), reviewing and refining designs and engagement plans together.
Communicate complex technical topics clearly to both engineers and non-technical stakeholders (FinOps, finance, leadership), and maintain clear documentation of architectures, decisions, and implemented changes so customers and fellow FDEs can sustain and build on your work.
Contribute to a culture of continuous improvement within the global FDE community through design reviews, internal forums, enablement sessions, and experimentation.

Product Expert - DoiT Cloud Intelligence (DCI)

Become an expert in DCI and use it hands-on to drive concrete customer outcomes.

Master DoiT Cloud Intelligence products and services - including Cloud Analytics, DCI Insights, Cloud Composer, CloudFlow, DataHub, PerfectScale, and other Enterprise Platforms.
Use DCI hands-on to:
Build and operationalize Cloud Analytics and Allocations to create dashboards and reports for customer engineering, finance, and leadership.
Use DCI Insights to identify and prioritize cost, risk, and reliability opportunities, and shepherd them through to closure.
Implement Cloud Composer queries and build recipes that produce hand-crafted insights across customers' engineering use cases.
Build CloudFlow automations (e.g., anomaly routing, scheduled actions, guardrails, policy enforcement).
Use built-in integrations, DataHub, and other workload-intelligence features to optimize key business and workload data inside DCI.

Help customers embed DCI into existing observability, CI/CD, and governance processes so it becomes trusted and indispensable in day-to-day cloud operations.

Requirements

4+ years of experience architecting, deploying, and managing cloud-based AI/ML solutions, including production workloads.
Proven track record designing and operating large, distributed systems on AWS, selecting appropriate services and patterns to meet business and technical goals.

AWS & GenAI / ML Expertise

Advanced proficiency with AWS services relevant to AI/ML and GenAI.
Hands-on experience with Amazon Bedrock for deploying and scaling foundation models and Generative AI workloads.
Experience fine-tuning and deploying Large Language Models (LLMs) and multimodal AI using Amazon SageMaker (including JumpStart).
Strong prompt engineering skills and familiarity with rigorous model evaluation (quality, safety, performance).
Understanding of agentic capabilities and patterns for AI agents that autonomously perform tasks and integrate with existing systems.
Experience with Amazon Q Business and Amazon Q Developer (or similar tools) to accelerate insight generation and development workflows.

ML Pipelines, Data & MLOps

In-depth knowledge of Amazon SageMaker components such as Pipelines, Model Monitor, Data Wrangler, and SageMaker Clarify for bias detection and interpretability.
Proficiency integrating TensorFlow, PyTorch, and other ML frameworks with SageMaker for model development, fine-tuning, and deployment.
Experience with distributed training (multi-GPU or multi-node) and performance optimization for inference.
Strong data-engineering skills on AWS: Amazon S3, AWS Glue, Lake Formation, Redshift for AI/ML data pipelines.
Experience building end-to-end AI/ML workflows using services like AWS Lambda, Step Functions, API Gateway, and containerized deployments on Amazon EKS / AWS Fargate.

DevOps, MLOps, Governance & Security

Hands-on experience with CI/CD for AI/ML using AWS CodePipeline, CodeBuild, SageMaker Pipelines, or similar.
Proficiency in monitoring and operating AI systems using Amazon CloudWatch and SageMaker Model Monitor.
Strong understanding of AI governance, security, and compliance on AWS, including IAM, KMS, and data privacy patterns.
Familiarity with AI ethics and bias detection/mitigation (e.g., using SageMaker Clarify or similar tools).

Multi-Cloud Awareness & Collaboration

Working knowledge of Google Cloud AI tools (e.g., Vertex AI, Cloud AutoML, BigQuery ML) sufficient to reason about multi-cloud architectures and integration points.
Proven ability to mentor peers, run enablement sessions, and collaborate across Sales, CS, and Product.

Soft Skills

Excellent communication skills across technical and business audiences; able to simplify complex ideas and influence decisions.
Natural ownership mentality: you escalate early, resolve fast, and own the outcome.
Demonstrated ability to work effectively in a remote-first, global environment.

BA/BS degree in Computer Science, Mathematics, or a related technical field, or equivalent practical experience.
Additional data or AI certifications (e.g., AWS/GCP data certifications, reputable AI/ML programs such as Stanford, Coursera, Udacity, MIT, eCornell).

Expanded AI/ML & Dev Experience

Experience with modern RLHF, advanced fine-tuning techniques, and hybrid AI architectures.
Familiarity with Hugging Face or similar open-source ecosystems integrated with AWS.
Prior experience as an ML Engineer, Data Scientist, or AI-focused Architect in a consulting or SaaS environment.

Tooling & Process

Experience with JIRA or similar tools for tracking work across delivery and product-feedback cycles.
Exposure to Agile practices and frameworks commonly used for SaaS and cloud delivery.

Are you a Do'er?

Be your truest self. Work on your terms. Make a difference.

We are home to a global team of incredible talent who work remotely, with the flexibility to set a schedule that balances work and home life. We embrace and support leveling up your skills, professionally and personally.

About the company

DoiT is a global technology company that works with cloud-driven organizations to leverage public cloud to drive business growth and innovation. We combine data, technology, and human expertise to ensure our customers operate in a well-architected and scalable state, from planning to production. Delivering DoiT Cloud Intelligence, the only solution that integrates advanced technology with human intelligence, we help our customers solve complex multicloud problems and drive efficiency. With decades of multicloud experience, we have specializations in Kubernetes, GenAI, CloudOps, and more. An award-winning strategic partner of AWS, Google Cloud, and Microsoft Azure, we work alongside more than 4,000 customers worldwide.

As a Senior Cloud Architect, you will be part of our global Forward Deployed Engineering organization, working with rapidly growing companies in EMEA and around the world. This role sits within FDE Delivery and focuses on our install base, product adoption, and customer health.

Full-time employee benefits include:

Unlimited Vacation
Flexible Working Options
Health Insurance
Parental Leave
Employee Stock Option Plan
Home Office Allowance
Professional Development Stipend
Peer Recognition Program

Many Do'ers, One Team

DoiT unites as Many Do'ers, One Team, where diversity is more than a goal: it's our strength. We actively cultivate an inclusive, equitable workplace, recognizing that each unique perspective enhances our innovation. By celebrating differences, we create an environment where every individual feels valued, contributing to our collective success.
