AI Engineer
Relanto, Inc.
Oakland, United States of America
2 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Intermediate
Job location: Oakland, United States of America
Tech stack
API
Artificial Intelligence
Automation of Tests
Google BigQuery
Cloud Computing
Continuous Integration
Information Engineering
Monitoring of Systems
Python
Machine Learning
Performance Tuning
TensorFlow
Search Technologies
Software Deployment
Google Cloud Platform
PyTorch
Flask
Delivery Pipeline
Large Language Models
Prompt Engineering
Generative AI
FastAPI
Containerization
AI Platforms
Kubernetes
Low Latency
Deployment Automation
Machine Learning Operations
REST
Automation Anywhere
Docker
Job description
- Design and implement end-to-end ML pipelines on Google Cloud Platform (GCP)
- Build, fine-tune, and optimize AI/ML models for production deployment using Vertex AI and Gemini models
- Develop Generative AI solutions leveraging Gemini APIs, prompt engineering, Retrieval-Augmented Generation (RAG), and multimodal AI capabilities
- Develop and maintain RESTful APIs for ML and GenAI model serving using FastAPI/Flask
- Implement vector search and semantic retrieval capabilities using BigQuery ML, Vertex AI Matching Engine, and embeddings
- Create automated testing, validation, CI/CD, and deployment pipelines for ML/AI workflows
- Set up model monitoring, observability, drift detection, and performance tracking for production AI systems
- Optimize model inference, scalability, latency, and serving infrastructure on Google Cloud Platform
- Collaborate with data engineering, product, and business teams to deliver scalable AI-driven applications
- Work with containerized deployments using Docker and Google Kubernetes Engine (GKE)
Requirements
- Minimum of 4 years of relevant experience
- Strong Python programming skills with ML frameworks such as PyTorch and TensorFlow
- Hands-on experience with Large Language Models (LLMs), Generative AI, and prompt engineering
- Experience working with Google Gemini models and Gemini APIs for GenAI use cases
- Strong proficiency in Google Cloud Platform AI services including Vertex AI, Gemini, Cloud ML Engine, BigQuery ML, and Vertex AI Matching Engine
- Experience implementing vector databases/search, embeddings, semantic search, and RAG architectures
- Expertise in RESTful API development using FastAPI or Flask
- Experience with Docker, Kubernetes, and Google Kubernetes Engine (GKE)
- Strong understanding of CI/CD pipelines and MLOps workflows for ML/AI deployments
- Experience with model monitoring, performance optimization, and scalable AI serving architecture