AI Engineer

Engineering, Inc.
San Francisco, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

San Francisco, United States of America

Tech stack

Java
Artificial Intelligence
C# (Programming Language)
C++
Continuous Integration
Python
Machine Learning
Performance Tuning
Software Deployment
Data Logging
Data Processing
Graphics Processing Unit (GPU)
Data Ingestion
Containerization
Machine Learning Operations
Hardware Infrastructure
Trident
Data Pipelines
Docker
Microservices

Job description

We're seeking a hands-on AI Engineer who builds production-ready AI systems, not research prototypes. You'll optimize our AI ingestion pipeline for more accurate, responsive agentic behavior, deploy high-performance models on GPU infrastructure using our Trident architecture, and maintain robust MLOps workflows from training through production deployment. This is for engineers who ship code, not just notebooks.

Responsibilities

Enhance AI Pipeline Accuracy: Improve our data ingestion and processing pipeline to deliver more accurate responses and sophisticated agentic behaviors in production applications.
GPU-Optimized Model Deployment: Deploy and optimize AI models on high-performance GPU infrastructure using our Trident architecture, ensuring efficient training, inference, and scaling.
Production MLOps: Build and maintain end-to-end MLOps pipelines, including RAG systems, model distillation, fine-tuning workflows, training orchestration, and production inference deployment.
Data Model Engineering: Design and implement robust data models and processing workflows that power our AI persona capabilities.
Infrastructure & DevOps: Create production-grade CI/CD pipelines, containerization (Docker), comprehensive logging systems, and monitoring for AI model performance.
Real Production Deployment: Take AI systems from development through production deployment, focusing on reliability, performance, and operational excellence.

Requirements

Required Technical Skills

Python (primary language for AI/ML work)
Strong proficiency in C++, Java, or C# for performance-critical components
Data modeling and processing at production scale

AI/ML Production Stack

RAG pipeline development and optimization
MLOps workflows: training, inference, and model lifecycle management
Model distillation and fine-tuning techniques for production deployment
Experience deploying models to GPU infrastructure (Trident or similar architectures)

Production Engineering

CI/CD pipeline creation and management
Docker containerization and microservices architecture
Production logging, monitoring, and observability
Experience scaling AI systems in real production environments

What We Do Want

3-5 years of production AI/ML engineering experience
Engineers from mid-sized companies who have successfully deployed AI systems at scale
Proven track record of building, deploying, and maintaining ML systems in production
Experience optimizing AI systems for performance, cost, and reliability
Strong system design and architecture skills for scalable AI applications

Sample Projects You'll Own

Optimize our RAG pipeline for improved accuracy and response quality
Deploy and scale transformer models on our Trident GPU architecture
Build MLOps workflows for continuous model training and deployment
Design data processing systems for multi-modal AI persona training

Apply for this position