AI Engineer

Engineering, Inc.
San Francisco, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

San Francisco, United States of America

Tech stack

Java
Artificial Intelligence
C# (Programming Language)
C++
Continuous Integration
Python
Machine Learning
Performance Tuning
Software Deployment
Data Logging
Data Processing
Graphics Processing Unit (GPU)
Data Ingestion
Containerization
Machine Learning Operations
Hardware Infrastructure
Trident
Data Pipelines
Docker
Microservices

Job description

We're seeking a hands-on AI Engineer who builds production-ready AI systems, not research prototypes. You'll optimize our AI ingestion pipeline for more accurate, responsive agentic behavior, deploy high-performance models on GPU infrastructure using our Trident architecture, and maintain robust MLOps workflows from training through production deployment. This is for engineers who ship code, not just notebooks.

Responsibilities

Enhance AI Pipeline Accuracy: Improve our data ingestion and processing pipeline to deliver more accurate responses and sophisticated agentic behaviors in production applications.
GPU-Optimized Model Deployment: Deploy and optimize AI models on high-performance GPU infrastructure using our Trident architecture, ensuring efficient training, inference, and scaling.
Production MLOps: Build and maintain end-to-end MLOps pipelines, including RAG systems, model distillation, fine-tuning workflows, training orchestration, and production inference deployment.
Data Model Engineering: Design and implement robust data models and processing workflows that power our AI persona capabilities.
Infrastructure & DevOps: Create production-grade CI/CD pipelines, containerization (Docker), comprehensive logging systems, and monitoring for AI model performance.
Real Production Deployment: Take AI systems from development through production deployment, focusing on reliability, performance, and operational excellence.

Requirements

Required Technical Skills

Python (primary language for AI/ML work)
Strong proficiency in C++, Java, or C# for performance-critical components
Data modeling and processing at production scale

AI/ML Production Stack

RAG pipeline development and optimization
MLOps workflows: training, inference, and model lifecycle management
Model distillation and fine-tuning techniques for production deployment
Experience deploying models to GPU infrastructure (Trident or similar architectures)

Production Engineering

CI/CD pipeline creation and management
Docker containerization and microservices architecture
Production logging, monitoring, and observability
Experience scaling AI systems in real production environments

What We Do Want

3-5 years of production AI/ML engineering experience
Engineers from mid-sized companies who have successfully deployed AI systems at scale
Proven track record of building, deploying, and maintaining ML systems in production
Experience optimizing AI systems for performance, cost, and reliability
Strong system design and architecture skills for scalable AI applications

Sample Projects You'll Own

Optimize our RAG pipeline for improved accuracy and response quality
Deploy and scale transformer models on our Trident GPU architecture
Build MLOps workflows for continuous model training and deployment
Design data processing systems for multi-modal AI persona training

Apply for this position