Sr. AI Systems Engineer

Super Micro Computer, Inc.

San Jose, United States of America

3 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

$170K - $190K

Job location

San Jose, United States of America

Tech stack

Artificial Intelligence

Computer Clusters

Nvidia CUDA

Databases

Web Scraping

Cursor (AI Code Editor)

Database Design

Linux

Elasticsearch

Python

PostgreSQL

Microsoft Office

Microsoft SQL Server

MongoDB

MySQL

Performance Tuning

Recommender Systems

SAP Applications

Selenium

SharePoint

Software Deployment

Software Engineering

Unstructured Data

Chatbots

React

Large Language Models

Multi-Agent Systems

Web Content

Backend

FastAPI

Vue.js

AI Platforms

Information Technology

HuggingFace

Playwright

Search Engines

Front End Software Development

Hardware Infrastructure

REST

Data Pipelines

Docker

Job description

Supermicro's AI Platform team is seeking a 100x AI Systems Engineer who ships entire systems, not just code. Our existing AI platform - 38,000+ documents, Agentic RAG, hybrid search, custom embedding and reranking services, multi-database integration - was built by one engineer in under one month. Another team estimated the same scope at 100 person-months. We are looking for another engineer with that same capacity. This individual uses AI coding tools (Cursor, Claude Code, etc.) as daily drivers, owns the full stack from data pipeline to production deployment, and delivers working solutions - not PRs that need "a few more iterations." Given a problem, this individual ships. Given ambiguity, this individual asks the right questions, prototypes fast, and iterates. New domain? Ramp up fast. No existing solution? Build one. Blocked? Find a workaround or do it yourself.

The following responsibilities are integral to the Sr. AI Systems Engineer role (other duties may also be assigned):

Design and ship AI-powered systems end-to-end: RAG pipelines, agentic workflows, intelligent chatbots for customer support, AI-powered product recommendation engines, automated RFQ assistants, and sales enablement tools - the full roadmap, one system at a time.
Build and maintain hybrid search infrastructure combining vector databases (Qdrant, Chroma, Milvus) with keyword search (Elasticsearch/BM25), custom embedding services, and reranking pipelines.
Deploy and optimize LLM inference services using vLLM, SGLang, or equivalent frameworks on GPU clusters (H100, H200, GH200); expand and maintain GPU inference infrastructure as the platform grows.
Build document processing pipelines for large-scale ingestion of unstructured data - PDFs, Office documents, web content - covering extraction, chunking, contextual retrieval, and metadata enrichment.
Integrate AI systems with enterprise data sources including PostgreSQL, MSSQL, SharePoint, and SAP; expose capabilities through RESTful APIs (FastAPI).
Leverage AI-assisted development tools (Cursor, Claude Code, etc.) as core productivity multipliers - architect the 20% and let AI write the 80%, shipping in weeks what would otherwise take a team months.
Identify the next high-impact problem across sales, engineering, customer support, and operations; prototype fast, ship to production, and move on to the next one.

Requirements

Bachelor's degree in Computer Science, Electrical Engineering, or a related field, with 8+ years of software engineering experience (what you have shipped matters far more than years counted); a Master's degree is a plus.
Demonstrated AI-assisted development proficiency - show us projects where you used Cursor, Claude Code, or similar AI tools to 10x your output; this is a hard requirement, not a nice-to-have.
Hands-on experience building RAG systems: vector databases, embedding models, reranking pipelines, and hybrid search architectures.
Strong Python proficiency for backend development, LLM orchestration, and automation; experience with FastAPI or similar frameworks.
Experience with LLM deployment and inference optimization (vLLM, SGLang, or equivalent); comfortable with Linux, Docker, and GPU infrastructure basics (CUDA).
Self-directed and comfortable with ambiguity: "make our sales team more productive with AI" is enough to get started.
Experience with web scraping and data ingestion pipelines (Playwright, Selenium, Crawl4AI, etc.) is a plus.
Familiarity with agent frameworks (LangChain, LangGraph) and document processing tools (PDF extraction, OCR) is a plus.
Experience with enterprise data sources (SharePoint, MSSQL, SAP) and database design (PostgreSQL, MySQL, MongoDB) is a plus.
LLM fine-tuning experience (LoRA, PEFT, Hugging Face) and frontend basics (React, Vue) are a plus.

Benefits & conditions

$170,000 - $190,000

The salary offered will depend on several factors, including your location, level, education, training, specific skills, years of experience, and comparison to other employees already in this role. In addition to a comprehensive benefits package, candidates may be eligible for other forms of compensation, such as participation in bonus and equity award programs.

About the company

Supermicro® is a Top Tier provider of advanced server, storage, and networking solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, Hyperscale, HPC, and IoT/Embedded customers worldwide. We are the #5 fastest growing company among the Silicon Valley Top 50 technology firms. Our unprecedented global expansion has provided us with the opportunity to offer a large number of new positions to the technology community. We seek talented, passionate, and committed engineers, technologists, and business leaders to join us.
