Sr. AI Systems Engineer

Super Micro Computer, Inc.
San Jose, United States of America

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$190K

Job location

San Jose, United States of America

Tech stack

Artificial Intelligence
Computer Clusters
Nvidia CUDA
Databases
Web Scraping
Cursor (AI coding tool)
Database Design
Linux
Elasticsearch
Python
PostgreSQL
Microsoft Office
Microsoft SQL Server
MongoDB
MySQL
Performance Tuning
Recommender Systems
SAP Applications
Selenium
SharePoint
Software Deployment
Software Engineering
Unstructured Data
Chatbots
React
Large Language Models
Multi-Agent Systems
Web Content
Backend
FastAPI
Vue.js
AI Platforms
Information Technology
HuggingFace
Playwright
Search Engines
Front End Software Development
Hardware Infrastructure
REST
Data Pipelines
Docker

Job description

Supermicro's AI Platform team is seeking a 100x AI Systems Engineer who ships entire systems, not just code. Our existing AI platform - 38,000+ documents, Agentic RAG, hybrid search, custom embedding and reranking services, multi-database integration - was built by one engineer in under one month. Another team estimated the same scope at 100 person-months. We are looking for another engineer with that same capacity.

This individual uses AI coding tools (Cursor, Claude Code, etc.) as daily drivers, owns the full stack from data pipeline to production deployment, and delivers working solutions - not PRs that need "a few more iterations." Given a problem, this individual ships. Given ambiguity, this individual asks the right questions, prototypes fast, and iterates. New domain? Ramp up fast. No existing solution? Build one. Blocked? Find a workaround or do it yourself.

The following responsibilities are integral to the Sr. AI Systems Engineer role (other duties may also be assigned):

  • Design and ship AI-powered systems end-to-end: RAG pipelines, agentic workflows, intelligent chatbots for customer support, AI-powered product recommendation engines, automated RFQ assistants, and sales enablement tools - the full roadmap, one system at a time.
  • Build and maintain hybrid search infrastructure combining vector databases (Qdrant, Chroma, Milvus) with keyword search (Elasticsearch/BM25), custom embedding services, and reranking pipelines.
  • Deploy and optimize LLM inference services using vLLM, SGLang, or equivalent frameworks on GPU clusters (H100, H200, GH200); expand and maintain GPU inference infrastructure as the platform grows.
  • Build document processing pipelines for large-scale ingestion of unstructured data - PDFs, Office documents, web content - covering extraction, chunking, contextual retrieval, and metadata enrichment.
  • Integrate AI systems with enterprise data sources including PostgreSQL, MSSQL, SharePoint, and SAP; expose capabilities through RESTful APIs (FastAPI).
  • Leverage AI-assisted development tools (Cursor, Claude Code, etc.) as core productivity multipliers - architect the 20% and let AI write the 80%, shipping in weeks what would otherwise take a team months.
  • Identify the next high-impact problem across sales, engineering, customer support, and operations; prototype fast, ship to production, and move on to the next one.

Requirements

  • Bachelor's degree in Computer Science, Electrical Engineering, or a related field, with 8+ years of software engineering experience, though what you have shipped matters far more than the years counted. A Master's degree is a plus.
  • Demonstrated AI-assisted development proficiency - show us projects where you used Cursor, Claude Code, or similar AI tools to 10x your output; this is a hard requirement, not a nice-to-have.
  • Hands-on experience building RAG systems: vector databases, embedding models, reranking pipelines, and hybrid search architectures.
  • Strong Python proficiency for backend development, LLM orchestration, and automation; experience with FastAPI or similar frameworks.
  • Experience with LLM deployment and inference optimization (vLLM, SGLang, or equivalent); comfortable with Linux, Docker, and GPU infrastructure basics (CUDA).
  • Self-directed and comfortable with ambiguity: "make our sales team more productive with AI" is enough to get started.
  • Experience with web scraping and data ingestion pipelines (Playwright, Selenium, Crawl4AI, etc.) is a plus.
  • Familiarity with agent frameworks (LangChain, LangGraph) and document processing tools (PDF extraction, OCR) is a plus.
  • Experience with enterprise data sources (SharePoint, MSSQL, SAP) and database design (PostgreSQL, MySQL, MongoDB) is a plus.
  • LLM fine-tuning experience (LoRA, PEFT, Hugging Face) and frontend basics (React, Vue) are a plus.

Benefits & conditions

$170,000 - $190,000

The salary offered will depend on several factors, including your location, level, education, training, specific skills, years of experience, and comparison to other employees already in this role. In addition to a comprehensive benefits package, candidates may be eligible for other forms of compensation, such as participation in bonus and equity award programs.

About the company

Supermicro® is a Top Tier provider of advanced server, storage, and networking solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, Hyperscale, HPC, and IoT/Embedded customers worldwide. We are the #5 fastest growing company among the Silicon Valley Top 50 technology firms. Our unprecedented global expansion has provided us with the opportunity to offer a large number of new positions to the technology community. We seek talented, passionate, and committed engineers, technologists, and business leaders to join us.
