Staff AI Platform Engineer

CARPARTS INC.
Long Beach, United States of America
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$166,000 - $232,000 a year

Job location

Long Beach, United States of America

Tech stack

API
Artificial Intelligence
Amazon Web Services (AWS)
Azure
Continuous Integration
DevOps
GitHub
Identity and Access Management
Python
Log Analysis
Microsoft Message Queuing
Node.js
Release Management
Akamai
Workflow Management Systems
Autoscaling
Large Language Models
Multi-Agent Systems
Prompt Engineering
Containerization
Webpack
AI Platforms
Kubernetes
Performance Monitor
Build Tools
Front End Software Development
Virtual Agents
CloudWatch
Kibana
Jenkins

Job description

This is not a standard DevOps posting. We are looking for one unusually capable, AI-native engineer to own our entire platform engineering and SRE function - using autonomous agents, LLM-powered pipelines, and MCP-based tooling as force multipliers to do the work of a team, on-site, in close partnership with our engineering leadership.

You will inherit a mature, fully containerized AWS estate (9 EKS clusters, 27 accounts, 228 Kubernetes nodes), an Akamai CDN layer managing live traffic splits, GitHub Actions + Jenkins CI/CD pipelines for a Webpack 5 micro-frontend monorepo, and an operational AI agent platform - OpsWhisperer - already in production, monitoring 25 AWS accounts with a 91% autonomous resolution rate.

Your job is to extend all of it, automate what remains manual, and be the person who makes every deployment, incident, and infrastructure change happen with speed, precision, and intelligence.

SCOPE OF OWNERSHIP

What you'll own

AWS Multi-Account Infrastructure

  • EKS clusters across dedicated AWS accounts
  • EC2 worker nodes via Auto Scaling Groups
  • SQS pipelines
  • AWS Bedrock (Claude) for AI agent workloads

Kubernetes & Containerization

  • EKS clusters
  • Node group management
  • Kops clusters alongside EKS
  • Multiple environment tiers with full blast-radius isolation

CI/CD & Release Management

  • Multiple repositories
  • GitHub Actions workflows + Jenkins pipeline management
  • Turbo build system across multiple micro-frontend packages
  • Canary release gating and rollback automation

CDN & Traffic Management

  • Akamai Property Manager config
  • Phased Release Cloudlet for Canary and Production split
  • Security, Throttling and Monitoring
  • Jenkins-driven cache invalidation

Observability & Incident Response

  • Elastic/Kibana
  • CloudWatch across all AWS accounts
  • Business performance monitoring
  • SQS backlog + pipeline health alerting
  • On-call ownership with proactive, AI-assisted triage

AWS EKS · Kubernetes · Kops · AWS Organizations · Auto Scaling Groups · AWS SQS · AWS Bedrock · CloudWatch

CDN & Networking

Akamai Property Manager · Phased Release Cloudlet · Fast Purge · Content Protector

CI/CD & Frontend

GitHub Actions · Jenkins · Turbo (monorepo) · Webpack 5 Module Federation · Canary / Blue-Green Deployments

AI & Agentic

MCP (Model Context Protocol) · Claude API / AWS Bedrock · Azure Bot Service · Microsoft Entra ID · Operational AI Agents

Requirements

  • 10+ years of hands-on DevOps, SRE, or platform engineering experience in production AWS cloud environments
  • Deep AWS expertise: EKS, EC2, SQS, CloudWatch, IAM, Organizations, and multi-account architectures
  • Strong Kubernetes skills: cluster operations, node group management, workload isolation, taints/tolerations, auto-scaling
  • Experience with Akamai or equivalent enterprise CDN - configuration, purge operations, traffic routing rules
  • CI/CD ownership: GitHub Actions and/or Jenkins pipeline design, monorepo build systems, release gating
  • Production experience building or operating AI agents - LLM integration, autonomous workflow design, prompt engineering
  • Proficiency in Node.js and/or Python for automation, tooling, and MCP server development
  • Observability stack ownership: Elastic/Kibana, log analysis, alerting design, SLO/SLI instrumentation
  • Comfortable owning on-call responsibility for a production e-commerce platform with significant revenue exposure
  • Strong written and verbal communication - will interface with engineering leadership and present findings to executives
  • Based in or willing to relocate to the Los Angeles / Long Beach area for on-site work

Benefits & conditions

  • Salary: $166,000 - $232,000 a year
  • Location: 4910 Airport Plaza Drive, Long Beach, CA 90815
  • Opportunities for advancement

About the company

CarParts.com is the go-to eCommerce platform for auto care and maintenance. We provide drivers with quality parts at competitive prices and enable them to schedule appointments with trusted mechanics directly through our website. Using world-class design principles and the latest technologies, we deliver a fast, intuitive digital experience backed by our company-owned national distribution network. With over 1,000 employees worldwide, we are scaling rapidly, fueled by our most recent strategic partnership and $35 million investment. This positions us for the next phase of growth as we continue to empower drivers along their journey.

We've built Axle - CarParts.com's domain AI platform and winner of the MACH Alliance Impact Award for Best Multi-Agent Ecosystem - and we're expanding it. This role is central to that expansion.

Our Culture

At CarParts.com, our culture goes beyond our core values of Safety First, Customer Focused, and Commitment to Excellence. We are a performance-driven, data-focused, and fast-paced team where results matter and winning is expected.

  • Hungry & Hardworking: We set ambitious goals, measure progress with clear metrics, and hold ourselves accountable to deliver results.
  • Promote from Within: We reward top performers with opportunities for growth and advancement.
  • Collaborative & In-Person: We believe the best ideas and fastest execution happen face-to-face.
  • High Standards: We move quickly, pay attention to details, and dig deep - whether it's analyzing contracts, aggregating complex scenarios, or building clear, data-driven presentations.
  • No Passengers: We value grit, ownership, and the relentless pursuit of results.
