Site Reliability Engineer, Factory Software

Tesla, Inc.
Fremont, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$140K - $210K

Job location

Fremont, United States of America

Tech stack

Java
Applications Architecture
Software Applications
Databases
DevOps
Middleware
GitHub
Python
Linux System Administration
Networking Basics
Routing
Reliability Engineering
Prometheus
Software Engineering
Virtual Local Area Networks
Virtualization Technology
Data Logging
Load Balancing
Grafana
Firewalls (Computer Science)
Git
Data Layers
Containerization
Kubernetes
Production Code
Splunk
Dynatrace
Docker

Job description

This role sits at the intersection of infrastructure (Kubernetes clusters, VMs, servers, and databases) and the software applications running on top of them. As an SRE on the Factory Software team, you will own the reliability of the full stack - from compute and data layers to the middleware connecting factory equipment (including PLCs) with MES systems and other services. You will implement advanced monitoring and observability to detect issues before they impact production. Your mission is to make both infrastructure and applications more reliable, observable, and standardized by catching speed bottlenecks, database contention, and infrastructure problems early while driving best practices and tooling consistency across teams.

What You'll Do

  • Design, implement, and evolve end-to-end observability and telemetry across services and infrastructure, including OpenTelemetry (OTel) instrumentation, logging, metrics, and distributed tracing
  • Build and maintain robust monitoring using Prometheus, Grafana, Tempo, and related tools to proactively detect speed bottlenecks, database contention, resource exhaustion, and infrastructure issues
  • Define and track SLIs, SLOs, and error budgets; standardize observability practices, golden signals, and tooling across engineering teams
  • Implement effective, low-noise alerting systems and drive strong incident response processes
  • Collaborate closely with Platform, Infrastructure, and Software Engineering teams to embed reliability and observability into the development lifecycle
  • Write production-grade code to reduce toil, automate operations, manage deployments, and treat infrastructure and reliability as a software engineering problem
  • Consult on infrastructure, systems, and application architecture with a reliability-first mindset
  • Participate in on-call rotations, live troubleshooting on NOC bridges/outage calls, and blameless post-mortems
  • Document solutions, create and maintain technical documentation, and actively mentor engineers across the organization

Requirements

  • 4+ years of experience in Site Reliability Engineering, Platform Engineering, DevOps, or a closely related systems role
  • Working knowledge of Kubernetes and hands-on experience with Docker/containerization or virtualization
  • Strong understanding of Observability concepts with practical experience using Prometheus, Grafana, Tempo, and/or Splunk
  • Expert-level Linux administration skills
  • Solid understanding of networking fundamentals (routing, switching, VLANs, firewalls, and load balancers)
  • Experience with Git and CI/CD pipelines (GitHub Actions is a plus)
  • Proficiency in at least one high-level language (Go, Python, or Java) with demonstrable experience writing production-grade code
  • Comfortable with on-call rotations and performing live troubleshooting during outages
  • Strong documentation habits and a track record of effectively sharing knowledge across teams
  • Strong bias for action - comfortable getting hands dirty, shipping solutions quickly, and learning from mistakes

Benefits & conditions

  • Flextime
  • Pet insurance
  • AD&D insurance
  • Health insurance
  • 401(k) matching
  • Employee discount
  • Health savings account

Along with competitive pay, as a full-time Tesla employee, you are eligible for the following benefits at day 1 of hire:

  • Medical plans, including plan options with $0 payroll deduction
  • Family-building, fertility, adoption and surrogacy benefits
  • Dental (including orthodontic coverage) and vision plans, both with options for a $0 paycheck contribution
  • Company-paid Health Savings Account (HSA) contribution when enrolled in the high-deductible medical plan with HSA
  • Healthcare and Dependent Care Flexible Spending Accounts (FSA)
  • 401(k) with employer match, Employee Stock Purchase Plans, and other financial benefits
  • Company-paid Basic Life and AD&D insurance
  • Short-term and long-term disability insurance (90-day waiting period)
  • Employee Assistance Program
  • Sick and vacation time (flex time for salaried positions, accrued hours for hourly positions) and paid holidays
  • Back-up childcare and parenting support resources
  • Voluntary benefits, including critical illness, hospital indemnity, accident insurance, theft & legal services, and pet insurance
  • Weight Loss and Tobacco Cessation Programs
  • Tesla Babies program
  • Commuter benefits
  • Employee discounts and perks program

Expected Compensation: $140,000 - $210,000 annual salary + cash and stock awards + benefits

Pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. The total compensation package for this position may also include other elements dependent on the position offered. Details of participation in these benefit plans will be provided if an employee receives an offer of employment.

Tesla is an Equal Opportunity / Affirmative Action employer committed to diversity in the workplace. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, age, national origin, disability, protected veteran status, gender identity or any other factor protected by applicable federal, state or local laws.

Tesla is also committed to working with and providing reasonable accommodations to individuals with disabilities. Please let your recruiter know if you need an accommodation at any point during the interview process.
