Reliability Engineer (Michigan)

Oracle
Austin, United States of America
15 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

Ann Arbor, United States of America

Tech stack

Artificial Intelligence
Computerized Maintenance Management Systems
Data Centers
Oracle Applications
Program Design Languages
System Testing
Oracle Cloud Infrastructure

Job description

This position requires U.S. Citizenship and will be full-time on-site at Oracle's new Michigan AI data center, located in Saline Township, 35 miles southwest of Ann Arbor, Michigan. Relocation assistance may be available in accordance with Oracle's relocation policies.

As a Reliability Engineer - Data Center Facilities, NA, you will support the operational health, maintainability, and reliability of mission-critical facility systems across OCI's North America data center portfolio. This role contributes to commissioning readiness, maintenance program design, failure analysis, and technical support to Site Operations across electrical, mechanical, and associated controls systems. You will work cross-functionally with Site Operations, Design Engineering, Construction, Building Automation, Commissioning, and Reliability peers to help ensure critical infrastructure is supportable, reliable, and ready for sustained operations.

Tracks and monitors ongoing Data Center critical infrastructure maintenance and repair for all service lines against pre-defined service level agreements (SLAs). Manages incidents that impact Data Center infrastructure services and drives their proactive and timely resolution. Conducts site reviews and assessments to evaluate suitability for data center builds. Acts as the engineering representative on a wide range of moderately complex on-site scenarios related to mission-critical systems, operations, and functionality. Provides engineering insight to ensure projects and other design initiatives align with company expectations. Contributes to the identification of training programs for newer members of the team, acts as a subject matter expert on many standard systems, and trains others on the team.

  • Support reliability activities for critical electrical, mechanical, and controls-related infrastructure across assigned sites or programs.
  • Review commissioning and startup plans to ensure systems meet design intent and are operationally supportable at turnover.
  • Assist in developing maintenance programs that improve operability, reduce downtime, and balance lifecycle cost.
  • Analyze equipment performance, maintenance data, and operational trends to identify risks and improvement opportunities.
  • Support root cause analysis and corrective action development for reliability-related issues and recurring failures.
  • Partner with Site Operations to provide technical guidance during equipment failures, abnormal conditions, or troubleshooting efforts.
  • Review construction submittals, O&M documentation, and turnover materials to evaluate maintainability and operational readiness.
  • Support risk assessments, spare parts analysis, lifecycle planning, and end-of-useful-life considerations for critical assets.
  • Contribute feedback to Design Engineering teams on reliability, maintainability, and operating experience from live sites.
  • Help improve site response procedures, documentation quality, and repeatable reliability practices across the portfolio.

Requirements

  • 3-5 years of experience in critical facilities, data center operations, industrial maintenance, commissioning, or reliability-related environments.
  • Working knowledge of mission-critical facility systems across electrical, mechanical, and/or controls domains.
  • Experience supporting maintenance planning, system testing, troubleshooting, or failure analysis in operational environments.
  • Bachelor's degree in Engineering or related field preferred; equivalent field experience also valued.

Skills and Competencies

  • Strong analytical and problem-solving capability.
  • Ability to work across multiple teams in a fast-paced environment.
  • Strong written and verbal communication skills.
  • Attention to detail and process discipline.
  • Ability to balance technical rigor with practical operational needs.

Preferred Skills / Certifications

  • Familiarity with CMMS, asset management systems, commissioning processes, or maintenance planning tools.
  • Exposure to RAM analysis, spare parts analysis, or lifecycle cost analysis.
  • Working knowledge of one-lines, P&IDs, sequences of operation, or controls architecture documentation.
  • Data center, utility, healthcare, semiconductor, telecom, or other uptime-critical experience is a plus.

Physical Demands / Work Environment

This role supports mission-critical data center environments where reliability, responsiveness, and execution discipline are essential. Travel may be required to support site reviews, turnover activities, incident follow-up, and cross-functional coordination. You must be able to walk sites, climb stairs, and work safely in active operational environments, with or without reasonable accommodation. The role may also involve occasional lifting of up to 25 pounds.

About the company

 Oracle offers integrated suites of applications plus secure, autonomous infrastructure in the Oracle Cloud. For more information about Oracle (NYSE: ORCL), please visit us at www.oracle.com.

Our mission is to help people see data in new ways, discover insights, and unlock endless possibilities.