Reliability Engineer (Michigan)

Oracle

Austin, United States of America

15 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Intermediate

Job location

Ann Arbor, United States of America

Tech stack

Artificial Intelligence

Computerized Maintenance Management Systems

Data Centers

Oracle Applications

Program Design Languages

System Testing

Oracle Cloud Infrastructure

Job description

This position requires U.S. Citizenship and will be full-time on-site at Oracle's new Michigan AI data center, located in Saline Township, 35 miles southwest of Ann Arbor, Michigan. Relocation assistance may be available in accordance with Oracle's relocation policies.

As a Reliability Engineer - Data Center Facilities, NA, you will support the operational health, maintainability, and reliability of mission-critical facility systems across OCI's North America data center portfolio. This role contributes to commissioning readiness, maintenance program design, failure analysis, and technical support to Site Operations across electrical, mechanical, and associated controls systems. You will work cross-functionally with Site Operations, Design Engineering, Construction, Building Automation, Commissioning, and Reliability peers to help ensure critical infrastructure is supportable, reliable, and ready for sustained operations.

Tracks and monitors ongoing Data Center critical infrastructure maintenance and repair for all service lines against pre-defined service level agreements (SLAs). Manages incidents that impact Data Center infrastructure services and drives their proactive and timely resolution. Conducts site reviews and assessments to evaluate suitability for data center builds. Acts as the engineering representative on a wide range of moderately complex on-site scenarios related to mission-critical systems, operations, and functionality. Provides engineering insight to ensure projects and other design initiatives align with company expectations. Contributes to the identification of training programs for newer members of the team, acts as a subject matter expert on many standard systems, and trains others on the team.

Support reliability activities for critical electrical, mechanical, and controls-related infrastructure across assigned sites or programs.
Review commissioning and startup plans to ensure systems meet design intent and are operationally supportable at turnover.
Assist in developing maintenance programs that improve operability, reduce downtime, and balance lifecycle cost.
Analyze equipment performance, maintenance data, and operational trends to identify risks and improvement opportunities.
Support root cause analysis and corrective action development for reliability-related issues and recurring failures.
Partner with Site Operations to provide technical guidance during equipment failures, abnormal conditions, or troubleshooting efforts.
Review construction submittals, O&M documentation, and turnover materials to evaluate maintainability and operational readiness.
Support risk assessments, spare parts analysis, lifecycle planning, and end-of-useful-life considerations for critical assets.
Contribute feedback to Design Engineering teams on reliability, maintainability, and operating experience from live sites.
Help improve site response procedures, documentation quality, and repeatable reliability practices across the portfolio.

Requirements

3-5 years of experience in critical facilities, data center operations, industrial maintenance, commissioning, or reliability-related environments.
Working knowledge of mission-critical facility systems across electrical, mechanical, and/or controls domains.
Experience supporting maintenance planning, system testing, troubleshooting, or failure analysis in operational environments.
Bachelor's degree in Engineering or related field preferred; equivalent field experience also valued.

Skills and Competencies

Strong analytical and problem-solving capability.
Ability to work across multiple teams in a fast-paced environment.
Strong written and verbal communication skills.
Attention to detail and process discipline.
Ability to balance technical rigor with practical operational needs.

Preferred Skills / Certifications

Familiarity with CMMS, asset management systems, commissioning processes, or maintenance planning tools.
Exposure to RAM analysis, spare parts analysis, or lifecycle cost analysis.
Working knowledge of one-lines, P&IDs, sequences of operation, or controls architecture documentation.
Data center, utility, healthcare, semiconductor, telecom, or other uptime-critical experience is a plus.

Physical Demands / Work Environment

This role supports mission-critical data center environments where reliability, responsiveness, and execution discipline are essential. Travel may be required to support site reviews, turnover activities, incident follow-up, and cross-functional coordination. You must be able to walk sites, climb stairs, and work safely in active operational environments, with or without reasonable accommodation. The role may also involve occasional lifting of up to 25 pounds.

About the company

Oracle offers integrated suites of applications plus secure, autonomous infrastructure in the Oracle Cloud. For more information about Oracle (NYSE: ORCL), please visit us at www.oracle.com.

Our mission is to help people see data in new ways, discover insights, and unlock endless possibilities.
