Field Hardware Engineer, HPC
Job description
- Direct impact on scale: you'll restore service on complex incidents and raise the bar on reliability as we grow.
- Enable breakthrough AI: your work enables science and engineering teams to deliver state-of-the-art AI.
What you will do
- Lead complex interventions: plan and execute vendor-level or multi-node operations (e.g., full rack work, intricate recabling, post-restart diagnosis), own risk assessment/rollback, and coordinate with vendors (RMA/escalations).
- Advanced diagnostics: correlate symptoms across compute, storage, interconnect, cooling; read system indicators (LED/POST/beep), BMC/IPMI consoles, and logs to identify root causes.
- Guide and uplift L1s: coach on safe practices (ESD/LOTO), first-line triage, rack craftsmanship, documentation quality; pair on tricky procedures. (No people management.)
- Process & automation: improve SOPs/checklists; propose/build small automation (Python/Bash) for photo/serial capture, inventory sync, dashboards/alerts; shorten MTTR.
- Safety & compliance: enforce lockout/tagout, ESD, PPE; ensure audit-ready tickets, evidence and change traces.
- Parts & logistics (advanced): plan spares strategy, track failure trends, and drive proactive vendor actions.
Requirements
- 5+ years in data center/server hardware or L2/L3 hardware support, with proven complex hands-on work in production (HPC/AI/Cloud at scale).
- End-to-end hardware expertise: comfortable across CPU/memory/PCIe cards (incl. accelerators), NICs, PSUs, drives, network, power and cooling (including DLC); strong judgment on when/how to escalate.
- Diagnostics depth: confident in analyzing BMC/IPMI logs, Linux software logs and crashes, and in running simple CLI checks; methodical root cause analysis.
- Safety & discipline: impeccable ESD/LOTO/PPE habits; zero rough handling; clean, labeled, auditable work.
- Communication & mentoring: crisp status/handovers; able to coach L1s during live operations; able to provide technical documentation to L1s or other teams.
- Mobility: willing to travel between sites (Paris area or nearby regions, occasionally in Europe or the US).
Nice to have
- Vendor tools (iDRAC/iLO/IPMI), RAID/storage basics (NVMe/SAS/SATA), high-speed interconnect (Ethernet/InfiniBand).
- Coding/automation (Python/Bash) for small ops tools and reporting.
- Experience with ticketing (Jira/ServiceNow), inventory/RMA flows, vendor coordination.
Location & Remote
The position is based in our Paris HQ offices, and we encourage coming to the office as much as possible (at least 3 days per week) to create bonds and smooth communication. Our remote policy aims to provide flexibility, improve work-life balance and increase productivity. Each manager can decide the number of days worked remotely based on autonomy and specific context (e.g. more flexibility can occur during summer). In any case, employees are expected to maintain regular communication with their teams and be available during core working hours.
Benefits & conditions
Competitive salary and equity package
Health insurance
Transportation allowance
Sport allowance
Meal vouchers
Private pension plan
Generous parental leave policy