Data Scientist (Infrastructure Diagnostics)
Job description
As a Data Scientist on this team, you are the "bridge" between raw infrastructure telemetry and actionable operational intelligence. You don't just see numbers; you possess a deep mechanical and electrical empathy that allows you to read a system's data as a doctor reads a patient's chart. You are highly inquisitive, approaching complex anomalies without bias to uncover true root causes.
This is not a "hide behind the keyboard" role. You thrive in a hands-off environment, treating your space with a sense of proactive ownership and treating your peers as fellow experts. You are a truth-teller who uses thorough, compassionate communication to persuade others and drive impact in the high-stakes world of data center uptime. We expect our Data Scientists to arrive ready to own their space and contribute to the team's collective success immediately.
- We are seeking a U.S.-based team member (Pacific Time Zone preferred) with flexibility to work hours that overlap with APAC time zones as needed.
You will utilize our existing data ingestion and delivery platforms to "teach" our models to understand the physical world, filling a critical expertise gap in the data center industry.
- Multidisciplinary Diagnostic Analysis: Use telemetry tools to analyze sensor data across mechanical (chillers, pumps) and electrical (UPS, switchgear, power feeds) systems to identify "failure signatures" for our LLM-driven monitoring tool.
- Refining the Logic Engine: Act as a primary user of our platforms, identifying gaps in our current mechanisms and collaborating with Engineering to influence future features and data quality.
- Operational Insight Generation: Translate raw telemetry into the "SME-level" logic and directions used by our LLM tool to guide data center operators in real-time.
- SME Development: Cultivate deep domain expertise in all facets of data center infrastructure. You will be expected to master the nuances of both mechanical and electrical dependencies to ensure our product reflects operational reality.
- Customer Guidance: Move from shadowing peers to directly supporting customers, using our platform to provide clear, data-backed direction on complex problems.
- Model Validation: Oversee pilot projects to test how our AI-driven SME tool interprets real-world stressors, ensuring the output is operationally realistic, accurate, and actionable.
- Adaptability: Remain agile and proactive. As a member of a fast-moving team, you will encounter challenges and scopes not explicitly defined here; we expect you to lean in and solve them.
In your first 30 days…
- Familiarize yourself with the company handbook and team roadmap.
- Review existing system ontologies and sensor data structures across both mechanical and electrical domains.
- Shadow senior team members during customer diagnostic reviews to understand the "voice" of the SME.
In your first 60 days…
- Build full proficiency in our internal data tools and analysis workflows.
- Identify failure signatures in customer data with peer guidance and begin automating detection logic.
- Identify at least one gap in our current tooling and propose a logic-based solution to Engineering.
In your first 90 days…
- Provide direct guidance to customers on anomalies with peer support, moving toward full self-sufficiency.
- Contribute to the refinement of the LLM "instruction set" for cross-disciplinary diagnostics.
- Present a post-incident analysis correlating telemetry to a real-world root cause to the broader product team.
We take a thoughtful and intentional approach to remote collaboration. Inspired by pioneers like GitLab, we embrace proven best practices to foster an exceptional remote work environment. Our culture is documentation-first, and we prioritize asynchronous communication to support focus and flexibility across time zones. While we value independence, we stay closely connected through tools like Slack and video conferencing. Weekly all-hands meetings help us align and build strong relationships, and we regularly host virtual team-building activities and social events to maintain a sense of camaraderie.
We create intelligent control systems that maximize the performance of large industrial facilities.
Requirements
- Educational Background: Bachelor's degree in Mechanical Engineering, Electrical Engineering, Control Theory, or a related field that provides a foundation in physical systems and thermodynamics.
- Analytical Grit: A deep, innate interest in using data to diagnose how and why systems fail. You are a "tinkerer" who prefers solving real-world problems over theoretical research.
- Technical Proficiency: Strong Python skills and experience with data manipulation libraries (Pandas/NumPy) to perform custom analysis outside of standard tooling.
- Communication Mastery: Ability to explain complex diagnostic findings clearly and persuasively to both technical peers and non-domain stakeholders.
- Unbiased Problem-Solving: A proven ability to look at a problem without preconceived notions and figure out solutions either independently or via team collaboration.
- Alignment with Values: Demonstrated commitment to Transparency, Collaboration, and Ownership, especially in environments where reliability and learning from failure are paramount.
Preferred Skills & Experience
- Infrastructure Exposure: Experience with critical infrastructure components (HVAC, power distribution, or industrial automation).
- Industrial IoT: Experience with time-series data from industrial sensors (SCADA, BMS, Smart Meters).
- AI/LLM Curiosity: Exposure to or a strong interest in how LLMs can be used for root-cause analysis and automated reporting.
Benefits & conditions
US Residents:
- Tier 1 (Largest, highest-cost metros): $119,200 - $163,900
- Tier 2 (Other major metros): $113,240 - $155,705
- Tier 3 (Mid-sized metro areas): $107,280 - $147,510
- Tier 4 (All other locations): $101,320 - $139,315
In addition to base salary, this position is eligible for equity.
Final salary will be determined based on several factors, including a candidate's qualifications, skills, competencies, experience, expertise, education, and location. In some cases, final compensation may fall outside the posted range. Salary ranges are regularly reviewed and may be adjusted in response to market trends.
Benefits & Perks
- Fast-paced, team-oriented environment where your work directly shapes the company's direction.
- We are a 100% remote company.
- Competitive compensation & meaningful equity.
- Outsized responsibilities & professional development.
- Training is foundational: we provide functional, customer-immersion, and development training.
- Medical, dental, and vision insurance (exact benefits vary by region).
- Unlimited paid time off, with a required minimum of 20 days per year.
- Paid parental leave (exact benefits vary by region).
- Flexible stipends to support your workspace, well-being, and continued professional development.
- Company MacBook.