Director, Compute Platforms & AI
Job description
We are seeking an experienced and hands-on Director to lead the strategy, architecture, and operational excellence of compute platforms across cloud, on-premises, and high-performance computing (HPC) environments. This role owns the direction and evolution of infrastructure supporting modern workloads, including distributed systems, development infrastructure, observability, and the practical adoption of generative AI capabilities across the organization.
Reporting to the Chief Information & Security Officer, this individual will serve as the senior technical leader for compute and AI within IT. The ideal candidate brings deep cross-domain expertise, strong operational judgment, and a track record of building, scaling, and stabilizing complex systems.
This role combines strategic leadership with hands-on technical engagement and cross-functional influence, enabling both current operations and future capabilities.
- Define and drive the strategy and architecture for compute platforms spanning AWS, Azure, and on-prem/HPC environments, including GPU-accelerated systems.
- Lead the evaluation and adoption of generative AI tools, identifying high-impact opportunities across engineering, IT, and business workflows.
- Partner with stakeholders to pilot and scale practical generative AI use cases, focusing on productivity, automation, and workflow improvement.
- Oversee and improve core cloud and HPC capabilities, including development infrastructure, observability, and overall system reliability.
- Drive performance optimization, capacity planning, and operational excellence in collaboration with cross-functional stakeholders.
- Lead resolution of complex system issues, including root-cause analysis and long-term remediation.
- Establish and enforce infrastructure standards, best practices, and technical direction across teams.
- Provide technical leadership, mentorship, and guidance to engineers and technical leads; build and scale a high-performing team as needed.
Requirements
- 10+ years of experience designing, building, and operating complex infrastructure systems across cloud and/or on-prem environments.
- Strong experience with development infrastructure, including CI/CD systems, artifact management, and developer workflows.
- Deep experience with observability systems, including metrics, logging, tracing, and alerting.
- Experience evaluating and implementing generative AI tools (e.g., LLM-based systems, copilots, automation agents) in real-world environments.
- Strong understanding of the operational, security, and cost considerations of generative AI adoption in an enterprise setting.
- Deep systems-level expertise in Linux, networking, storage, and distributed systems.
- Demonstrated ability to lead complex technical initiatives and drive outcomes across teams.
- Track record of building and leading high-performing teams in demanding environments.
- Strong communication and leadership skills, with the ability to influence both technical and executive stakeholders.
Benefits & conditions
The ranges below reflect the target base salary ranges for a new hire. One is for the Bay Area (within 50 miles of HQ in Palo Alto); the other, if applicable, is for elsewhere in the US (beyond 50 miles of HQ in Palo Alto). If only one range is listed, it applies to the specific location where the position will be based. Actual compensation may fall outside these ranges and depends on various factors, including but not limited to a candidate's qualifications, including relevant education and training, competencies, experience, geographic location, and business needs. Base pay is only one part of the total compensation package. Full-time roles are eligible for equity and benefits. Base pay is subject to change and may be modified in the future.
U.S. Base Pay Range: $185,000-$215,000 USD
Bay Area Pay Range: $215,000-$255,000 USD