Systems Engineer (HPC/Server Farm)

STAFFING TECHNOLOGIES

San Jose, United States of America

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Intermediate

Job location

San Jose, United States of America

Tech stack

Amazon Web Services (AWS)

Data analysis

Bash

Cloud Computing

Linux

Perl

Python

Software Engineering

Scripting (Bash/Python/Go/Ruby)

Information Technology

Physical Design

Job description

Seeking a Server Farm Engineer to join our team to support, manage, and improve the compute farm environment. The candidate should have hands-on experience with cloud solutions and proven expertise in working directly with R&D software development teams to collaboratively develop solutions that optimize their working environment.

Supporting multiple geographical locations to serve user communities across sites in North America, Europe, and Asia.
Focusing on improving R&D productivity and committing to customer success.
Driving the overall operational strategy for internal High-Performance Compute (HPC) farms in all locations.
Developing and executing the three-year compute roadmap and planning annual capacity growth for the on-premises server farm in San Jose.
Operating, managing, and enhancing the internal compute farm and associated cloud (AWS).
Maintaining, enhancing, monitoring, and reporting on the compute farm, and improving its efficiency.

Requirements

30-year history of applying leading-edge optimization and analysis algorithms to highly complex problems in semiconductor and electronic design, verification, and analysis. We are looking for a recent graduate software engineer to join our team of collaborative EDA professionals to deliver the best-in-class next-generation software for physical IC applications. The software engineer will work on complex problems where data analysis requires an evaluation of intangible variance factors to develop leading-edge software for the physical design and verification of products at advanced nodes.

8+ years of technical experience architecting, managing, and improving a compute farm environment running Linux.
At least 5 years of direct hands-on experience in a global or regional compute farm and/or hybrid cloud environment consisting of 1,000 or more servers, with some remote direct reports.
At least 3 years working in a global group, coordinating support, strategies, projects, and operations across multiple geographies in a team-oriented approach.
Extensive technical experience managing IBM LSF and RTM and scripting in Python, shell, Perl, etc., in a farm environment; knowledge of LSF spanning farm to cloud is highly desirable.
Solid understanding and proven operational experience with compute farms, job submission/management technologies, cloud, and associated management tools.
Proven experience working directly with R&D software development teams to collaboratively develop solutions to optimize their working environment (direct EDA experience desired).
Proven experience in capacity and performance management: optimizing performance, ensuring adequate capacity, working with R&D on optimization of their workloads, and developing and maintaining key performance indicators.
A proven process focus shown through documentation, change management, incident management, and problem-resolution activities.

Education: BS / MS in computer science or related field

About the company

Each day with us offers exciting opportunities to create a better, more connected world. We are leading the charge to solve technology's toughest challenges. Working here means working alongside the industry's brightest people and innovating for the biggest, most innovative companies around the globe.
