Senior HPC and AI Cluster Administrator

Accenture
Tampa, United States of America
6 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$195K

Job location

Tampa, United States of America

Tech stack

Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Computing Platforms
Azure
Bash
Ubuntu (Operating System)
Continuous Integration
Dynamic Host Configuration Protocol
Linux
DNS
Ethernet
General Parallel File Systems
Hyper-V
InfiniBand
Job Scheduling
Python
Kernel-Based Virtual Machine
Routing
Network Protocols
Remote Direct Memory Access
Red Hat Enterprise Linux - RHEL
Transmission Control Protocol (TCP)
Workflow Management Systems
AI Infrastructure
Private Cloud Environment
Network Switches
Delivery Pipeline
Kubernetes
Infrastructure Automation Frameworks
Storage Technologies
Information Technology
Bare Metal
Slurm
ZFS File System
Docker
VMware

Job description

  • Design, deploy, and maintain HPC/AI clusters

  • Manage AI job workflows using various scheduling technologies, such as Kubernetes

  • Support and maintain continuous integration and delivery pipelines

  • Troubleshoot and fix issues from the bottom up: bare metal, operating system, software stack, and application level

  • Support research, development, and operational activities

Requirements

  • Bachelor's Degree in Computer Science, Engineering, or a related field; or equivalent experience

  • 5 years of experience in any of the following:
  • Knowledge of HPC and AI solution technologies, including hardware, hypervisors, CPUs, and GPUs.
  • Experience with workload scheduling and orchestration tools such as Slurm and Kubernetes (K8s).
  • Excellent knowledge of Linux (e.g., Red Hat, Ubuntu) internals and networking (routing, switching), ACLs, OS-level security protections, and common protocols such as TCP, DHCP, and DNS.
  • Experience with multiple storage solutions such as Lustre, GPFS, ZFS, and XFS. Familiarity with newer and emerging storage technologies.
  • Experience with automation and configuration management using tools such as Python and Bash within GitOps workflows.
  • Knowledge of networking protocols such as InfiniBand and Ethernet.
  • Experience with private cloud platforms (for example VMware, Hyper-V, KVM)
  • Familiarity with public cloud computing platforms (e.g. AWS, Azure)
  • Must possess and maintain required DoD 8140 certifications.

Ways to stand out from the crowd:

  • Knowledge of GPU architectures, time-slicing, and Multi-Instance GPU (MIG)

  • Experience with container and orchestration technologies (e.g., Docker, Kubernetes)

  • Experience designing and deploying AI workflow technologies such as Apache Airflow, Prefect, or Dagster

  • Background with RDMA (InfiniBand or RoCE) fabrics

  • Experience working in regulated industries and applying compliance requirements (e.g., DISA STIG, CIS)

  • NVIDIA Certifications (AI Infrastructure, AI Operations, AI Networking)

  • VMware Certifications (Certified Professional / Advanced Professional)

Clearance

  • An active TS/SCI federal security clearance is required

Benefits & conditions

Company rating: 3.7 out of 5 stars
Location: Tampa, FL
Salary: $118,300 - $195,100 a year

About the company

At Accenture Federal Services, nothing matters more than helping the US federal government make the nation stronger and safer and life better for people. Our 13,000+ people are united in a shared purpose to pursue the limitless potential of technology and ingenuity for clients across defense, national security, public safety, civilian, and military health organizations. Join Accenture Federal Services, a technology company within global Accenture. Recognized as a Glassdoor Top 100 Best Place to Work, we offer a collaborative and caring community where you feel like you belong and are empowered to grow, learn and thrive through hands-on experience, certifications, industry training and more. Join us to drive positive, lasting change that moves missions and the government forward!

AFS is looking for a Senior HPC and AI Cluster Administrator to support software and data solutions for our customers. We are integrating supercomputers and AI clusters based on existing technologies. We are looking for a system administrator to be a key player in enabling artificial intelligence and GPU computing solutions. You will work with many scientific researchers, developers, and customers to create improved workflows and develop unique solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms.

What We Believe

As a company wholly dedicated to serving the US federal government, we bring together the best talent to help reinvent how federal agencies operate and deliver greater value for their mission and the American people. We have an unwavering commitment to creating a culture in which all our people are respected, feel a sense of belonging, and have equal opportunity. As a business imperative, every person at Accenture Federal Services has the responsibility to create and sustain a culture where everyone feels welcomed and included. This is grounded in our core values and our experience that hiring and developing great people who reflect different perspectives, experiences, and backgrounds is key to driving innovation and delivering the results that our clients and the country count on.

Equal Employment Opportunity Statement

We believe that no one should be discriminated against because of their differences. All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by federal, state, or local law. Our rich diversity makes us more innovative, more competitive, and more creative, which helps us better serve our clients and our communities. For details, view a copy of the Accenture Federal Services Equal Opportunity Policy Statement. Accenture Federal Services is an Equal Employment Opportunity employer. Additionally, as an Affirmative Action Employer for Veterans and Individuals with Disabilities, Accenture Federal Services is committed to providing veteran employment opportunities to our service men and women.

Requesting An Accommodation

Accenture Federal Services is committed to providing equal employment opportunities for persons with disabilities or religious observances, including reasonable accommodation when needed. If you are hired by Accenture Federal Services and require accommodation to perform the essential functions of your role, you will be asked to participate in our reasonable accommodation process. Accommodations made to facilitate the recruiting process are not a guarantee of future or continued accommodations once hired. If you are being considered for employment opportunities with Accenture Federal Services and need an accommodation for a disability or religious observance during the interview process or for the job you are interviewing for, please speak with your recruiter.

Other Employment Statements

Applicants for employment in the US must have work authorization that does not now or in the future require sponsorship of a visa for employment authorization in the United States. Candidates who are currently employed by a client of Accenture Federal Services or an affiliated Accenture business may not be eligible for consideration. Job candidates will not be obligated to disclose sealed or expunged records of conviction or arrest as part of the hiring process. The Company will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. Additionally, employees who have access to the compensation information of other employees or applicants as a part of their essential job functions cannot disclose the pay of other employees or applicants to individuals who do not otherwise have access to compensation information, unless the disclosure is (a) in response to a formal complaint or charge, (b) in furtherance of an investigation, proceeding, hearing, or action, including an investigation conducted by the employer, or (c) consistent with the Company's legal duty to furnish information.
