Server Technician
Job description
We are seeking a highly skilled Server Technician to architect, manage, and scale our on-premises, GPU-accelerated clustered virtualization environment. You will be responsible for orchestrating dense clusters of Red Hat Enterprise Linux (RHEL) servers that use NVIDIA vGPU technologies to support high-density computing workloads.
As a Server Technician, your responsibilities will include, but may not be limited to, the following:
· Solving complex IT infrastructure and server operational problems, both independently and as part of a team.
· Serving as systems maintainer.
· Interacting with customers and stakeholders to solicit feedback and manage expectations on resource availability.
You will provide technical support to a tight-knit group of scientists, software developers, and engineers. Your unique background and experience will play directly into the work you are executing each week, and you will be continually learning new technologies while working with your teammates.
Requirements
Do you have experience in vulnerability management?
Do you have a Bachelor's degree?
Are you motivated by developing software solutions that truly have an impact on safety and national security?
· B.S. in Information Technology, Computer Science, Computer Information Systems, or Systems and Network Administration
· Minimum of 5 years of professional experience with server room operations and processes.
Required expert knowledge in the following:
· Clustered Virtualization: Deploy, configure, and maintain large-scale hypervisor clusters (e.g., Red Hat OpenShift Virtualization, VMware ESXi, or KVM) optimized for hardware acceleration.
· vGPU Orchestration: Manage NVIDIA vGPU software stacks to partition and allocate resources across multiple high-performance virtual machines.
· Cluster Lifecycle Management: Automate the provisioning, patching, and scaling of RHEL compute nodes.
· Performance Tuning: Optimize storage networking and kernel parameters to eliminate bottlenecks between the CPU, memory, and GPUs.
· High Availability & DR: Design and implement failover strategies, load balancing, and disaster recovery protocols for clustered GPU workloads.
· Network Management: Support the management of a local area network for the onsite hardware and VMs, including the configuration of firewalls and switches. Experience with SonicWall and Cisco preferred.
· Telemetry & Monitoring: Build and maintain advanced monitoring dashboards using tools like Prometheus, Grafana, and NVIDIA-SMI to track cluster health, vGPU utilization, and thermal metrics.
· Server Room: Physical hardware maintenance and configuration, proper cabling practices and labeling, optimization of the physical layout, and cooling monitoring.
· Hosted Services: Management and expansion of hosted services such as Lightweight Directory Access Protocol (LDAP), Network File System (NFS), Domain Name System (DNS), etc.
Working knowledge of:
- Hosted services management and expansion (Atlassian tools, Fortify, FreeIPA, Nexus, Large Language Models)
- Vulnerability management processes
- Incident response processes and documentation
- System and log monitoring
· Eligibility (U.S. Citizenship) to undergo a background investigation for a United States Department of Defense security clearance.
· Effective communication skills (verbal and written)
Desired Skills & Experience:
· Active Secret clearance
· Mac system and application management
· Windows system and application management
· Android system and application management
· Cloud services management - Microsoft GCC High Cloud, Nessus
· 10+ years of professional experience with server room operations and processes.
We understand that candidates may not be able to check the boxes for all desired qualifications. What is most important is that candidates have exceptional problem-solving skills, creative out-of-the-box thinking, and are comfortable in an environment where you will be quickly learning, evaluating, and deploying new technologies.
Candidates must reside within commuting distance of the Louisville, Colorado office or be willing to relocate to the area.
Benefits & conditions
- Professional development assistance
- Tuition reimbursement
- 401(k)
- Health insurance
- 401(k) matching
- Paid time off
- Dental insurance
- Retirement plan
Application Question(s):
- Eligibility (U.S. Citizenship) to undergo a background investigation for a United States Department of Defense security clearance