AI Performance & Benchmarking Engineer
Key2Source INC
Charlotte, United States of America
yesterday
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Job location: Charlotte, United States of America
Tech stack
Artificial Intelligence
Monitoring of Systems
Python
Load Testing
OpenShift
Performance Tuning
Graphics Processing Unit (GPU)
Performance Testing
Large Language Models
Grafana
Kubernetes
Job description
We are seeking an experienced AI Performance & Benchmarking Engineer with strong expertise in LLM performance testing, benchmarking, and infrastructure optimization. The ideal candidate has hands-on experience with GuideLLM, NVIDIA H200 GPUs, Locust, Kubernetes/OpenShift, and observability tools to evaluate and optimize AI workloads at scale.
- Design and execute AI/LLM performance and load testing strategies
- Benchmark model performance on NVIDIA H200 infrastructure
- Develop scalable testing frameworks using Locust and Python
- Monitor system performance using observability tools such as Grafana
- Optimize AI workloads running on Kubernetes/OpenShift environments
- Analyze bottlenecks and provide performance tuning recommendations
Requirements
- GuideLLM
- NVIDIA H200
- Locust
- Performance Benchmarking
- Load Testing
- Kubernetes & OpenShift
- Python
- Observability & Monitoring Tools