Middleware Development Engineer
Job description
Join Intel's Communication Runtimes team to shape the future of High-Performance Computing (HPC) and Artificial Intelligence (AI). As a Middleware Development Engineer, you will design, build, and optimize software communication libraries that enable breakthrough scientific discoveries and machine learning innovations at scale. This role offers the opportunity to work alongside top scientists and engineers on projects that drive impactful advancements in climate modeling, drug discovery, and AI systems.
You will contribute to the success of the new Argonne AI Center of Excellence, a partnership between Intel and Argonne National Laboratory. The role focuses on identifying performance bottlenecks and functional limitations in Intel's oneCCL collective communications library when running key AI workloads at large scale. After identifying the key issues that affect functionality and performance, you will help design and implement performance optimizations and other improvements to oneCCL that maximize the utility of the system.
You will work as part of a larger team that develops other next-generation communication libraries, such as Intel SHMEM and Intel MPI, ensuring exceptional performance in distributed computing environments. Your innovative work will optimize performance across Intel's cutting-edge GPUs and CPUs, empowering HPC and AI applications to achieve low latency, high bandwidth, and maximum reliability.
Key Responsibilities
- Identify performance bottlenecks and additional features necessary to run Argonne AI COE workloads.
- Optimize runtime software for distributed computing systems, ensuring optimal latency and bandwidth.
- Collaborate with cross-functional teams to define technical specifications and software requirements.
- Troubleshoot and resolve complex issues across multiple hardware and software stack layers.
- Contribute to software innovations that enhance HPC and AI capabilities at unprecedented scale.
- Partner with engineering and architecture teams to maximize performance on Intel architectures.
At the Data Center Group (DCG), we're committed to delivering exceptional products and delighting our customers. We offer both broad-market Xeon-based solutions and custom x86-based products, ensuring tailored innovation for diverse needs across general-purpose compute, web services, HPC, and AI-accelerated systems. Our charter encompasses defining business strategy and roadmaps, product management, developing ecosystems and business opportunities, delivering strong financial performance, and reinvigorating x86 leadership. Join us as we transform the data center segment through workload-driven leadership products and close collaboration with our partners.
Requirements
You must possess the minimum qualifications below to be initially considered for this position. Preferred qualifications are in addition to the requirements and are considered a plus factor in identifying top candidates.
Minimum qualifications:
- Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, Mathematics, or STEM-related field with 3+ years of experience in software development. OR
- Master's degree in Computer Science, Computer Engineering, Electrical Engineering, Mathematics, or STEM-related field with 1+ years of experience in software development. OR
- Ph.D. in Computer Science, Computer Engineering, Electrical Engineering, Mathematics, or STEM-related field with 3+ months of experience in software development.
- 3+ years of experience in at least one of the following:
  - Distributed computing systems.
  - HPC communication libraries (for example, MPI, SHMEM, or oneCCL/NCCL).
  - GPU software development.
  - Network communication stack development.
Preferred qualifications:
- Advanced degree (Master's or Ph.D.) in Computer Science, Computer Engineering, Electrical Engineering, Mathematics, or STEM-related field.
- Proficiency in C and C++ programming.
- Experience developing in Linux environments.
- Background in multithreaded programming.
- Experience in runtime performance optimization, improving communications latency or throughput.
- Background in developing software for GPUs and collective communication libraries.
- Strong analytical skills and ability to solve complex software challenges.
- Passion for driving meaningful advancements in scientific computing.
Benefits & conditions
We offer a total compensation package that ranks among the best in the industry. It consists of competitive pay, stock bonuses, and benefit programs that include health, retirement, and vacation. Find out more about the benefits of working at Intel (https://intel.wd1.myworkdayjobs.com/External/page/1025c144664a100150b4b1665c750003).
Annual salary range for jobs that could be performed in the US: $111,030.00-$211,200.00 USD