Software Development Engineer, Neuron Foundation Tools
Role details
Job description
AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1 servers that use them.
As a Software Development Engineer on the Neuron Foundation Tools team, you will work alongside a team of engineers to develop and maintain high-performance monitoring and profiling tools for machine learning applications and AI accelerators. You will work on the design, development, and deployment of the Neuron Profiler and other Neuron tools. The profiler plays a crucial role for internal and external customers in optimizing AI workloads on hardware platforms such as Trainium and Inferentia devices: it provides deep insight into performance bottlenecks and system behavior, which helps improve the performance of ML kernels and ML frameworks.
In this role, you will manage the full development life cycle of the Neuron Profiler/Tools toolchain, ensuring scalability, reliability, and usability. You will collaborate with cross-functional teams to ensure that our C++ compiler and runtime generate the key information customers need to understand and optimize performance on our custom hardware. Additionally, you will drive innovations that allow the profiler to support multiple frameworks, such as PyTorch, JAX, and XLA.
A successful candidate will have an established background in building AI/ML and performance analysis tools. Experience with ML-specific profiler tools (like PyTorch Profiler or TensorFlow Profiler) is highly desirable, along with direct customer-facing experience and a strong motivation to achieve results.
A day in the life
You will work with executive leadership and other senior management and technical leaders to define product directions and deliver them to customers. We build massive-scale distributed training and inference solutions. This organization builds the full stack of software, servers, and chips to accelerate machine learning workloads at the highest scale.
About the team Inclusive Team Culture
Requirements
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Experience programming with at least one software programming language
- 3+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Bachelor's degree in computer science or equivalent
Benefits & conditions
- AD&D insurance
- Parental leave
- Health insurance
- 401(k) matching
- Paid time off
- Vision insurance
- Dental insurance
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
- USA, CA, Cupertino: 165,200.00 - 223,600.00 USD annually
- USA, WA, Seattle: 143,700.00 - 194,400.00 USD annually