Performance Research Engineer (multiple levels)

Efficient Computer
San Jose, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Compensation
$ 250K

Job location

San Jose, United States of America

Tech stack

Artificial Intelligence
Audio Signal Processing
C++
Profiling
Nvidia CUDA
Computer Engineering
Software Debugging
Data Flow Control
Machine Learning
Reduced Instruction Set Computing
Signal Processing
Software Engineering
System Programming
Video Editing
Parallel Computation
Information Technology

Job description

We are seeking Performance Research Engineers (staff to principal levels) to join our growing team. Efficient's Performance Research Engineers research new optimization techniques; design new tools for AI-assisted optimization and performance analysis; collaborate with our architecture team on the design of future hardware; design and perform modeling experiments using our architecture simulator; and collaborate with our world-class compiler team by evaluating code-generation quality, suggesting new intrinsics, influencing the ISA, and experimenting with new language extensions. This is an applied research role in which you will drive the integration of these new techniques, tools, and language extensions into our existing performance libraries.

Requirements

  • Hands-on software development experience working closely with hardware, including exposure to at least two RISC, DSP or GPU platforms.
  • A passion for understanding and addressing performance issues that are unique to our Fabric dataflow architecture.
  • Experience with framework and library design, particularly within resource-constrained and real-time environments.
  • Experience with CUDA, HIP and/or other parallel programming models.
  • The ability to lead and work independently.
  • A collaborative spirit, with the ability to work with and influence multiple engineering teams.
  • Demonstrated ability to write, debug, and maintain low-level, systems-level C/C++ code, as well as to design clean interfaces and modular code.
  • Active use of AI tools to generate, optimize, and debug code.
  • Familiarity with low-level programming interfaces, e.g. PTX, LLVM IR and/or MLIR.
  • Experience working with HW simulation environments.
  • Domain expertise in three or more of the following areas: Linear Algebra, ML, Image Processing, Video Processing, Signal Processing, Audio Processing, SDR, Real-Time Programming, or Robotics.
  • Background in performance profiling, benchmark design, or comparative hardware analysis.
  • Excellent written, verbal, analytical, and technical communication skills, with the ability to clearly document complex systems, lead discussions across teams, and drive consensus.
  • Minimum Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field, or equivalent work experience. PhD preferred.
  • Some experience working on compiler development.
  • Some experience working with high-performing HW architecture teams.

Benefits & conditions


  • Paid parental leave
  • 401(k) matching

We offer a competitive salary for this role, generally ranging from $180,000 to $250,000, along with meaningful equity and comprehensive benefits. The final compensation package will be based on your experience and location, with some flexibility to ensure we align with the right candidate.

Why Join Efficient?

Efficient offers a competitive compensation and benefits package, including a 401(k) match, company-paid benefits, an equity program, paid parental leave, and flexibility. We are committed to personal and professional development and strive to grow together as people and as a company.

About the company

Efficient is developing the world's most energy-efficient general-purpose computer processor. Efficient's patented technology uses 100x less energy than state-of-the-art, commercially available ultra-low-power processors and is programmable using standard high-level programming languages and AI/ML frameworks. This level of efficiency makes perpetual, pervasive intelligence possible: run AI/ML continuously on a AA battery for 5-10 years. Our platform's unprecedented level of efficiency enables IoT devices to intelligently capture and curate first-party data to drive the next major computing revolution.
