Performance Research Engineer (multiple levels)

Efficient Computer

San Jose, United States of America

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Compensation

$180K to $250K

Job location

San Jose, United States of America

Tech stack

Artificial Intelligence

Audio Signal Processing

C++

Profiling

Nvidia CUDA

Computer Engineering

Software Debugging

Data Flow Control

Machine Learning

Reduced Instruction Set Computing

Signal Processing

Software Engineering

System Programming

Video Editing

Parallel Computation

Information Technology

Job description

We are seeking Performance Research Engineers (staff to principal levels) to join our growing team. Efficient's Performance Research Engineers research new optimization techniques; design new tools for AI-assisted optimization and performance analysis; collaborate with our architecture team on the design of future hardware; design and run modeling experiments using our architecture simulator; and collaborate with our world-class compiler team by evaluating code-generation quality, suggesting new intrinsics, influencing the ISA, and experimenting with new language extensions. This is an applied research role in which you will drive the integration of these new techniques, tools, and language extensions into our existing performance libraries.

Requirements

Hands-on software development experience working closely with hardware, including exposure to at least two RISC, DSP or GPU platforms.
A passion for understanding and addressing performance issues that are unique to our Fabric dataflow architecture.
Experience with framework and library design, particularly within resource-constrained and real-time environments.
Experience with CUDA, HIP and/or other parallel programming models.
The ability to lead and work independently.
A collaborative spirit, with the ability to work with and influence multiple engineering teams.
Demonstrated ability to write, debug, and maintain low-level, systems-level C/C++ code, as well as to design clean interfaces and modular code.
Active use of AI tools to generate, optimize, and debug code.
Familiarity with low-level programming interfaces, e.g. PTX, LLVM IR and/or MLIR.
Experience working with HW simulation environments.
Domain expertise in three or more of the following areas: Linear Algebra, ML, Image Processing, Video Processing, Signal Processing, Audio Processing, software-defined radio (SDR), real-time programming, or Robotics.
Background in performance profiling, benchmark design, or comparative hardware analysis.
Excellent written, verbal, analytical, and technical communication skills, with the ability to clearly document complex systems, lead cross-team discussions, and drive consensus.
Minimum Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field, or equivalent work experience. PhD preferred.
Some experience working on compiler development.
Some experience working with high-performing HW architecture teams.

Benefits & conditions

Paid parental leave
401(k) matching

We offer a competitive salary for this role, generally ranging from $180,000 to $250,000, along with meaningful equity and comprehensive benefits. The final compensation package will be based on your experience and location, with some flexibility to ensure we align with the right candidate.

Why Join Efficient?

Efficient offers a competitive compensation and benefits package, including 401(k) match, company-paid benefits, an equity program, paid parental leave, and flexibility. We are committed to personal and professional development and strive to grow together as people and as a company.

About the company

Efficient is developing the world's most energy-efficient general-purpose computer processor. Efficient's patented technology uses 100x less energy than state-of-the-art, commercially available ultra-low-power processors and is programmable using standard high-level programming languages and AI/ML frameworks. This level of efficiency makes perpetual, pervasive intelligence possible: run AI/ML continuously on a AA battery for 5-10 years. Our platform's unprecedented level of efficiency enables IoT devices to intelligently capture and curate first-party data to drive the next major computing revolution.
