AI and Deep Learning Architect - Model Compression and Quantization

Kneron, Inc.
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Tech stack

Algorithm Design
Computer Vision
C++
Nvidia CUDA
Computer Programming
Machine Learning
OpenCV
OpenCL
TensorFlow
Software Engineering
PyTorch
Caffe
Deep Learning
Parallel Computation
Information Technology

Job description

  1. Research and develop state-of-the-art model compression techniques for deep learning models, including quantization-aware training (QAT), model distillation, pruning, quantization, and model binarization.
  2. Implement novel deep neural network architectures and develop advanced training algorithms to support model structure training, automatic pruning, and low-bit quantization.
  3. Apply and optimize model compression and quantization techniques for a variety of models in computer vision, audio, and other applications.
  4. Research and optimize model compression and quantization techniques for the Kneron AI accelerator, and jointly optimize the hardware architecture for compressed models.
  5. Open to intern or full-time candidates.

Requirements

  1. M.S. or Ph.D. in Computer Science, Machine Learning, Mathematics, or a similar field (Ph.D. preferred).
  2. 3+ years of industry/academia experience with deep learning algorithm development and optimization.
  3. 3-5 years of software engineering experience in an academic or industrial setting.
  4. Research experience in model compression or quantization techniques, including model distillation, pruning, post-training quantization, quantization-aware retraining, model binarization, and neural architecture search (NAS).
  5. Experience analyzing model accuracy loss caused by compression and quantization is a strong plus, as is experience with noise modeling and noise analysis.
  6. Strong experience in C/C++ programming is a plus.
  7. Hands-on experience with computer vision and deep learning frameworks, e.g., OpenCV, TensorFlow, Keras, PyTorch, and Caffe.
  8. Ability to quickly adapt to new situations, learn new technologies, and collaborate and communicate effectively.
  9. Experience with parallel computing, GPU/CUDA, DSP, and OpenCL programming is a plus.
  10. A publication record at top-tier conferences, including but not limited to CVPR, ICCV, ECCV, NeurIPS (formerly NIPS), and ICML, is a strong plus.
