AI Engineer
Role details
Job description
We are seeking a skilled and innovative AI Engineer with hands-on experience in building and optimizing voice models. In this role, you will work on developing, training, and refining AI models for voice synthesis, voice cloning, speech recognition, and/or voice transformation. Your work will contribute to cutting-edge applications in conversational AI, voice assistants, and generative audio.
- Developed and optimized text-to-speech models that achieved human-like voice synthesis, maintaining the unique style of voice actors across multiple languages.
- Implemented real-time processing solutions that reduced inference time to under 1 second, enhancing user interaction and experience.
- Managed large-scale datasets for voice cloning projects, ensuring high performance and reliability while supporting multilingual transcriptions.
Key Responsibilities
- Design, develop, and fine-tune deep learning models for voice synthesis (e.g., TTS, voice cloning).
- Implement and optimize neural network architectures such as Tacotron, FastSpeech, WaveNet, or similar.
- Collect, preprocess, and augment speech datasets.
- Collaborate with product and engineering teams to integrate voice models into production systems.
- Perform evaluation and quality assurance of voice model outputs.
- Research and stay current on advancements in speech processing, audio generation, and machine learning.
Requirements
Role Description: The AI Engineer must have 3+ years of experience. For this role, you must be a strong AI Engineer with experience assisting with the AI Voice project.
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field.
- Strong experience with Python and machine learning libraries (e.g., PyTorch, TensorFlow).
- Hands-on experience with speech/audio processing and relevant toolkits (e.g., Librosa, ESPnet, Kaldi).
- Familiarity with voice model architectures (TTS, ASR, vocoders).
- Understanding of deep learning concepts and model training processes.
Preferred Qualifications:
- Experience with deploying models to real-time applications or mobile devices.
- Knowledge of data labeling, voice dataset creation, and noise handling techniques.
- Experience with cloud-based AI/ML infrastructure (e.g., AWS, GCP).
- Contributions to open-source projects or published papers in speech/voice-related domains.
Education: Bachelor's degree
Experience: Minimum 3 years of experience
Benefits & conditions
We offer a range of coverage options and additional benefits to choose from:
- Medical, Dental (including Ortho) & Vision Insurance (option to enroll).
- Paid Leave (where applicable).