Sr. Software Engineer - Data, Siri Speech

Apple Inc.
Cupertino, United States of America
10 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Cupertino, United States of America

Tech stack

Artificial Intelligence
Unit Testing
Continuous Integration
Information Engineering
Data Warehousing
Distributed Computing Environment
Distributed Data Store
Python
Machine Learning
Natural Language Processing
Software Engineering
Speech Recognition
Data Processing
Chatbots
Large Language Models
Spark
Siri
Backend
Information Technology
Dask

Job description

Want to join the team pushing the boundaries of AI and building an intelligent assistant that helps millions of people get things done? Join the Siri team at Apple. To build the best speech recognition and generation models, we need to use the latest technology in distributed training and the best available data. We combine those needs into one team and are focused on blurring the lines between traditional "data processing" and "model training". Efficiently training on petabytes of audio data pushes us to consider the entire training stack while developing new models to extract useful signals from unprecedented volumes of data.

The Siri Speech team is looking for exceptional individuals to extend the core technology that lets Siri understand, learn, and remember. You will be part of a cross-functional team consisting of software engineers as well as data and machine learning engineers/scientists, and you will have a large impact on the Siri product. This is a rare opportunity to apply distributed data engineering techniques at the intersection of areas such as speech recognition, natural language processing, and dialogue management.

In this role you will:

  • Implement backend tools for Speech data warehouses, including cataloging the entire collection of Speech Data
  • Automate speech data annotation that runs on a self-serve platform
  • Deploy and implement LLM-based chatbots to make the unified speech warehouse queryable and actionable (such as derived dataset creation) via natural language
  • Automate onboarding of new speech datasets from various sources onto a unified speech warehouse for easier discoverability and inclusion in training and evaluation of Siri
  • Collaborate with other Data and infrastructure teams across Apple to implement querying and speech dataset creation improvements

Requirements

  • Deep expertise in Python software development, CI/CD, unit and integration testing
  • Experience with distributed data processing tools and frameworks (Beam, Spark, Dask, Ray)
  • Strong software engineering abilities in Python
  • M.S. or Ph.D. degree in Computer Science, or equivalent experience
  • Strong data engineering background in the speech and/or language, text, or dialogue processing field
  • Speech and/or Machine Learning experience a plus
  • Real passion for building research demo prototypes of data solutions and turning them into production-quality designs and implementations
  • Strong interpersonal skills to work well with engineering teams
  • Excellent problem-solving and critical-thinking skills
  • Ability to work in a fast-paced environment with rapidly changing priorities
  • Passionate about building extraordinary products and experiences for our users
