Software Engineer, Data Infrastructure & Acquisition - Phoenix, AZ, USA
Job description
The Software Engineer, Data Infrastructure & Acquisition is responsible for managing and enhancing data collection processes that fuel AI model training. This role contributes by sourcing and ingesting large volumes of audio data, optimizing cloud infrastructure, and collaborating to improve data quality and cost efficiency. The position plays a key part in shaping the dataset roadmap to advance next-generation AI products.
Responsibilities
- Identify and acquire new audio data sources for ingestion.
- Operate and develop cloud infrastructure for data ingestion pipelines on GCP using Terraform.
- Collaborate with scientists to optimize cost, throughput, and data quality.
- Work with the AI team and leadership to define the dataset strategy.
Requirements
- BS/MS/PhD in Computer Science or related field.
- Over 5 years of software development experience.
- Proficient in bash/Python scripting within Linux environments.
- Experience with Docker, Infrastructure-as-Code, and major cloud platforms (GCP preferred).
- Knowledge of web crawlers and large-scale data processing is a plus.
- Ability to manage multiple priorities and adapt as needed.
- Strong verbal and written communication skills.
Benefits & conditions
- The United States base salary range for this full-time position is $140,000-$200,000 plus bonus and equity, depending on experience.