Software Engineer, Data Infrastructure & Acquisition - Jersey City, NJ, USA
Role details
Job description
The Software Engineer, Data Infrastructure & Acquisition is a key contributor to the AI team's data collection efforts that support model training operations. This role focuses on developing and maintaining scalable data ingestion pipelines and cloud infrastructure to enable the creation of high-quality datasets at petabyte scale. The position collaborates closely with scientists and leadership to optimize data quality and efficiency, directly impacting the development of next-generation consumer and enterprise AI products.
Responsibilities
- Identify and acquire new sources of audio data for ingestion pipelines
- Operate and enhance cloud infrastructure for data ingestion using GCP and Terraform
- Collaborate with scientists to improve cost, throughput, and quality of data
Requirements
- BS/MS/PhD in Computer Science or a related discipline
- Minimum 5 years of software development experience
- Proficiency with bash and Python scripting in Linux environments
- Experience with Docker, Infrastructure-as-Code, and at least one major cloud provider (GCP preferred)
- Knowledge of web crawlers and large-scale data processing workflows is a plus
- Ability to manage multiple tasks and adapt to shifting priorities
- Strong written and verbal communication skills
Benefits & conditions
- The United States base salary range for this full-time position is $140,000-$200,000 plus bonus and equity, depending on experience