Software Engineer, Data Infrastructure & Acquisition - Ithaca, NY, USA
Job description
The Software Engineer, Data Infrastructure & Acquisition is responsible for managing all aspects of data collection that support AI model training. This role focuses on developing and maintaining scalable, cost-effective data pipelines and infrastructure to enable the creation of high-quality datasets at petabyte scale. The position plays a key role in collaborating with data scientists and leadership to develop data strategies that power next-generation AI products, contributing directly to the organization's ability to innovate and deliver impactful text-to-speech solutions.
Responsibilities
- Identify and integrate new audio data sources into the ingestion pipeline
- Operate and enhance cloud infrastructure for data ingestion, primarily on GCP using Terraform
- Collaborate with scientists to improve data quality, scale, and cost efficiency
Requirements
- BS, MS, or PhD in Computer Science or related field
- 5+ years of software development experience
- Proficiency in bash and Python scripting in Linux environments
- Experience with Docker, Infrastructure-as-Code, and at least one major cloud platform (preferably GCP)
- Familiarity with web crawlers and large-scale data processing workflows is a plus
- Ability to manage multiple priorities and adapt to change
- Strong written and verbal communication skills
Benefits & conditions
Pay Range and Compensation Package:
- United States base salary range: $140,000 to $200,000 plus bonus and equity, depending on experience
Equal Opportunity Statement: Our client is an equal opportunity employer. They celebrate diversity and are committed to creating an inclusive environment for all employees. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, or national origin.