AI Training Data Engineer
LiquidPixels, Inc.
Billerica, United States of America
2 months ago
Role details
Contract type: Internship / Graduate position
Employment type: Full-time (> 32 hours)
Working hours: Shift work
Languages: English
Job location: Billerica, United States of America
Tech stack
Training Data
JavaScript
Artificial Intelligence
Automation of Tests
Information Systems
Information Engineering
Data Integrity
JSON
Python
Scripting (Bash/Python/Go/Ruby)
Large Language Models
Information Technology
REST
Job description
- Collect, catalog, and organize rendering data across our platform using purpose-built internal tools
- Coordinate with internal teams and stakeholders to track down assets, configurations, and technical resources needed to complete the dataset
- Verify and validate asset references to ensure data integrity and completeness
- Generate performance and diagnostic data for each dataset entry
- Run data through our automated sanitization and normalization pipeline
- Populate and maintain a searchable, tagged repository of validated configurations
- Flag edge cases and anomalies that help improve our automated QA tooling
- Collaborate directly with senior leadership on pipeline design and process improvements
What Makes This Different
This isn't a "fetch coffee" internship. You'll be working alongside the team that invented dynamic imaging - people who have spent a quarter century pushing the boundaries of what's possible with real-time image rendering. You'll operate at the intersection of dynamic imaging, data engineering, and AI - three of the fastest-moving fields in tech. The dataset you help build will be used to train large language models to understand complex image rendering pipelines, something no one else in the industry is doing.
Requirements
- Pursuing a degree in Computer Science, Data Science, Information Systems, or a related field
- Excellent communication skills - you'll need to be proactive, persistent, and comfortable reaching out to people across the organization to get what you need
- Comfortable working in browser-based tools and navigating structured data
- Detail-oriented with strong organizational skills - this is precision work
- Familiarity with any of the following is a plus (but not required): JSON, REST APIs, image processing concepts, basic scripting (JavaScript or Python)
- Curious, self-directed, and not afraid to follow up until the job is done
About the company
About LiquidPixels
LiquidPixels invented dynamic imaging. With 25 years of continuous innovation, our LiquiFire platform powers real-time image rendering for some of the world's most recognized luxury and enterprise brands - delivering solutions seen by billions of people every day.
Now we're building something new: an AI/ML training pipeline. We need sharp, motivated talent to help make it happen.
The Role
We're looking for a Data Pipeline & QA Intern to play a critical hands-on role in preparing a large-scale training dataset built from our proprietary rendering engine. This isn't busy work - you'll be directly contributing to a groundbreaking system in one of the fastest-growing markets in tech.