Part-Time Data Ingest Engineer (contractor)

Digital Public Library of America, Inc.
Boston, United States of America
7 days ago

Role details

Contract type
Contract
Employment type
Part-time (≤ 32 hours)
Working hours
Flexible scheduling
Languages
English
Compensation
$75 - $150 per hour

Job location

Boston, United States of America

Tech stack

API
Amazon Web Services (AWS)
Apache HTTP Server
Continuous Integration
Elasticsearch
GitHub
JSON
PostgreSQL
Metadata
Metadata Standards
Scala
Scripting (Bash/Python/Go/Ruby)
Delivery Pipeline
Spark
Amazon EMR
Backend
Avro
REST
Terraform
Docker

Job description

DPLA is looking for a part-time contractor to coordinate and maintain metadata ingest operations. This position is directly involved in maintaining DPLA's ingestion process: harvesting, mapping, enriching, and indexing metadata from contributing partners.

What you'll be doing

  • Running monthly ingest cycles across active partner contributions (harvesting, mapping, enrichment, indexing)
  • Coordinating with DPLA staff on metadata mapping and delivery
  • Monitoring pipeline reliability and addressing bottlenecks or single points of failure
  • Troubleshooting ingestion errors and coordinating resolution with DPLA staff
  • Supporting deployments and maintaining CI/CD pipeline health
  • Providing regular status updates to DPLA staff

Technical environment

  • Pipeline: Scala, Apache Spark, Amazon EC2 and EMR, AWS S3, Apache Avro, Python scripts
  • Metadata: JSON-LD via DPLA MAP
  • APIs: Scala-based RESTful API on Elasticsearch 7, PostgreSQL auth backend
  • CI/CD: GitHub Actions, Docker, Terraform, AWS CodePipeline

Contract terms

  • 10-20 hours/week, flexible scheduling
  • $75 - $150 hourly rate (commensurate with experience)
  • An initial 3-6 month fixed-term contract, commencing April 1, with the possibility of extension.
  • Independent contractor arrangement (W-9/1099)
  • Must be legally authorized to work in the United States without company sponsorship

Requirements

  • Hands-on experience with Spark/Scala pipelines and AWS (EC2, EMR, S3)
  • Familiarity with cultural heritage metadata standards (RDF, JSON-LD, Dublin Core, MODS, or similar) and DAMS (CONTENTdm, etc.)
  • Experience working across metadata quality, pipeline ops, and infrastructure
  • Familiarity with GitHub-based collaborative workflows
  • Self-directed
