Part-Time Data Ingest Engineer (contractor)

Digital Public Library of America, Inc.

Boston, United States of America

1 month ago

Role details

Contract type

Contract

Employment type

Part-time (≤ 32 hours)

Working hours

Shift work

Languages

English

Compensation

$75 - $150 per hour

Job location

Boston, United States of America

Tech stack

API

Amazon Web Services (AWS)

Apache HTTP Server

Continuous Integration

Elasticsearch

GitHub

JSON

PostgreSQL

Metadata

Metadata Standards

Scala

Scripting (Bash/Python/Go/Ruby)

Delivery Pipeline

Spark

Amazon EMR

Backend

Avro

REST

Terraform

Docker

Job description

DPLA is looking for a part-time contractor to coordinate and maintain metadata ingest operations. This position is directly involved in maintaining DPLA's ingestion process: harvesting, mapping, enriching, and indexing metadata from contributing partners.

What you'll be doing

Running monthly ingest cycles across active partner contributions (harvesting, mapping, enrichment, indexing)
Coordinating with DPLA staff on metadata mapping and delivery
Monitoring pipeline reliability and addressing bottlenecks or single points of failure
Troubleshooting ingestion errors and coordinating resolution with DPLA staff
Supporting deployments and maintaining CI/CD pipeline health
Providing regular status updates to DPLA staff

Technical environment

Pipeline: Scala, Apache Spark, Amazon EC2 and EMR, AWS S3, Apache Avro, Python scripts
Metadata: JSON-LD via DPLA MAP
APIs: Scala-based RESTful API on Elasticsearch 7, PostgreSQL auth backend
CI/CD: GitHub Actions, Docker, Terraform, AWS CodePipeline

10-20 hours/week, flexible scheduling
$75 - $150 hourly rate (commensurate with experience)
An initial 3-6 month fixed-term contract, commencing April 1, with the possibility of extension.
Independent contractor arrangement (W-9/1099)
Must be legally authorized to work in the United States without company sponsorship

Requirements

Hands-on experience with Spark/Scala pipelines and AWS (EC2, EMR, S3)
Familiarity with cultural heritage metadata standards (RDF, JSON-LD, Dublin Core, MODS, or similar) and digital asset management systems (DAMS) such as CONTENTdm
Experience working across metadata quality, pipeline ops, and infrastructure
Familiarity with GitHub-based collaborative workflows
Self-directed
