Principal Engineer, Data Infrastructure

New York Times

Hope, United States of America

7 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Shift work

Languages

English

Experience level

Senior

Compensation

$220K

Job location

Remote

Hope, United States of America

Tech stack

Java

API

Artificial Intelligence

Airflow

Amazon Web Services (AWS)

Apache HTTP Server

Audit Trail

Big Data

Cloud Computing

Computer Programming

Data Infrastructure

ETL

Data Stores

Distributed Systems

Amazon DynamoDB

Python

Machine Learning

Azure

Software Engineering

Workflow Management Systems

Feature Engineering

Data Ingestion

Large Language Models

Spark

Reliability of Systems

Data Lake

Kubernetes

Apache Flink

Google BigQuery

Integration Frameworks

Kafka

Build Tools

Data Management

Machine Learning Operations

Stream Processing

Confluent

Job description

We are seeking a Principal Software Engineer to lead the architecture and evolution of our data and machine learning infrastructure. This role will shape the foundation on which data-driven products, analytics, and AI applications are built. You will design systems that enable large-scale data processing, reliable pipelines, and efficient machine learning development, from feature engineering to real-time model serving.

As a principal engineer, you will partner with product, data science, and platform teams to set technical direction, drive adoption of reusable frameworks, and mentor engineers across the organization. You will ensure that both data and ML platforms are scalable, reliable, cost-efficient, and compliant with privacy and governance standards.

The core of the Data Platform is a data lake on AWS S3 with Apache Iceberg as the table format to ensure reliability. Data ingestion is standardized through Confluent Kafka for real-time streaming and Fivetran for ingestion of files and change data. The transformation layer is decoupled from storage, using Apache Flink for stream processing, AWS Glue (Spark) for core ETL, and dbt/Athena for building analytical data models. The platform serves data through fit-for-purpose data stores, including Amazon DynamoDB for low-latency applications and Google BigQuery as the primary engine for analytics and BI.

You will report to the Sr. Director of Engineering. This role can be remote in the US, with a preference for candidates in the New York City area.

Responsibilities:

Architect & Build Platform: Design and evolve infrastructure for data ingestion, storage, batch and streaming pipelines, and machine learning workflows
Enable ML at Scale: Build systems for training, deploying, monitoring, and governing models, including feature stores, registries, and inference platforms
Reliability & Observability: Ensure end-to-end system reliability, monitoring, and cost transparency across data and ML workloads
Self-Service Platforms: Deliver frameworks and APIs that enable engineers, analysts, and ML scientists to build and operate solutions independently
Innovation & Standards: Evaluate and introduce emerging technologies (vector databases, distributed training, orchestration frameworks, LLM stacks) and establish adoption guidelines
Cross-Functional Leadership: Partner with platform, product, engineering, and ML science leaders to align on strategy and accelerate delivery
Mentorship & Influence: Guide senior and staff engineers, lead architecture reviews, and raise the technical bar across data and ML domains
Demonstrate support and understanding of our value of journalistic independence (https://www.nytco.com/company/mission-and-values/) and a strong commitment to our mission to seek the truth and help people understand the world

Requirements:

10+ years of software engineering experience with a focus on distributed systems, data platforms, and ML infrastructure or equivalent
Proven ability to influence technical direction across multiple teams and mentor senior/staff engineers
Proven expertise in data processing frameworks and table formats (e.g. Spark, Flink, Iceberg) and orchestration tools (e.g. Airflow, Kubeflow)
Deep knowledge of ML infrastructure: model training pipelines, feature stores, registries, serving, and monitoring
Strong programming skills in Python and at least one compiled language like Java or Go
Experience designing systems with scalability, reliability, and cost-efficiency as first-class concerns
Cloud platform experience (AWS, GCP) and familiarity with Kubernetes and modern data platform architectures

Preferred Qualifications:

Familiarity with compliance and governance in data/ML systems (auditability, privacy, explainability)
Familiarity with the data lakehouse paradigm and medallion architecture

This role requires limited on-call hours. An on-call schedule will be determined when you join, taking into account team size and other variables.

Benefits & conditions

$198,000 - $220,000 USD

For roles in the U.S., dependent on your role, you may be eligible for variable pay, such as an annual bonus and restricted stock. Benefits may include medical, dental and vision benefits, Flexible Spending Accounts (F.S.A.s), a company-matching 401(k) plan, paid vacation, paid sick days, paid parental leave, tuition reimbursement and professional development programs.

For roles outside of the U.S., information on benefits will be provided during the interview process.

The New York Times Company is committed to being the world's best source of independent, reliable and quality journalism. To do so, we embrace a diverse workforce that has a broad range of backgrounds and experiences across our ranks, at all levels of the organization. We encourage people from all backgrounds to apply.

About the company

The mission (https://www.nytco.com/company/mission-and-values/) of The New York Times is to seek the truth and help people understand the world. That means independent journalism is at the heart of all we do as a company. It's why we have a world-renowned newsroom that sends journalists to report on the ground from nearly 160 countries. It's why we focus deeply on how our readers will experience our journalism, from print to audio to a world-class digital and app destination. And it's why our business strategy centers on making journalism so good that it's worth paying for.