Principal Data Engineer - AI

Anaplan

Philadelphia, United States of America

8 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Philadelphia, United States of America

Tech stack

Artificial Intelligence

Airflow

Amazon Web Services (AWS)

Apache HTTP Server

Azure

Big Data

Google BigQuery

Cloud Database

Code Review

Databases

Continuous Integration

Data Architecture

Information Engineering

Data Governance

Data Infrastructure

Data Integrity

ETL

Data Transformation

Data Systems

Data Warehousing

Distributed File Systems

Digital Assets

Distributed Computing Environment

Distributed Systems

Hadoop

Python

Message Broker

NoSQL

DataOps

Anaplan

Software Engineering

Data Streaming

Web Services

Software Organization

Cloud Platform System

Data Ingestion

SQL Optimization

Snowflake

Spark

Build Management

Data Lake

Infrastructure Automation Frameworks

Low Latency

Apache Flink

Kafka

Data Pipelines

Redshift

Databricks

Job description

We're seeking a Principal Data Engineer who can work across the full stack of Anaplan's data platform, setting the technical direction for how we ingest, transform, store, serve, and govern data at scale. You will build highly performant, robust data pipelines that process massive volumes of data in real-time and batch. This foundational work empowers business users to leverage vast datasets in their planning workflows and forms the bedrock for our advanced analytics and AI initiatives. You'll need deep knowledge of distributed computing, data architecture, and strong software engineering skills to tackle complex, high-scale data challenges.

This role is open to candidates located in the Eastern or Central time zones. Employees who live within commuting distance of one of our offices will be expected to work onsite two days per week as part of our hybrid work model.

Your Impact

Lead the data architecture, design, and deployment of scalable, high-throughput Big Data systems into production environments.
Architect, deploy, and manage the foundational data systems that underlie modern AI infrastructure, including vector, NoSQL, and document databases.
Develop end-to-end data engineering solutions, including robust ETL/ELT pipelines, API services, and data ingestion frameworks.
Design and build the storage and processing layers powering our analytics workloads: data lakes, data warehouses, distributed file systems, and real-time streaming architectures.
Engineer feature-rich context pipelines that process large-scale enterprise data, balancing batch and streaming patterns seamlessly.
Optimize and scale large distributed queries and data transformations to ensure high performance and low latency for end users.
Implement data quality frameworks to measure and ensure data integrity, reliability, and governance across all data assets.
Collaborate with analytics, product, and platform teams to build data models that capture the semantics of customer metrics, hierarchies, and relationships.
Stay current with the modern data stack and big data landscape, evaluating new tools, distributed computing frameworks, and database technologies for potential adoption.

Requirements

Extensive data engineering experience, demonstrating a strong track record of hands-on execution and delivery in complex data environments.
Deep practical understanding of the database ecosystems that power AI and machine learning infrastructure (e.g., Vector databases, NoSQL, and Document stores).
Hands-on experience building, scaling, and shipping large-scale data platforms in production.
Deep practical experience with distributed data processing frameworks (e.g., Apache Spark, Flink, Hadoop).
Strong expertise in message brokers and event streaming platforms (e.g., Apache Kafka, Kinesis).
End-to-end exposure to data pipeline lifecycle development, including extensive experience with workflow orchestration tools (e.g., Apache Airflow, Dagster).
Hands-on expertise with cloud data warehouses (e.g., Snowflake, BigQuery, Redshift) and data lake architectures (e.g., Databricks, Delta Lake, Apache Iceberg).
Advanced SQL skills and proficiency in Python.
Strong background in modern software development practices (testing, code review, CI/CD, Infrastructure as Code).

Desirable

Extensive, progressive experience leading technical projects and mentoring engineering teams.
Hands-on experience with cloud-native infrastructure (AWS, GCP, or Azure).
Experience implementing data observability, monitoring, and alerting frameworks at scale.
Familiarity with Anaplan or similar enterprise planning platforms.

About the company

At Anaplan, we are a team of innovators focused on optimizing business decision-making through our leading AI-infused scenario planning and analysis platform so our customers can outpace their competition and the market. What unites Anaplanners across teams and geographies is our collective commitment to our customers' success and to our Winning Culture. Our customers rank among the who's who in the Fortune 50. Coca-Cola, LinkedIn, Adobe, LVMH and Bayer are just a few of the 2,400+ global companies who rely on our best-in-class platform. Our Winning Culture is the engine that drives our teams of innovators. We champion diversity of thought and ideas, we behave like leaders regardless of title, we are committed to achieving ambitious goals, and we love celebrating our wins - big and small. Supported by operating principles of being strategy-led, values-based and disciplined in execution, you'll be inspired, connected, developed and rewarded here. Everything that makes you unique is welcome; join us and let's build what's next - together!
