Senior Data Engineer

Procore
West, United States of America
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$ 194K

Job location

West, United States of America

Tech stack

Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Apache HTTP Server
Code Review
Continuous Integration
Data as a Service
Data Architecture
Information Engineering
ETL
Python
Machine Learning
Project Management Software
Performance Tuning
Software Engineering
Systems Integration
Flask
Large Language Models
Spark
GitLab
FastAPI
Data Lake
Information Technology
REST
Databricks

Job description

We are looking for a Senior Data Engineer to join the Procore Data team. In this role, you will be responsible for building the data architecture that connects Procore's global ecosystem. You will work across diverse domains to create a unified, high-fidelity view of our customers, projects, and users.

This is a "Data Engineering first" role that leverages AI and Machine Learning to solve complex Entity Resolution challenges. You will use the modern data stack to transform fragmented data from across the enterprise into a cohesive, intelligent data foundation that powers our global strategy.

  • Develop and maintain scalable ETL pipelines using Apache Spark. You will implement partitioning strategies and apply performance tuning techniques while utilizing modern open table formats like Delta Lake and Apache Iceberg to manage data consistency.
  • As a Senior Data Engineer, you will drive engineering excellence through code reviews, mentorship, and the implementation of CI/CD best practices for data.
  • Develop and deploy AI/ML models and probabilistic matching logic to link and deduplicate entities across disparate business domains.
  • Design canonical data models that provide a 360-degree view of the enterprise, ensuring that a "Customer" in Sales matches the "Customer" in our Product and Marketing engines.
  • Implement AI-driven workflows to automatically clean, normalize, and enrich enterprise records, ensuring that our customers are working with the most accurate information possible.
  • Architect complex, modular data transformations, ensuring that the "logic layer" of our data stack is robust, testable, and highly performant.
  • Manage sophisticated, multi-stage workflows in Airflow, integrating Python-based scripts directly into the data lifecycle.

Requirements

  • Bachelor's degree in Computer Science or a similar technical field of study
  • 4+ years of technical experience in a Data or Software Engineering role.
  • Ability to write complex analytical queries and production-grade Python code.
  • Strong experience with Databricks, Airflow, Spark, AWS, and GitLab.
  • Experience developing lightweight data services using Python frameworks (e.g., FastAPI, Flask) and integrating with external REST APIs. You understand how to handle authentication, rate limiting, and robust error handling.
  • Practical experience using AI techniques (e.g., record linkage, fuzzy matching, or LLM-based classification) to solve data quality and identity problems.

Benefits & conditions

140,960.00 - 193,820.00 USD Annual

For Los Angeles County (unincorporated) Candidates:

Procore will consider for employment all qualified applicants, including those with arrest or conviction records, in accordance with the requirements of applicable federal, state, and local laws, including the City of Los Angeles' Fair Chance Initiative for Hiring Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act.
