Senior Data Engineer

Insight Global
Upper Providence Township, United States of America
31 days ago

Role details

Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior

Job location

Upper Providence Township, United States of America

Tech stack

Artificial Intelligence
Big Data
Google BigQuery
Databases
Data Architecture
Information Engineering
Data Transformation
Data Structures
Data Systems
Python
PostgreSQL
Onyx
Data Management
Machine Learning Operations
Data Pipelines

Job description

The Senior Data Engineer will design, build, and deliver a new enterprise data product supporting the client's generative drug design and computational chemistry platforms. This role focuses on creating scalable, well-structured data architecture from the ground up, with long-term expansion and downstream AI/ML integration in mind. The ideal candidate combines strong data engineering expertise with an understanding of drug design, chemistry, and scientific data workflows.

-Design and implement a new enterprise data product, initially scoped as a standalone deliverable with future integration into broader AI-driven drug discovery platforms.

-Build scalable data pipelines, schemas, and storage models capable of supporting large, complex scientific and chemistry-derived datasets.

-Develop data solutions primarily on GCP / BigQuery, adhering to enterprise data engineering templates and standards.

-Implement data transformations and pipelines using Python, with a focus on data quality, traceability, and performance.

-Ensure the data architecture supports future expansion, additional datasets, and evolving analytical and computational needs.

-Collaborate closely with computational chemists, data scientists, and ML engineers to ensure data models align with generative design, molecular representations, and ML outputs.

-Apply an understanding of drug design and chemistry concepts (e.g., molecular properties, structure-activity data, experimental outputs) to inform data modeling and integration decisions.

-Provide technical guidance on data structure, scalability, and long-term maintainability in an enterprise environment.

Requirements

-Strong experience in data engineering, including database, schema, and data product design.

-Hands-on experience with GCP and BigQuery (Postgres familiarity a plus).

-Proficiency in Python for building and maintaining data pipelines.

-Onyx background

-Experience working with large, complex datasets at scale, ideally in scientific or R&D contexts.

-Experience supporting downstream analytics, ML pipelines, or AI-driven platforms, particularly in R&D or discovery environments.

-Background in life sciences, pharma, or scientific data platforms.

-Working knowledge or hands-on exposure to drug design, chemistry, or computational chemistry data.
