Data Engineer

Gravity Hair Salon, LLC

Columbus, United States of America

3 months ago

Role details

Contract type

Temporary to permanent

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

$150K

Job location

Columbus, United States of America

Tech stack

Airflow

Azure

Google BigQuery

Clinical Data Repository

Cloud Computing

Data Transformation

Data Systems

Data Warehousing

Python

SQL Databases

Feature Engineering

Fast Healthcare Interoperability Resources

Large Language Models

Snowflake

Health Level Seven International

Redshift

Databricks

Job description

Design, build, and maintain scalable data pipelines for large, complex clinical datasets (EHR, pathology, genomics, etc.)
Implement and manage data transformations and analytics workflows using Databricks (Spark, Delta Lake)
Ingest, standardize, and harmonize healthcare data into the OMOP Common Data Model
Partner with clinical, analytics, and ML teams to ensure data is reliable, well-documented, and fit for downstream use
Lead data quality, validation, and observability efforts for clinical data pipelines
Develop data models and schemas that support analytics, research, and ML use cases
Optimize performance, cost, and reliability across the data platform
Contribute to best practices around data governance, versioning, lineage, and reproducibility
Take data analysis requirements from commercial customers and map them to clinical variables in OMOP, Epic, or other data models

Requirements

We are seeking a Senior Data Engineer with deep experience working with clinical and real-world healthcare data. This role will focus on building and scaling data pipelines that support analytics, research, and downstream machine learning use cases. The ideal candidate has hands-on experience with OMOP, Databricks, and modern data stacks, and understands the real-world challenges of clinical data harmonization across disparate sources.

5+ years of experience as a Data Engineer, with significant experience in healthcare or life sciences
Strong hands-on experience with Databricks (Spark SQL, PySpark, Delta Lake)
Deep understanding of OMOP CDM, including:

Standard vocabularies (SNOMED, LOINC, RxNorm, ICD, CPT)
ETL patterns for clinical data mapping and normalization

Experience with clinical data harmonization, including:

Mapping heterogeneous source systems into a common schema
Managing missing, inconsistent, or conflicting clinical data
Understanding clinical workflows and data provenance

Strong cloud experience, preferably in Azure, including Data Factory and other data-related tooling
Proficiency in Python and SQL
Experience with modern data stacks, including:

Cloud data warehouses or lakehouses (Databricks, Snowflake, BigQuery, Redshift)
Orchestration tools (Airflow, Dagster, Prefect)
Data transformation frameworks (dbt or equivalent)

Strong data modeling and analytics engineering skills

Preferred / Nice-to-Have

Experience working with real-world evidence (RWE), clinical research, or regulatory-facing datasets
Familiarity with ML or feature engineering pipelines built on clinical data
Experience supporting downstream LLM, NLP, or ML workloads using healthcare data
Knowledge of healthcare data standards beyond OMOP (FHIR, HL7)
Experience operating data systems in HIPAA-compliant environments
