ETL Data Engineer (Healthcare)

Ritwik Infotech
Atlanta, United States of America
7 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Atlanta, United States of America

Tech stack

API
Airflow
Amazon Web Services (AWS)
ASC X12 Standards
Apache HTTP Server
Big Data
Computer Programming
Databases
Continuous Integration
Directed Acyclic Graphs (DAGs)
Data Architecture
Data Cleansing
Information Engineering
Data Governance
ETL
Data Warehousing
Amazon DynamoDB
UN Electronic Data Interchange for Administration, Commerce and Transport (EDIFACT)
GitHub
Identity and Access Management
JSON
Python
PostgreSQL
Performance Tuning
Query Optimization
DataOps
SQL Stored Procedures
SQL Databases
XML
Data Processing
Fast Healthcare Interoperability Resources
Delivery Pipeline
Indexer
Data Lake
PySpark
Data Lineage
CloudWatch
Data Pipelines
Redshift

Job description

We are looking for a Senior Data ETL Engineer with strong expertise in AWS-based data engineering and healthcare claims processing. This role involves designing and managing large-scale, HIPAA-compliant data pipelines handling millions to hundreds of millions of claims records.

The candidate will act as a technical leader, working closely with analytics, clinical, compliance, and product teams.

Claims Data Processing

  • Process and validate EDI 837 transactions at scale
  • Handle complete claims lifecycle workflows
  • Work with multi-source healthcare data ingestion

ETL & Data Architecture

  • Build scalable AWS Glue pipelines
  • Design Iceberg-based data lakes
  • Optimize Redshift data warehouse performance

Data Engineering

  • Design and manage DynamoDB & PostgreSQL systems
  • Optimize queries for large-scale datasets

Orchestration & Automation

  • Build and maintain Airflow DAGs
  • Implement CI/CD pipelines and automation

Data Quality & Governance

  • Ensure data accuracy, lineage, and auditability
  • Maintain compliance with healthcare regulations

Requirements

  • Strong experience with ANSI X12 EDI transactions: 837P, 837I, 837D

  • Knowledge of the full claims lifecycle: 835 (ERA), 270/271 (Eligibility), 276/277 (Claim Status)

  • Experience with ICD-10, CPT, HCPCS, NPI, and Revenue Codes

Understanding of HIPAA 5010 compliance

Experience handling large-scale claims data (millions+)

AWS Cloud Services

  • AWS Glue (PySpark ETL pipelines)

  • Amazon Redshift (data warehousing & performance tuning)
  • Amazon Athena
  • Amazon S3 & Lake Formation

Experience with:

  • Apache Iceberg (schema evolution, partitioning, time travel)
  • Amazon Kinesis (streaming ingestion)
  • AWS Step Functions / Lambda

Programming & ETL

  • Strong in Python / PySpark

  • Experience building ETL/ELT pipelines at scale

  • Handling multi-format data: EDI, JSON, CSV, XML, APIs, HL7 FHIR


Databases & SQL

  • Expert-level SQL: joins, CTEs, window functions, query optimization

Hands-on experience with:

  • Amazon DynamoDB (GSI/LSI, single-table design)
  • PostgreSQL (partitioning, indexing, stored procedures)

Orchestration & DataOps

  • Apache Airflow (MWAA) DAG development

  • dbt transformations, testing, modeling

  • CI/CD tools: GitHub Actions / AWS CodePipeline

  • Data quality tools (Great Expectations / AWS Deequ)

  • Data lineage & monitoring (CloudWatch, SNS)

Strong knowledge of:

  • HIPAA / HITECH compliance
  • Encryption (KMS), IAM access control
