Data Engineer
Job description
We are looking for a Data Engineer to build the data infrastructure powering analytics, business intelligence, operational monitoring, and AI-driven decision-making across a healthcare automation platform. You will design and maintain data pipelines, integrate fragmented healthcare and operational data sources, model complex workflows into clean data structures, and create the analytics foundation that helps the company measure, monitor, and optimize its systems.

This role is ideal for someone who enjoys owning the full data stack: ingestion, transformation, modeling, analytics, alerting, and infrastructure decisions. Over time, you will have the opportunity to take end-to-end ownership of the data platform and shape the technical direction of the company's data architecture.

What You'll Do
- Design, build, and maintain reliable data pipelines across internal systems, third-party APIs, and healthcare data sources.
- Integrate complex data sources such as operational systems, healthcare records, billing workflows, claims-related data, and external feeds.
- Build clean, scalable data models for revenue cycle workflows, intake operations, clinical processes, and business reporting.
- Develop analytics-ready schemas that support dashboards, reporting, monitoring, and operational decision-making.
- Build dashboards, alerts, and analytical tools to identify automation opportunities and operational bottlenecks.
- Support product, engineering, operations, and leadership teams with high-quality data infrastructure.
- Contribute to the technical direction of the business intelligence and data platform.
- Improve data quality, reliability, observability, and performance across the stack.
- Help create the foundation for AI/ML-driven workflows, analytics, and automation.

You will build foundational data infrastructure rather than simply maintain legacy systems. Your work will directly shape how an AI healthcare platform measures and improves operations. You will work with complex, messy, high-value healthcare data. You will partner closely with product, engineering, and operations rather than sit in a silo. You will have the opportunity to define the long-term data architecture for a fast-scaling platform. You will help create analytics and automation systems that reduce administrative waste in healthcare.

Work Model
- Full-time role.
- Hybrid in New York, with regular in-office collaboration expected.
- Visa sponsorship is not available for this position.
Requirements
- 3+ years of experience building and maintaining production data pipelines, warehouses, and analytics infrastructure.
- Strong SQL skills.
- Experience with at least one modern data warehouse or lakehouse such as Databricks, Redshift, Snowflake, BigQuery, or similar.
- Experience with relational databases such as Postgres or MySQL.
- Experience building ETL or ELT pipelines using tools such as Airflow, dbt, Dagster, or similar orchestration frameworks.
- Comfort working across the full data lifecycle, from raw ingestion through analytics-ready models.
- Ability to model complex operational domains into clean, queryable schemas.
- Comfort with exploratory analysis, statistical thinking, and analytical problem-solving.
- Experience using Python or R for data analysis.
- Strong communication skills and ability to explain technical data concepts to non-technical stakeholders.
- Ability to partner effectively with product, engineering, and operational teams.
- High ownership, detail orientation, and comfort operating in a fast-paced startup environment.

Nice to Have
- Experience in healthcare technology or healthcare operations.
- Familiarity with healthcare data standards or data types such as EDI, HL7, FHIR, claims data, clinical documentation, or provider workflows.
- Experience with real-time or streaming data pipelines.
- Background in data science, machine learning, or feature engineering for AI/ML systems.
- Exposure to BI tools such as Hex, Sigma, or similar.
- Experience working in a high-growth startup or scale-up environment.
- Familiarity with cloud infrastructure and tooling such as AWS, Terraform, or similar.