Data Science Engineer

Columbia University

New York, United States of America

yesterday

Role details

Contract type

Temporary contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Junior

Compensation

$180K

Job location

Remote

New York, United States of America

Tech stack

Business Analytics Applications

Health Informatics

Clinical Data Repository

Computer Programming

Computer Networks

Relational Databases

Distributed Data Store

Machine Learning

Open Source Technology

Standard SQL

SQL Databases

Electronic Medical Records

Information Technology

Job description

The Department of Biomedical Informatics at Columbia University is seeking a highly motivated data science engineer to support large-scale observational research within the OHDSI (Observational Health Data Sciences and Informatics) network. This role will focus on the design, implementation, and execution of distributed network studies using electronic health record (EHR) and administrative claims data to generate real-world evidence.

The successful candidate will contribute to characterization, population-level estimation (causal inference), and patient-level prediction analyses across multi-institutional data networks. This position offers a unique opportunity to work at the intersection of biomedical informatics, data science, and clinical research within a leading academic medical center.

This is a full-time, two-year position with the possibility of an extension, contingent on available funding.

Design and implement observational network studies using distributed EHR and administrative claims data
Conduct large-scale characterization, comparative effectiveness and safety estimation, and patient-level prediction analyses
Develop reproducible analytic pipelines using R and SQL in relational database environments
Apply and evaluate methods from causal inference (e.g., confounding control, bias assessment, sensitivity analyses)
Apply machine learning approaches for predictive modeling using high-dimensional healthcare data
Work with standardized data representations, including the OMOP Common Data Model and standardized clinical vocabularies for conditions, drugs, procedures, and measurements
Collaborate with interdisciplinary teams including clinicians, statisticians, data engineers, and informaticians
Contribute to scholarly outputs including manuscripts, presentations, and open-source analytic tools
Support transparent, reproducible, and scalable research practices across distributed data networks

Requirements

Master's degree in biostatistics, public health, epidemiology, informatics, computer science, or a related field, and/or equivalent in education and experience, with at least 2 years' related experience.

At least 1 year of relevant prior work experience in the healthcare industry within a health system, a pharmaceutical company, or an insurer
Strong programming experience in R and SQL
Experience working with relational databases and large-scale healthcare datasets
Demonstrated interest in observational research using real-world clinical or claims data
Ability to design, implement, and document reproducible analytic workflows
Strong written and verbal communication skills

PhD degree in biostatistics, public health, epidemiology, informatics, computer science, or a related field, and/or equivalent in education and experience, with at least 1 year of related work experience.
Familiarity with the OMOP Common Data Model and standardized vocabularies (e.g., ICD, NDC, SNOMED, MedDRA, LOINC, CPT)
Knowledge of causal inference methods for observational studies
Experience with machine learning techniques for patient-level prediction
Prior experience working in distributed or federated data networks
Familiarity with open-source research ecosystems and collaborative scientific communities

About the company

This position is based at Columbia University's Department of Biomedical Informatics, a world leader in clinical research informatics and observational health data science. The role offers close collaboration with leading researchers, access to large-scale real-world data, and opportunities to contribute to impactful, open, and methodologically rigorous research that informs clinical and policy decision-making. This position is based in New York, NY and has the option to follow a hybrid schedule of 3 days per week working on site and 2 days per week working remotely.
