Data Science Engineer

Columbia University
New York, United States of America
yesterday

Role details

Contract type
Temporary contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Junior
Compensation
$ 180K

Job location

Remote
New York, United States of America

Tech stack

Business Analytics Applications
Health Informatics
Clinical Data Repository
Computer Programming
Computer Networks
Relational Databases
Distributed Data Store
Machine Learning
Open Source Technology
Standard SQL
SQL Databases
Electronic Medical Records
Information Technology

Job description

The Department of Biomedical Informatics at Columbia University is seeking a highly motivated data science engineer to support large-scale observational research within the OHDSI (Observational Health Data Sciences and Informatics) network. This role will focus on the design, implementation, and execution of distributed network studies using electronic health record (EHR) and administrative claims data to generate real-world evidence.

The successful candidate will contribute to characterization, population-level estimation (causal inference), and patient-level prediction analyses across multi-institutional data networks. This position offers a unique opportunity to work at the intersection of biomedical informatics, data science, and clinical research within a leading academic medical center.

This is a full-time, two-year position with the possibility of an extension, contingent on available funding.

  • Design and implement observational network studies using distributed EHR and administrative claims data
  • Conduct large-scale characterization, comparative effectiveness and safety estimation, and patient-level prediction analyses
  • Develop reproducible analytic pipelines using R and SQL in relational database environments
  • Apply and evaluate methods from causal inference (e.g., confounding control, bias assessment, sensitivity analyses)
  • Apply machine learning approaches for predictive modeling using high-dimensional healthcare data
  • Work with standardized data representations, including the OMOP Common Data Model and standardized clinical vocabularies for conditions, drugs, procedures, and measurements
  • Collaborate with interdisciplinary teams including clinicians, statisticians, data engineers, and informaticians
  • Contribute to scholarly outputs including manuscripts, presentations, and open-source analytic tools
  • Support transparent, reproducible, and scalable research practices across distributed data networks

Requirements

Master's degree in biostatistics, public health, epidemiology, informatics, computer science, or a related field, and/or equivalent in education and experience, with at least 2 years of related experience.

  • At least 1 year of relevant prior work experience in the healthcare industry within a health system, a pharmaceutical company, or an insurer
  • Strong programming experience in R and SQL
  • Experience working with relational databases and large-scale healthcare datasets
  • Demonstrated interest in observational research using real-world clinical or claims data
  • Ability to design, implement, and document reproducible analytic workflows
  • Strong written and verbal communication skills

PhD degree in biostatistics, public health, epidemiology, informatics, computer science, or a related field, and/or equivalent in education and experience, with at least 1 year of related work experience.

  • Familiarity with the OMOP Common Data Model and standardized vocabularies (e.g., ICD, NDC, SNOMED, MedDRA, LOINC, CPT)
  • Knowledge of causal inference methods for observational studies
  • Experience with machine learning techniques for patient-level prediction
  • Prior experience working in distributed or federated data networks
  • Familiarity with open-source research ecosystems and collaborative scientific communities

About the company

This position is based at Columbia University's Department of Biomedical Informatics, a world leader in clinical research informatics and observational health data science. The role offers close collaboration with leading researchers, access to large-scale real-world data, and opportunities to contribute to impactful, open, and methodologically rigorous research that informs clinical and policy decision-making. This position is based in New York, NY and has the option to follow a hybrid schedule of 3 days per week working on site and 2 days per week working remotely.
