Data Scientist I

University of Florida
Gainesville, United States of America
27 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Junior
Compensation
$75K

Job location

Gainesville, United States of America

Tech stack

Artificial Intelligence
Data analysis
Computing Platforms
Bioinformatics
Data Infrastructure
Data Integrity
Data Security
Python
High Performance Computing
Data Ingestion
Parallel Computation
Containerization
Dask
Data Management
Machine Learning Operations
Data Pipelines

Job description

  1. Design, implement, and maintain scalable data ingestion and processing pipelines supporting the NeuroEnclave within UF's HIPAA-aligned computing environment.

  2. Develop and maintain data validation, profiling, and quality control workflows to ensure data integrity, provenance, and reproducibility across datasets.

  3. Engineer and optimize high-performance data workflows for large-scale biomedical datasets using Python-based tools and parallel computing frameworks.

  4. Standardize and harmonize heterogeneous data formats to support integrated analytics, AI/ML workflows, and cross-dataset interoperability.

  5. Implement technical controls supporting IRB-, HIPAA-, and NIH-compliant data access, including containerized environments, access controls, and audit-ready workflows.

Requirements

A Bachelor's Degree in data science, statistics, bioinformatics, analytics, or similar field and two years of experience; or a Master's Degree in data science, statistics, bioinformatics, analytics, or similar field.

PREFERRED:

  1. Experience working with clinical or biomedical research data.

  2. Familiarity with high-performance computing (HPC) or secure research computing environments.

  3. Experience with parallel computing frameworks (e.g., Dask or similar).

  4. Knowledge of data security, privacy, and compliance considerations (HIPAA, IRB, NIH Data Management & Sharing requirements).

  5. Experience supporting data infrastructure for AI/ML or advanced analytics.

  6. Prior experience in a research or academic data environment.
