Machine Learning Engineer

University of California, San Francisco
San Francisco, United States of America

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

San Francisco, United States of America

Tech stack

API
Artificial Intelligence
Amazon Web Services (AWS)
Data analysis
Automation of Tests
Azure
Cloud Computing
Cloud Engineering
Configuration Management
Databases
Computer Engineering
Continuous Integration
Data Infrastructure
Data Integration
ETL
Data Structures
Data Visualization
Data Warehousing
Software Debugging
DevOps
Python
PostgreSQL
Machine Learning
Microsoft SQL Server
NumPy
SciPy
Software Engineering
SQL Databases
Tableau
Apex Code
Data Processing
Google Cloud Platform
Cloud Platform System
High Performance Computing
PyTorch
Large Language Models
Electronic Medical Records
Generative AI
Jupyter
Pandas
Scikit Learn
Integration Tests
Information Technology
Build Tools
Epic Clarity
Machine Learning Operations
Epic Caboodle
Software Version Control
Data Pipelines

Job description

The Machine Learning and Data Engineer will lead the development, implementation, and maintenance of data pipelines and infrastructure that support the deployment and continuous monitoring of Machine Learning (ML) and generative Artificial Intelligence (AI) tools within UCSF's APeX Enabled Research (AER) team. Most projects will be carried out in partnership with other UCSF technical teams and involve highly customized research solutions. Strong communication skills and inventive technical problem-solving are crucial.

The AER team provides a large array of services to the UCSF Research community, including project consultation, grant support, budget estimations, and project implementation and support. Project examples include:

  • Developing EHR-based interventions via clinical trials embedded within healthcare delivery systems to generate scientific evidence while delivering healthcare.
  • Enabling UCSF researchers with algorithms, digital tools, and/or clinical interventions that have strong evidence of feasibility and acceptability.
  • Developing technical approaches and budgets to implement these tools within the electronic medical record.
  • Supporting the development of scalable, low-cost infrastructure to enable ongoing research.

This role primarily involves managing and optimizing the data and monitoring pipelines of the Health IT Platform for Advanced Computing (HIPAC), a cloud infrastructure that supports the development and deployment of AI/ML tools, including large language models (LLMs) in the EHR. Specifically, the ML/data engineer will work on implementing new data integrations, enhancing HIPAC's ETL functionalities, productionizing AI/ML tools developed by UCSF data scientists/researchers, and designing and implementing metrics to continuously monitor AI/ML tools deployed at UCSF Health.

10% Yes

Applies advanced software concepts to plan, design, develop, modify, debug, deploy and evaluate highly complex software for functional areas. Analyzes existing highly complex software or works to formulate logic and devises algorithms for new highly complex software systems. Performs highly complex data analysis and tests / debugs highly complex software, working directly with management. Initiates, analyzes, designs and applies highly complex data sources. Applies and enforces complex programming security practices.

10% Yes

Specifies, develops and executes complex test plans. Develops conversion and system implementation plans. Performs or directs highly complex data modeling, performance and integration testing and builds interfaces. Determines source code control techniques and configuration management design and changes.

5% Yes

Prepares and approves or obtains approval for system and programming documentation. Initiates and oversees changes in development, maintenance and system standards. Sets the technical requirements for complex software specifications.

5% Yes

Understands and applies industry practices, community standards and department policies and procedures in depth. May serve as technical lead for multiple software development projects of moderate to broad scope. May lead a team of software development professionals. Enforces project plans.

25% Yes

Build and maintain data integrations (with SQL databases or APIs) and data processing and transformation pipelines to support the development and implementation of AI/ML tools.

20% Yes

Identify and build systems for implementation, monitoring, and maintenance of AI/ML tools. Collaborate with researchers and developers to productionize and maintain AI/ML tools in the Health System.

25% Yes

Collaborate with data scientists and researchers to design and implement highly complex metrics and processes to automatically monitor AI/ML tools for safety, potential bias or drift, performance, and validity.

Total: 100%

Requirements

Competitive applicants for this position are software, machine learning, or data engineers with 6+ years of experience in implementing and maintaining AI/ML pipelines. Proficiency in MLOps, Python, SQL, and CI/CD is required. This role also requires a deep understanding of Epic data models (Clarity and Caboodle). Successful candidates either have or are able to obtain Epic Clinical/Clarity data model certification shortly after onboarding.

  • Bachelor's degree in Computer Science, Computer Engineering, or related area and/or equivalent experience/training.
  • 6 years of experience in positions of increasing responsibility designing, implementing, and maintaining complex AI/ML applications.
  • Advanced experience with SQL (e.g., SQL Server, PostgreSQL)
  • Advanced experience in database systems, data warehousing solutions, and understanding of ETL pipelines
  • Advanced experience in designing, building, or maintaining data infrastructure for efficient ML model training and inference.
  • Experience with data analysis and machine learning tools such as Jupyter, Pandas, scikit-learn, NumPy/SciPy, PyTorch, etc.
  • Demonstrated advanced knowledge of full software development lifecycle
  • Advanced experience with Python; ability to write clean, efficient, and production-level Python code
  • Demonstrated experience deploying, monitoring, and maintaining AI/ML models and pipelines
  • Demonstrated experience working with MLOps, DevOps, and CI/CD pipeline toolsets
  • Experience in developing complex automated tests
  • Experience with cloud-based architecture on platforms such as AWS, GCP, or Azure
  • Demonstrated effective communication and interpersonal skills
  • Demonstrated ability to communicate technical information to technical and non-technical personnel at various levels in the organization
  • Self-motivated and works independently and as part of a team. Able to learn effectively and meet deadlines
  • Demonstrated broad problem-solving skills
  • Demonstrated ability to interface with management on a regular basis
  • Excellent project leadership and management skills.

  • Master's degree or PhD in Computer Science, Computer Engineering, or related area and/or equivalent experience/training.
  • Epic Clarity Certification
  • Cloud Development certifications
  • Experience with large language models and other generative AI technologies, especially supporting the deployment of GenAI-based tools in a production environment
  • Familiar with data visualization tools (e.g., Tableau)
  • Experience with Epic data structures

About the company

The University of California, San Francisco (UCSF) Department of Information Technology Academic Research Systems (ARS) group is chartered to provide data services and infrastructure that support the UCSF Research Community's computing and analytic requirements through centralized informatics services in the areas of Data, Tools, Secure Compute Environments, and Consulting Services.
