Staff Engineer, AI and Data Science
Job description
You will design, implement, and operationalize AI and data science (DS) models for upstream (cell-culture/bioreactor) and downstream (purification) operations, Formulation Development, and multiple Analytics teams, while partnering closely with process-development, manufacturing-sciences, and digital teams. You will turn data into prescriptive guidance, deploy production-grade models, and build innovative AI solutions that enhance process understanding, optimization, and automation.
A Typical Day in the Role of Staff Engineer Might Look Like:
- Build and deploy AI/ML-powered solutions to accelerate our digitalization journey.
- Advance PAPD's broader AI, DS and related digital-maturity initiatives.
- Collaborate with process engineers, citizen data scientists, IT, and manufacturing colleagues to coordinate AI and advanced modeling efforts enterprise-wide.
- Explore, prototype, and implement GenAI approaches and solutions (e.g., Retrieval-Augmented Generation) to enhance knowledge management and decision support.
- Develop, validate, and maintain mechanistic, hybrid, and data-driven models for cell culture, purification, formulation, and other processes.
- Translate complex bioprocess questions into quantitative modeling strategies that inform scale-up, tech transfer, and continuous improvement.
- Mentor citizen data scientists and champion best practices in model development, method selection, and code quality.
Requirements
- Analytical rigor and creative problem solving
- Ability to drive projects autonomously while thriving in cross-functional teams
- Excellent written and verbal communication
- Passion for innovation and continuous learning
This role requires a Ph.D. in Chemical/Biochemical Engineering, Biotechnology, Applied Mathematics, Computer Science, or a related field with 2+ years of industrial experience, or a Master's with 5+ years. Mechanistic understanding of upstream and/or downstream bioprocess unit operations, scale-up/down principles, and critical quality attributes is required. Demonstrated success modeling bioprocesses via first-principles, hybrid, or data-driven (ML) methods is preferred.
A strong foundation in AI/ML algorithms (regression, classification, Bayesian methods, deep learning, time-series, probabilistic modeling) is a plus, along with expertise in multivariate statistics for process modeling, real-time monitoring, and control. Expert programming proficiency in Python and SQL and experience with statistical/computational tools such as JMP, SIMCA, and MATLAB are helpful. Proven ability to communicate technical concepts to multidisciplinary stakeholders is a must. Experience with GenAI stacks (LLMs, vector databases, RAG pipelines) and multimodal techniques is necessary.
Preferred Qualifications
- Hands-on experience with cloud analytics platforms (e.g., Dataiku, Databricks).
- Strong working knowledge of Quality-by-Design (QbD) principles and statistically rigorous Design-of-Experiments (DoE) for defining design space, optimizing critical process parameters, and informing robust control strategies.
- Familiarity with PAT and chemometric modeling (e.g., Raman spectroscopy) for bioprocess monitoring and control.
- Understanding of operations research techniques such as combinatorial optimization, linear programming, and mixed-integer programming.
- Strong publication record in bioprocess modeling or AI for biomanufacturing.