NLP/ML Engineer (Canada/UAE)

Insilico Medicine

9 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Intermediate

Job location

Remote

Tech stack

Artificial Intelligence

Information Extraction

Python

Machine Learning

Natural Language Processing

Reinforcement Learning

Large Language Models

Deep Learning

Information Technology

Document Classification

GPT

Job description

Insilico Medicine is looking for a Machine Learning Engineer specializing in Natural Language Processing (NLP) tasks within the biomedical and materials science domains. The role focuses on text classification; information extraction from abstracts, patents, and clinical trials; multi-task learning; knowledge graph construction; and fine-tuning large language models (LLMs) for chemical and biomedical applications.

Fine-tune and optimize Large Language Models on domain-specific or custom datasets.
Analyze errors, identify system limitations, and propose enhancements.
Search and review state-of-the-art solutions and new datasets for NLP tasks.
Design scalable and maintainable engineering solutions inspired by the latest research and innovation.
Translate academic innovation into scalable, maintainable engineering solutions.
Build and curate datasets using annotation tools, distant supervision, and expert annotations.
Collaborate closely with clients and internal stakeholders to align research-driven initiatives with business needs.

Requirements

Master's degree or PhD in Computer Science, Machine Learning, or a related field.
3+ years of hands-on experience in NLP, Machine Learning, and Deep Learning.
Strong understanding of Machine Learning, Deep Learning and AI.
Strong proficiency in Python programming.
Motivation to learn new things and apply creative solutions.
Hands-on experience in scaling and optimizing large language model (LLM) training and fine-tuning, including multi-GPU/multi-node setups.
Familiarity with frameworks like DeepSpeed, FSDP, Megatron-LM, or equivalent.
Ability to diagnose and resolve performance bottlenecks in distributed training.
Experience fine-tuning LLMs (e.g. GPT, LLaMA, Mistral) on custom or domain-specific datasets.

Desirable skills:

Knowledge of chemistry and biology, particularly for domain-specific NLP applications in life sciences.
Familiarity with Reinforcement Learning concepts and frameworks.

Personal Attributes: