Machine Learning Engineer (LLM Architect - Target ID) (UAE)

Insilico Medicine
9 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

Remote

Tech stack

Artificial Intelligence
Artificial Neural Networks
Encodings
Computational Biology
Information Theory
Python
Machine Learning
Reinforcement Learning
PyTorch
Large Language Models
Deep Learning
Generative AI
Information Technology
Optimization Algorithms

Job description

We are currently seeking an exceptional and mathematically grounded Machine Learning Engineer to join our Target ID team. Your primary focus will be the architectural design and training of next-generation Large Language Models (LLMs) specifically tailored for biological discovery. Rather than relying on standard implementations, you will be expected to propose and engineer novel architectural components and optimization strategies based on deep mathematical principles. You will bridge the gap between theoretical deep learning and practical biotechnology, creating models capable of reasoning over complex biological data to identify novel therapeutic targets. A strong background in mathematics and the ability to write custom model implementations from scratch are essential for this role.

  • Design, propose, and implement novel neural network architectures beyond standard Transformers (e.g., modifying attention mechanisms or positional encodings).
  • Derive and implement custom loss functions and optimization algorithms based on mathematical first principles to improve model convergence on biological data.
  • Lead the end-to-end training lifecycle of domain-specific LLMs, including Pre-training, Supervised Fine-Tuning (SFT), and Reinforcement Learning from Human Feedback (RLHF).
  • Collaborate with the Target ID team to translate complex biological questions into precise mathematical formulations solvable by AI.
  • Develop strategies to ground LLM generation in factual data, mitigating hallucinations in high-stakes drug discovery contexts.
  • Optimize training pipelines for high-performance computing clusters using distributed training techniques.
  • Stay up to date with the latest advancements in Generative AI, Information Theory, and Computational Biology.
  • Work collaboratively with biologists and bioinformatics specialists to validate model hypotheses.

Requirements

  • Do you have experience in Python?
  • Do you have a Master's degree?
  • You must be based in Abu Dhabi for this role. If you're currently in Dubai, you must be willing to relocate, as commuting between the two cities won't be feasible for this position.
  • Master's or Ph.D. degree in Mathematics, Computer Science, Physics, Machine Learning, or a related quantitative field.
  • 3-4 years of experience in Machine Learning.
  • Deep theoretical understanding of Linear Algebra, Calculus, Probability Theory, and Information Theory.
  • Expert proficiency in Python and deep learning frameworks (PyTorch, Transformers), with the ability to implement custom layers and training loops from scratch.
  • Proven experience in training and architecting Transformer-based models and LLMs.
  • Ability to read and implement methods from the latest AI research papers.
  • Familiarity with biological data types (sequences, structures, pathways) is a significant advantage, but not mandatory if the mathematical foundation is strong.
  • Strong problem-solving skills and a passion for algorithmic innovation.
  • Excellent written and oral communication skills for explaining complex mathematical concepts to cross-functional teams.
