Machine Learning Ops Engineer
Job description
ML & LLM Engineering, Search and Recommendation Engines
- Automate and orchestrate machine learning workflows across major cloud and AI platforms (AWS, Azure, Databricks, and foundation model APIs such as OpenAI).
- Maintain and version model registries and artifact stores to ensure reproducibility and governance.
- Develop and manage CI/CD for ML, including automated data validation, model testing, and deployment.
- Implement ML Engineering solutions using popular MLOps platforms such as AWS SageMaker, MLflow, and Azure ML.
- Scale end-to-end custom SageMaker pipelines.
- Design and implement the engineering components of GAR+RAG systems (e.g., query interpretation and reflection, chunking, embeddings, hybrid retrieval, semantic search), and manage prompt libraries, guardrails, and structured output for LLMs hosted on Bedrock/SageMaker or self-hosted.
- Design and implement ML pipelines that utilize Elasticsearch/OpenSearch/Solr, vector DBs, and graph DBs.
- Build evaluation pipelines: offline IR metrics (NDCG, MAP, MRR), LLM quality metrics (faithfulness, grounding), and A/B testing.
- Optimize infrastructure costs through monitoring, scaling strategies, and efficient resource utilization.
- Stay current with the latest research in GenAI, NLP, and RAG, and apply the state of the art in our experiments and systems.
Collaboration
- Partner with Subject-Matter Experts, Product Managers, Data Scientists, and Responsible AI experts to translate business problems into cutting-edge data science solutions.
- Collaborate and interface with Operations Engineers who deploy and run production infrastructure.
Are you a collaborative Machine Learning Ops Engineer looking to work for a mission-driven global organization?
Are you looking to drive cutting-edge products that have a true societal impact?
About the team
This team powers Elsevier's Health platforms: ClinicalKey AI, Sherpath AI, and AI-driven automated clinical and content workflows. You will bridge Data Science and Engineering to turn experimental NLP/IR/GenAI models into secure, reliable, and scalable services. Our systems operate over one of the world's largest medical and scholarly landscapes.
About the role
As a Senior Machine Learning Engineer, you'll work on AI-based features (GenAI, Agentic AI, RAG, etc.), search/ranking quality, and knowledge-graph-aware retrieval while enforcing content rights and editorial confidentiality.
Requirements
- Current experience in ML Engineering, MLOps platforms, and shipping ML or search/GenAI systems to production.
- Strong Python, Java, and/or Scala experience will be considered a plus.
- Hands-on experience with major cloud vendor solutions (AWS, Azure, and/or Google).
- Experience with search/vector/graph technologies (e.g., Elasticsearch / OpenSearch / Solr / Neo4j).
- Experience evaluating LLMs.
- A strong understanding of the data science life cycle, including feature engineering, model training, and evaluation metrics.
- Background in health technology and/or medical content workflows is preferred.
- Familiarity with ML frameworks, e.g., PyTorch, TensorFlow, PySpark.
- Experience with large-scale data processing systems, e.g., Spark.
- Experience with statistical analysis, machine learning theory and natural language processing.