Senior Machine Learning Engineer

Intellias
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

Tech stack

HTML
Artificial Intelligence
Cloud Computing
Python
Machine Learning
Open Source Technology
Markdown
PyTorch
Large Language Models
HuggingFace
Build Tools

Job description

We are seeking an experienced engineer to help us build high-precision solutions for PDF-to-Markdown and PDF-to-HTML conversion, particularly for complex documents with diverse layouts.

  • Research, evaluate, and fine-tune open-source OCR and document intelligence models for text and layout extraction from complex PDFs.
  • Develop end-to-end solutions for PDF-to-Markdown / PDF-to-HTML conversion, preserving text structure, formatting, and layout accuracy.
  • Build tools for data preprocessing, annotation, and quality evaluation of OCR outputs.
  • Implement post-processing techniques, text alignment, and metadata extraction to improve model precision.
  • Collaborate closely with research and engineering teams to integrate OCR pipelines into production-ready systems.
  • Stay current with advancements in document AI, multimodal learning, and OCR research.

Requirements

  • 5+ years of experience in Machine Learning, with at least 2 years focused on OCR, Document AI, or vision-language models.
  • Strong expertise in Python, PyTorch, and Hugging Face Transformers (training, fine-tuning, inference).
  • Solid understanding of computer vision and its implementation.
  • Hands-on experience deploying LLM / VLM models on vLLM or similar high-performance inference frameworks.
  • Deep understanding of OCR pipelines, layout parsing, and document structure recognition (PDFs, scanned docs, tables, mixed content).
  • Familiarity with cloud infrastructure and GPU-based inference pipelines.
  • Research-oriented mindset with the ability to experiment, analyze results, and iterate quickly.
  • Excellent communication and documentation skills.

About the company

Let's breathe life into great tech ideas! With 3,000 people globally, Intellias is a company where benchmark technological solutions are born. Join in and take your part in digitalizing the world. We are exploring cutting-edge OCR and metadata extraction from PDF documents. OCR and document intelligence are rapidly evolving fields, with open-source models like DeepSeek OCR and LightOn OCR pushing the boundaries.

At Intellias, where technology takes center stage, people always come before processes. We're dedicated to cultivating a tech-savvy environment that empowers individuals to unlock their true potential and achieve extraordinary results. Our customized benefits not only prioritize your well-being but also charge your professional growth, making this opportunity an ideal match for tech enthusiasts like you.