Machine Learning Research Engineer in Natural Language Processing and Media Mining

Swiss Federal Institute of Technology Lausanne (EPFL)
Lausanne, Switzerland
4 days ago

Role details

Contract type
Temporary contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English, French, German
Experience level
Junior

Job location

Lausanne, Switzerland

Tech stack

3D Models
API
Artificial Intelligence
Amazon Web Services (AWS)
Computer Vision
Unix
Cloud Storage
Databases
Programming Tools
GitHub
Information Extraction
Python
Machine Learning
Natural Language Processing
NoSQL
SQL Databases
Text Mining
Web Applications
Deep Learning
Kubernetes
Information Technology
Hugging Face

Job description

Join the Swiss Federal Institute of Technology Lausanne (EPFL) as a Machine Learning Research Engineer. Work on exciting projects in a dynamic environment.

  • Apply and adapt existing NLP and computer vision models to large-scale, multilingual historical text and image data.
  • Fine-tune or design models for additional text mining tasks, in particular media section classification.
  • Support the creation of ground truth data by adapting the setup of web-based annotation tools, and assist in the management of annotation campaigns and data releases.
  • Contribute to the maintenance and adaptation of web-serving setups for annotation models (TorchServe).
  • Support the consolidation, validation, and documentation of existing data, pipeline components, and code modules.

Additional activities (optional / depending on profile)

  • Collaborate on the design of Impresso WebApp, Datalab and API
  • Participate in the development and adoption of standards for the representation and exchange of historical data (raw material and annotations)
  • Contribute to scientific publications and project workshops on media mining, semantic indexing, and sustainability

Contact: for any questions, please contact Marina Butyrskaya Moyer (marina.butyrskayamoyer[at]epfl.ch) and Maud Ehrmann (maud.ehrmann[at]epfl.ch).

Requirements

  • MSc or PhD in NLP, Computer Science, or related field required.
  • Proficiency in Python and machine learning techniques essential.
  • Experience with collaborative development tools like GitHub preferred.
  • Experience: 1-3 years as a machine learning engineer or NLP researcher/programmer
  • Education: MSc or PhD in NLP, Computer Science, Data Science, or a related field, or equivalent professional experience in machine learning/NLP
  • Technical skills:
    • Solid expertise in machine learning, with practical experience in deep learning architectures (transformers, language models) and information extraction tasks
    • Proficiency in Python, Unix-based systems, databases (SQL/NoSQL), cloud storage and computing (S3, Kubernetes, Run:AI), and scripting/automation
    • Familiarity with collaborative development and code/model management platforms (GitHub, Hugging Face, and related tools)
  • Mindset: Curious, creative, rigorous, and attentive to detail; motivated by scientific research and cultural heritage applications, with a proactive and problem-solving attitude
  • Strong sense of teamwork, communication, accountability, and production readiness
  • Very good command of written and spoken English

Desirable skills

  • Prior experience in an academic or research context
  • Experience with historical or digitized documents and interdisciplinary collaboration
  • Experience with image processing alongside text and language data is a plus
  • Interest in student supervision and academic publication
  • Knowledge of French or German

About the company

Impresso is an interdisciplinary research project that brings together computational linguists, computer scientists, digital humanists, historians, and designers from EPFL, the University of Zurich, the University of Lausanne, and the C²DH (Luxembourg), along with over 20 European partners. Funded by the Swiss National Science Foundation and the Luxembourg National Research Fund (2023-2027), the project pioneers new methods for exploring digitized newspaper and radio archives across languages, media, and borders through semantic enrichments and shared multilingual vector spaces.
