AI for Education Platform Specialist & Benchmark Engineer
Role details
Job description
We are seeking an AI for Education Platform Specialist & Benchmark Engineer who will serve as both a power-user and benchmark guardian of our AI infrastructure. This is a hands-on technical role focused on making our platform accessible to new users while ensuring our AI systems meet the highest standards across technical, ethical, and ecological dimensions. A key aspect of this role involves close collaboration with EPFL's AI Center to bridge research and practice.
Our infrastructure includes:
- Graph AI Platform: Knowledge graph, semantic search, and RAG construction pipelines
- RCP Computational Infrastructure: GPU clusters, local LLMs, specialized AI services
- Intelligent Agents Framework: Orchestration layer for chatbots and AI applications
- Multiple Data Sources: Educational content, institutional data, research publications
Mission
You will bridge the gap between our sophisticated AI infrastructure and the diverse teams that can benefit from it. As a power-user, you'll master our tools, document them thoroughly, and train others. As an evaluator, you'll establish rigorous testing frameworks to ensure our AI systems are responsible, efficient, cost-effective, and sustainable. You will also serve as a key liaison with EPFL's AI Center, facilitating the integration of cutting-edge research findings into our educational AI platform and enabling research projects that leverage our infrastructure and data.
Main duties and responsibilities
- Comprehensive AI Benchmarking (60%)
Develop and maintain a rigorous benchmark evaluation framework:
Ethical Dimension
- Develop and run systematic bias detection tests (gender, language, and cultural biases)
- Create transparency documentation: model capabilities, limitations, training data, appropriate use cases
- Provide sustainability recommendations for model selection and green AI practices
Technical Dimension
- Build and apply a performance benchmarking harness (latency, throughput, accuracy)
- Create a comparative evaluation matrix: local LLMs (RCP) vs. Apertus vs. commercial services
- Develop domain-specific and educational test sets to evaluate LLMs
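The latency/throughput side of such a harness can be sketched in a few lines. This is a minimal illustration under stated assumptions: `generate` is a hypothetical synchronous model call (here stubbed with a short sleep), and timings use `time.perf_counter`; a production harness would target the actual RCP endpoints and add warm-up runs, percentiles, and accuracy scoring.

```python
# Minimal latency/throughput harness sketch. `generate` is a placeholder for
# a real model call (e.g. an HTTP request to an inference endpoint).
import time
from statistics import median

def generate(prompt: str) -> str:
    time.sleep(0.01)  # simulate model inference time
    return prompt.upper()

def benchmark(prompts: list[str], runs: int = 3) -> dict:
    """Measure per-request latency and overall throughput."""
    latencies = []
    start = time.perf_counter()
    for _ in range(runs):
        for p in prompts:
            t0 = time.perf_counter()
            generate(p)
            latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "median_latency_s": median(latencies),
        "throughput_req_per_s": len(latencies) / elapsed,
    }

stats = benchmark(["hello", "summarise this passage", "explain GPUs"])
print(stats)
```

Running the same harness against each backend (local RCP models, Apertus, commercial APIs) fills one axis of the comparative evaluation matrix described above.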
AI Center Collaboration
- Facilitate the integration of research findings from EPFL's AI labs into our educational platform
- Collaborate with the AI Center to benchmark and integrate Apertus, their open-weight model, into our platform stack
- Support research projects that can benefit from our infrastructure or educational data collection capabilities
- Infrastructure Enablement & Platform Adoption (40%)
Make our AI platform accessible to others and scale up adoption:
Documentation & Knowledge Sharing
- Write comprehensive technical documentation (user guides, API docs, tutorials)
- Create templates and boilerplates for common scenarios
- Develop configuration wizards to simplify complex setups
Platform Advocacy
- Build relationships with potential users across EPFL and partner institutions
- Identify and test new application domains beyond education
- Foster connections between the educational AI platform and research teams at the AI Center
Requirements
- Master's degree in Computer Science, Data Science, AI/ML, or related field
- 2-3 years of professional experience working with AI/ML systems
- Proven experience with Large Language Models, RAG systems, or similar AI technologies
- Background in software testing or evaluation methodologies
Technical Skills
- Proficient in Python programming
- Experience with ML frameworks and libraries (LangChain, transformers, vector databases)
- Understanding of cloud/on-premise infrastructure (GPU clusters, containerization)
- Knowledge of evaluation metrics and benchmarking methodologies
Desired Qualifications
- Background in AI ethics, responsible AI, or fairness in ML
- Background in learning sciences, educational psychology, or pedagogy
- Experience with open-source communities and documentation