Working Student (Werkstudent): AI Research & Data Evaluation
Job description
You'll be stress-testing the world's most advanced models to see where they break. Your work will directly shape how frontier LLMs handle complex, multilingual tasks, and it will be supervised by our in-house research staff.
- Evaluate & Benchmark: Run rigorous evaluations on frontier LLMs and autonomous agents across diverse tasks.
- Data Engineering: Create or modify benchmark data to test the reasoning and linguistic limits of modern AI.
- Experimental Research: Design and run experiments to identify "model-breaking points" and interpret the resulting data.

- Brand-aware AI that learns your voice, tone, and terminology to ensure every translation is accurate and consistent
- Agentic AI workflows that automate the entire translation process from content ingestion to quality review to publishing
- 100+ native integrations with systems like Adobe Experience Manager, Webflow, Salesforce, GitHub, and Google Drive to simplify content translation
- Human-in-the-loop reviews via our global network of professional linguists, for high-impact content that requires expert review
LILT in the News
- Featured in The Software Report's Top 100 Software Companies!
- LILT makes it onto the Inc. 5000 List.
- LILT continues to be an intellectual powerhouse, holding numerous patents that help power the most efficient and sophisticated AI and language models in the industry.
- Check out all our news on our website.
Information collected and processed as part of your application process, including any job applications you choose to submit, is subject to LILT's Privacy Policy at https://lilt.com/legal/privacy.
At LILT, we are committed to a fair, inclusive, and transparent hiring process. As part of our recruitment efforts, we may use artificial intelligence (AI) and automated tools to assist in the evaluation of applications, including résumé screening, assessment scoring, and interview analysis. These tools are designed to support human decision-making and help us identify qualified candidates efficiently and objectively. All final hiring decisions are made by people. If you have any concerns, require accommodations, or would like to opt out of the use of AI in our hiring process, please let us know at recruiting@lilt.com.
Requirements
- Currently enrolled at TU Berlin majoring in Computer Science (Bachelor/Master) or a related field
- Solid understanding of LLMs, natural language processing, or machine learning
- Highly proficient in Python, Bash, and git
- Appetite to quickly understand and incorporate new methodologies and models in a rapidly changing research landscape
- Strong drive to ship high-quality customer projects, sometimes on tight deadlines
- Proficient in English
- Preferred: Proficient in one or more non-English languages
Benefits & conditions
- Work directly with models and teams from frontier labs like Google or Anthropic
- Opportunity to publish papers in top-tier AI/ML conferences
- Contribute to industry-standard open-source benchmarks
- Competitive salary
- Hybrid environment with an on-site research team