Michael Niebisch

Leveraging Large Language Models for Legacy Code Translation: Challenges and Solutions

What happens when an LLM translates MATLAB to Python? We built an agent that automatically runs unit tests and tells the AI how to fix its own mistakes.

#1 about 5 minutes

Motivations for translating legacy MATLAB code to Python

The project aimed to explore LLMs for modernizing a large, legacy MATLAB codebase due to the scarcity of MATLAB developers and the rise of Python.

#2 about 4 minutes

Using a semi-automatic workflow with ChatGPT for translation

The initial approach involved a manual copy-paste workflow using the ChatGPT web interface, which saved time on boilerplate but struggled with large code chunks and introduced errors.

#3 about 4 minutes

Overcoming language-specific challenges in code translation

Key translation challenges arose from fundamental differences between MATLAB and Python, such as array indexing and memory layout, requiring a divide-and-conquer approach and robust unit tests.
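
As a small illustration of the indexing and memory-layout mismatch: MATLAB indices start at 1 and matrices are stored column by column, while Python indices start at 0 and nested lists (and NumPy arrays by default) are laid out row by row. The sketch below maps a MATLAB linear index to a Python (row, col) pair. The helper name `matlab_linear_to_python` is hypothetical and not taken from the talk.

```python
def matlab_linear_to_python(k, n_rows, n_cols):
    """Map a MATLAB linear index k (1-based, column-major) to a
    0-based (row, col) pair usable on a Python nested list."""
    if not 1 <= k <= n_rows * n_cols:
        raise IndexError(f"linear index {k} out of range")
    # Column-major order: the row index varies fastest.
    col, row = divmod(k - 1, n_rows)
    return row, col


# The 2x3 matrix that MATLAB would write as [1 2 3; 4 5 6].
A = [[1, 2, 3],
     [4, 5, 6]]

# In MATLAB, A(2) walks down the first column and yields 4.
row, col = matlab_linear_to_python(2, n_rows=2, n_cols=3)
value = A[row][col]
```

A translation that simply copies the index, or that assumes row-major order, reads a different element without raising any error, which is one reason unit tests on small, separately translated pieces are useful.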

#4 about 5 minutes

Developing an automated pipeline for translation and auto-fixing

To improve efficiency, an automated pipeline was built to first annotate code with type and shape information before translation and then use an agent-based tool to automatically fix bugs based on test failures.
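
The auto-fixing step can be pictured as a loop that runs the tests, hands any failure report back to the model, and retries with the model's answer. The following is only a minimal sketch under stated assumptions, not the pipeline from the talk: `auto_fix`, `run_tests`, `ask_llm`, and `fake_llm` are hypothetical names, and the model is replaced by a stub so the example runs without any API.

```python
from typing import Callable, Optional, Tuple


def auto_fix(
    code: str,
    run_tests: Callable[[str], Tuple[bool, str]],
    ask_llm: Callable[[str], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Return code that passes the tests, or None if the budget runs out."""
    for _ in range(max_rounds):
        passed, report = run_tests(code)
        if passed:
            return code
        # Feed the failing code and the test output back to the model.
        prompt = (
            "The translated Python code fails its unit tests.\n"
            f"--- code ---\n{code}\n"
            f"--- test output ---\n{report}\n"
            "Return a corrected version of the code only."
        )
        code = ask_llm(prompt)
    passed, _ = run_tests(code)
    return code if passed else None


def run_tests(code: str) -> Tuple[bool, str]:
    """Hypothetical test runner: execute the code and check one case."""
    namespace = {}
    try:
        exec(code, namespace)
        assert namespace["first"]([10, 20, 30]) == 10
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption for this sketch)."""
    return "def first(xs):\n    return xs[0]\n"


# A typical translation slip: MATLAB's xs(1) copied over as xs[1].
buggy = "def first(xs):\n    return xs[1]\n"
fixed = auto_fix(buggy, run_tests, fake_llm)
```

Capping the number of rounds keeps the loop from running indefinitely when the model cannot repair the code, in which case the function returns `None` and the failure can be handed to a developer.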

#5 about 4 minutes

Evaluating LLM performance and providing debugging support

A framework was developed to evaluate translation quality by testing against known failure cases, and a debugging tool uses LLMs to compare execution logs from both languages to pinpoint errors.
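
To make the idea of comparing execution logs concrete, here is a minimal sketch that finds the first point where two logs disagree. It deliberately swaps the LLM comparison described in the talk for a plain numeric check, which could serve to narrow down what is shown to a model. The `name=value` log format and the function `first_divergence` are assumptions made for illustration.

```python
import math


def first_divergence(matlab_log, python_log, rel_tol=1e-9):
    """Return (step, name, matlab_value, python_value) for the first
    mismatching entry, or None if the compared entries agree.

    Each log is a list of 'name=value' strings with numeric values.
    Only the overlapping prefix of the two logs is compared.
    """
    for step, (m_line, p_line) in enumerate(zip(matlab_log, python_log)):
        m_name, m_val = m_line.split("=", 1)
        p_name, p_val = p_line.split("=", 1)
        m_num, p_num = float(m_val), float(p_val)
        same_name = m_name.strip() == p_name.strip()
        if not same_name or not math.isclose(m_num, p_num, rel_tol=rel_tol):
            return step, m_name.strip(), m_num, p_num
    return None


# Hypothetical logs from running the same routine in both languages.
matlab_log = ["n=3", "total=6.0", "mean=2.0"]
python_log = ["n=3", "total=5.0", "mean=1.6666666667"]

result = first_divergence(matlab_log, python_log)
```

Here the logs first disagree at `total`, so the later mismatch in `mean` is a consequence rather than the root cause, and the search for the bug can start at the code that computes `total`.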

#6 about 3 minutes

Considering local LLMs for security and summarizing key learnings

Due to IP and security concerns with cloud APIs, local models like Llama 2 were explored, and the project concluded that while LLMs are promising tools, fully automated, error-free translation remains a significant challenge.
