Magnet.me - The smart network where students and professionals find their internship or job.


Assignment: Research on the Use of Large Language Models for Pathology Reports

Internship • Zeist, NL

Posted 3 Dec 2024

Work experience

0 to 1 years

Full-time / part-time

Full-time

Job function

Research

Salary

€600 per month

Degree level

WO Master

Required languages

English (Good)

Dutch (Fluent)


The rise of large language models (LLMs) has opened new possibilities for processing and analysing medical texts, including pathology reports. These reports often contain complex medical terminology and detailed information that is crucial for making diagnoses and developing treatment plans. This research will investigate the effectiveness of various large language models in processing pathology reports in Dutch, and compare these models with traditional techniques such as a keyword matching approach, to evaluate which method is best suited for extracting useful data from these reports.

Objective of the Assignment:

The aim of this assignment is to determine which large language model performs best at extracting relevant information from Dutch pathology reports, and how these models compare to a keyword matching approach. You will also analyse the capabilities and limitations of both approaches, with a focus on their ability to handle Dutch-language reports. Additionally, it may be necessary to train or fine-tune one or more models to improve accuracy on Dutch pathology data.

Assignment Description:

Literature Review
Conduct a thorough literature review on the use of large language models in the medical field, with a specific focus on pathology reports in Dutch. Describe the advantages and disadvantages of various LLMs that could be used in this context, and discuss their capability to handle medical terminology in the Dutch language.
Comparison of LLMs
Select at least three different large language models (such as GPT, BERT, BioBERT) and evaluate them based on their performance in extracting information from Dutch pathology reports. Consider the following factors:

Accuracy of extraction
Ability to correctly interpret Dutch medical terminology
Processing speed
Amount of training data required (especially for Dutch-language reports)
Whether the model needs to be trained or fine-tuned on Dutch pathology reports to improve performance
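
The first and third factors above (accuracy and processing speed) can be measured per report. The following is a minimal sketch, assuming each model's output is reduced to a dictionary of field names and values and compared against a manually labelled reference. The field names (`diagnose`, `gradering`, `lokalisatie`) and the example values are hypothetical and are not taken from the assignment.

```python
import time


def score_extraction(predicted, gold):
    """Return (precision, recall, F1) for one report.

    A predicted field counts as correct only if the gold annotation
    contains the same field with exactly the same value.
    """
    correct = sum(1 for field, value in predicted.items() if gold.get(field) == value)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def timed(extract, report):
    """Run an extraction function and return (result, seconds elapsed)."""
    start = time.perf_counter()
    result = extract(report)
    return result, time.perf_counter() - start


# Hypothetical gold annotation and model output for a single report.
gold = {"diagnose": "adenocarcinoom", "gradering": "graad 2", "lokalisatie": "colon"}
predicted = {"diagnose": "adenocarcinoom", "gradering": "graad 3"}

precision, recall, f1 = score_extraction(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact-match scoring is strict. A real evaluation would probably need some normalisation of values (spelling variants, abbreviations) before comparing them, and would average the scores over many reports.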

Training the Model (Optional)
If deemed necessary, train or fine-tune one or more of the selected large language models specifically for Dutch pathology reports. This may involve the following steps:

Preprocessing of pathology report data
Fine-tuning the model with labelled examples (if available)
Evaluating the model’s performance post-training
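
The preprocessing and evaluation steps above could begin along the following lines. This is a sketch under stated assumptions: reports are plain-text strings, and labelled examples are simple (text, labels) pairs. The function names `preprocess` and `train_eval_split` are hypothetical, and the fine-tuning itself is omitted because it depends on the chosen model and library.

```python
import random
import re


def preprocess(report):
    """Normalise whitespace line by line and drop empty lines."""
    lines = [re.sub(r"\s+", " ", line).strip() for line in report.splitlines()]
    return "\n".join(line for line in lines if line)


def train_eval_split(examples, eval_fraction=0.2, seed=0):
    """Shuffle labelled examples reproducibly and hold out a fraction for evaluation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


# Hypothetical raw report text with inconsistent spacing and line endings.
raw = "Conclusie:\r\n\r\n  adenocarcinoom   van het colon. "
print(preprocess(raw))

# Hypothetical labelled examples: (report text, extracted fields).
examples = [(f"rapport {i}", {"diagnose": "onbekend"}) for i in range(10)]
train, held_out = train_eval_split(examples)
print(len(train), len(held_out))
```

Keeping the held-out set fixed (via the seed) means performance after training is measured on reports the model never saw during fine-tuning.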

Comparison with Keyword Matching

Compare the performance of the selected LLMs with a keyword matching approach. Perform tests where both methods are used to analyse the same Dutch pathology reports and evaluate:

The quality of extracted data
The number of errors or missing data
Ease of implementation
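
A keyword matching baseline of the kind referred to here might look like the sketch below. The keyword lists, field names and example report are hypothetical; an actual baseline would use terms agreed with domain experts. The `count_errors` helper illustrates one possible way to count wrong and missing fields against a labelled reference.

```python
import re

# Hypothetical keyword lists per field, for illustration only.
KEYWORDS = {
    "diagnose": ["adenocarcinoom", "plaveiselcelcarcinoom", "melanoom"],
    "lokalisatie": ["colon", "long", "huid"],
}


def keyword_extract(report):
    """Return the first matching keyword per field, using whole-word matching."""
    text = report.lower()
    found = {}
    for field, terms in KEYWORDS.items():
        for term in terms:
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                found[field] = term
                break
    return found


def count_errors(predicted, gold):
    """Count gold fields that are missing and predicted fields with a wrong value."""
    missing = sum(1 for field in gold if field not in predicted)
    wrong = sum(
        1 for field, value in predicted.items() if field in gold and gold[field] != value
    )
    return {"missing": missing, "wrong": wrong}


# Hypothetical report containing a negated finding.
report = "Conclusie: geen aanwijzingen voor adenocarcinoom in het colon."
gold = {"diagnose": "geen maligniteit", "lokalisatie": "colon"}

baseline = keyword_extract(report)
print(baseline)
print(count_errors(baseline, gold))
```

The example is chosen to show a typical weakness of keyword matching: the term "adenocarcinoom" is found even though the sentence negates it ("geen aanwijzingen voor"), so the baseline records a wrong diagnosis. Running the same reports through both methods and comparing such counts is one way to make the quality criteria above concrete.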

Performation

At Performation, we have a solution for every department within a healthcare organisation, built on a single data source as a stable foundation. We work continuously on improvements to make management even more efficient. Innovation, optimisation and efficiency are what drive us. We optimise healthcare operations so that care providers can focus on patient care.

Zeist

Active in 3 countries

160 employees

40% men - 60% women

Average age is 35 years
