Magnet.me - The smart network where students and professionals find their internship or job.

Master Thesis: Building an Uncertainty-Robust Reinforcement Learning-based model for UAV self-separation under Uncertainty

Graduate Internship • Amsterdam, NL

Posted 13 Mar 2026

Work experience

0 to 2 years

Full-time / part-time

Full-time

Job function

Research

Degree level

HBO or WO master

Required language

English (Fluent)

This master's thesis assignment focuses on building a Reinforcement Learning (RL)-based model for UAV self-separation that is robust to uncertainty.

Job description

Background

The autonomous operation of unmanned aerial vehicles (UAVs) plays an increasingly important role in research and commercial applications. These vehicles can assist with crucial applications, such as emergency response, infrastructure monitoring, and parcel delivery, but are expected to lead to traffic densities too great for human air traffic controllers to handle. Work is ongoing to develop autonomous separation management systems, from planning and trajectory generation to conflict detection and resolution. For conflict detection and resolution (CD&R), Reinforcement Learning (RL) shows great promise, outperforming state-of-the-art geometric methods in safety and efficiency under certain conditions. These methods can be shown to be robust to position noise, and they perform especially well at high traffic densities. However, most work considers a homogeneous policy: that is, all vehicles employ the same self-separation strategy, which is also the basis for the strong performance shown by the RL models. In realistic operations, low-level airspace is heterogeneous and will include vehicles such as trauma response helicopters. These trauma helicopters exhibit different dynamics, as they travel through the airspace faster than a typical drone, and they are given priority over drone operations, meaning that they themselves may not perform any conflict resolution manoeuvres. As this is a largely unexplored topic, several research questions can be derived from it, namely:

How do learning-based autonomous CD&R methods perform in heterogeneous environments with unresponsive priority vehicles, such as trauma helicopters?
How can the training regimes of the models take priority vehicles into account while guaranteeing safety?

The thesis will be expected to answer these questions.

The internship is in collaboration with TU Delft.

Tasks

The assignment will include the following tasks:

Investigation of existing approaches for (RL-based) CD&R, including under uncertainty (Literature Study);
Design of representative heterogeneous scenarios for evaluation and training;
Model selection, tuning or development, based on simulation results (with algorithms such as SAC from stable-baselines3, or others);
Design of a benchmark for the analysis of system safety and robustness under heterogeneous and homogeneous scenarios.
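
As an illustration of the kind of heterogeneous scenario and safety check these tasks involve, the following is a minimal sketch only and not NLR's method. It assumes straight-line, constant-velocity flight in 2D. The `Vehicle` class, the function names, the 50 m separation minimum and the 60 s look-ahead are all hypothetical choices made for this example. It detects conflicts from the closest point of approach (CPA) and lists which vehicles would be expected to manoeuvre, leaving out priority vehicles such as a trauma helicopter.

```python
import math
from dataclasses import dataclass


@dataclass
class Vehicle:
    """Hypothetical 2D vehicle state, used for illustration only."""
    x: float    # position east [m]
    y: float    # position north [m]
    vx: float   # velocity east [m/s]
    vy: float   # velocity north [m/s]
    priority: bool = False  # True: vehicle does not take resolution manoeuvres


def closest_point_of_approach(a, b):
    """Return (time [s], distance [m]) at CPA, assuming constant velocities."""
    rx, ry = b.x - a.x, b.y - a.y        # relative position
    vx, vy = b.vx - a.vx, b.vy - a.vy    # relative velocity
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        t_cpa = 0.0                      # no relative motion: distance is constant
    else:
        # Time at which the relative distance is minimal, clamped to the future.
        t_cpa = max(0.0, -(rx * vx + ry * vy) / speed_sq)
    return t_cpa, math.hypot(rx + vx * t_cpa, ry + vy * t_cpa)


def detect_conflicts(vehicles, separation_m=50.0, lookahead_s=60.0):
    """List predicted losses of separation and who is expected to resolve them.

    The separation minimum and look-ahead time are assumed example values.
    """
    conflicts = []
    for i in range(len(vehicles)):
        for j in range(i + 1, len(vehicles)):
            t_cpa, d_cpa = closest_point_of_approach(vehicles[i], vehicles[j])
            if t_cpa <= lookahead_s and d_cpa < separation_m:
                # Priority vehicles are unresponsive, so only the others resolve.
                resolvers = [k for k in (i, j) if not vehicles[k].priority]
                conflicts.append((i, j, t_cpa, d_cpa, resolvers))
    return conflicts


# Example scenario: two drones and one faster, head-on priority helicopter.
scenario = [
    Vehicle(0.0, 0.0, 10.0, 0.0),
    Vehicle(1000.0, 0.0, -40.0, 0.0, priority=True),
    Vehicle(0.0, 5000.0, 10.0, 0.0),
]
for i, j, t_cpa, d_cpa, resolvers in detect_conflicts(scenario):
    print(f"vehicles {i} and {j}: CPA in {t_cpa:.1f} s at {d_cpa:.1f} m, "
          f"resolved by {resolvers}")
```

In a homogeneous scenario both vehicles of a conflicting pair would appear in the list of resolvers. With a priority vehicle the full burden falls on the drone, which is the asymmetry the benchmark is meant to expose.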

Results

The final outcome of this assignment will be:

Research into a priority-aware RL-based UAV conflict resolution model;
A technical thesis report describing the approach, results and conclusions of the work;
Optional: a conference paper.

Duration

6 months.

What do we expect from you

Master's student in aerospace engineering, mechanical engineering, control engineering or computer science;
Experience with programming (Python, MATLAB);
Experience with practical application of ML/RL (PyTorch, Keras, TensorFlow or other);
Preferably a good understanding of (aircraft) dynamics, simulation & control.

What we offer

Enthusiastic colleagues who are experts in their field;
A flexible working space;
An environment where you have the opportunity to develop your skills and learn new ones;
A challenging assignment in a high-tech, results-oriented work environment;
A thesis assignment allowance;
An informal corporate culture where your opinion counts!

About NLR

You will be working within the Air Traffic Management & Airport department. Your colleagues are focused on solving real-world problems within air traffic management, airspace design, U-Space and other exciting domains.

Contact

For more information about the assignment, contact Sasha Vlaskin (sasha.vlaskin@nlr.nl).

NLR

NLR’s multidisciplinary approach focuses on developing new and cost-effective technologies for aviation and space, from design support to production technology and MRO (Maintenance, Repair and Overhaul). With its unique expertise and state-of-the-art facilities, NLR is bridging the gap between research and application.

NLR covers the whole RDT&E (Research, Development, Test & Evaluation) range, including all the essential phases in research, from validation, verification and qualification to evaluation. By doing so, NLR contributes to the innovative and competitive strength of government and industry, in the Netherlands and abroad.

NLR employs a staff of approx. 600 at our offices in Amsterdam, Marknesse and Schiphol. The company realizes an annual turnover of approx. 76 million euro.

Aerospace & Defence

Amsterdam

600 employees
