Magnet.me - The smart network where students and professionals find their internship or job.

PhD Position AI Alignment: Value Assessment for Open Models & AI Systems

Placement • Delft, NL

Posted 30 Jun 2026

Work experience

0 to 2 years

Full-time / part-time

Full-time

Job function

Research

Salary

€3,059 - €3,881 per month

Education level

University master's degree (WO)

Language requirements

English (fluent)

Dutch (fluent)

Deadline

10 August 2026

At TU Delft, you can contribute to research that makes a real-world difference through a PhD position focused on AI alignment for open models and AI systems.

Challenge: Developing, operationalizing, quantifying, and embedding complex human and legal values into alignment pipelines for AI systems, open-weights, and foundation models.

Change: Advancing from static, generic benchmarks to dynamic, automated validation and red-teaming frameworks tailored for high-risk deployments.

Impact: Enhancing police trustworthiness through AI alignment at the Netherlands Police and ensuring compliance with the EU AI Act by engineering measurably aligned AI systems.

Job description

As a PhD student at TU Delft, you will conduct impactful research on advancing the responsible use of AI within the Netherlands Police force, with a focus on aligning AI systems with human intentions, ethical standards, and legal frameworks.

AI alignment refers to the goal of making AI systems behave in line with human intentions and values, so that advanced AI systems operate safely and strictly within the bounds of ethical standards and prevailing legal frameworks. With the rapid proliferation of AI systems, including frontier LLMs, multimodal models, autonomous agents, and open-weight and foundation models, there is an increasing risk of misalignment with human, organizational, and societal values through behavioral drift, hallucination, and adversarial exploitation. Validating models is crucial before implementation decisions are made, for the continuous monitoring of systems in use, and for facilitating effective human oversight of AI. This is particularly important in high-stakes environments like law enforcement.

The main challenge is that validation needs to happen simultaneously across a range of values important in a law enforcement context: accuracy, fairness, reliability, trustworthiness, and more. A central research question is how to translate abstract democratic, organizational, and societal values such as algorithmic fairness, transparency, and explainability (XAI) into rigorous, quantifiable engineering metrics without sacrificing the general utility of models and AI systems.

Your research will focus on two key aspects in advancing the responsible use of AI within the Netherlands Police force. First, you will investigate the standards and values surrounding AI usage, particularly in the context of publicly available models, and define what criteria these models must meet beyond common considerations like bias and fairness. Second, you will design methods to systematically evaluate various models against these established standards and values. This contributes to the responsible deployment of AI within policing in the Netherlands, pushes forward our understanding of how to align AI models in practice, and maximizes the efficiency of utilizing publicly available models.

Formalizing Value Taxonomies and Alignment Metrics

You will investigate the ethical, legal, and operational guardrails required for deploying open-weights foundation models in sensitive public-facing domains. Moving beyond superficial bias benchmarks, you will conduct deep-dive case studies within the Netherlands Police to map operational requirements to formal alignment criteria. You will define what constructs such as trustworthiness and fairness mean mathematically and procedurally when applied to complex law enforcement workflows.

Engineering (Automated) and Human-in-the-Loop Evaluation and Red-Teaming Pipelines

You will design and implement scalable methodologies to systematically stress-test, audit, and benchmark AI models against your established criteria. This includes exploring red-teaming methods, synthetic data generation for vulnerability probing, and investigating how downstream alignment techniques such as DPO, RLHF, or constitutional AI can be customized to enforce strict adherence to organizational values.

Your project is part of the Model-Driven Decisions Lab, a Netherlands Police - TU Delft initiative, where you will join an interdisciplinary community of four fellow PhD students who have already been hired. Together, you will share knowledge to tackle AI-assisted decision-making from different perspectives. To foster close collaboration with stakeholders and support practical implementation, you will spend 20% of your time at the Netherlands Police’s strategy and innovation division.

Given the ethical and moral facets of your research, you will also work closely with colleagues of the Delft Digital Ethics Centre at the Faculty of Technology, Policy, and Management (TPM). Your home base will be the Web Information Systems research group at the Computer Science faculty (EEMCS). As an internationally diverse team of driven academics and students, we cultivate a welcoming and collaborative environment and will give you all the support and training you need to evolve both personally and professionally.

Job requirements

You hold an MSc in computer science, data science, or another relevant subject such as ethics of AI, with practical machine learning/artificial intelligence courses and relevant project and thesis experience.
You have a keen interest in AI alignment, human-AI interaction, and explainable AI, and enjoy collaborating with experts in different disciplines.
You thrive on conducting research geared to real-world applications in the security domain and are intrinsically motivated to collaborate with the Netherlands Police.
You harness strong communication skills to work with scientific and non-scientific stakeholders across different work cultures.
You have a good command of written and spoken English, as you will be working in an international environment.
You also have a good command of the Dutch language. This is a strong requirement due to the context of the project, which requires interaction with stakeholders and data in Dutch.

TU Delft (Delft University of Technology)

Working at TU Delft means contributing to solutions that really make a difference.

At TU Delft, our people make the difference. With their knowledge and curiosity, our staff provide high-quality education and conduct pioneering research that extends beyond the campus. You will have the opportunity to take the initiative, work with others, and grow as a professional.

Working at TU Delft means joining an international community of professionals and students. Together, we create knowledge, innovations, and solutions that help move the world forward.

Conditions of employment

Pending the screening result, a temporary employment contract as a researcher can be offered for up to 4 months, if requested by the candidate. This contract will be converted to a PhD contract upon a positive screening result.

These are 5-year PhD positions, with the extra fifth year allowing for additional activities related to learning about the police organization and securing the results within the police organization. Doctoral candidates will be offered 5 years of employment in principle in the form of two employment contracts:

An initial 1.5-year contract with an official go/no-go progress assessment within 15 months.
An additional contract for the remaining 3.5 years, assuming performance requirements are met.

As a PhD candidate, you will be enrolled in the TU Delft Graduate School. The TU Delft Graduate School provides an inspiring research environment with an excellent team of supervisors, academic staff, and a mentor. The Doctoral Education Programme is aimed at developing your transferable, discipline-related, and research skills. TU Delft offers a customizable compensation package, discounts on health insurance, and a monthly work costs contribution; flexible work schedules can be arranged.

As you will be working in the security domain, you must undergo a security screening executed by the Dutch government before starting this position. This screening takes on average 2 to 3 months and could take up to 6 months. A positive outcome is a prerequisite for the contract for these PhD positions to come into effect. At least a BO screening is needed for these PhD positions.

Delft University of Technology

A fascination with science, design, and engineering is what drives the more than 13,000 bachelor's and master's students and 5,000 employees of TU Delft. Delft University of Technology is not only the oldest but also the largest technical university in the Netherlands: a university that is continuously looking for you as (inter)national talent to keep the research and education of this unique institution at the highest level. With approximately 5,000 employees, Delft University of Technology is the largest employer in Delft. The eight faculties, the unique laboratories, research institutes, research schools, and the supporting university services offer a wide variety of positions and workplaces. The diversity at TU Delft offers opportunities for everyone. From professor to PhD candidate. From policy officer to IT specialist.

Engineering

Delft

5,000 employees
