# Transformers for search: Retrieval, robustness, and refusal
| Authors | |
|---|---|
| Supervisors | |
| Co-supervisors | |
| Award date | 13-02-2026 |
| ISBN | |
| Number of pages | 102 |
| Organisations | |
| Document type | PhD thesis |
| Language | English |

## Abstract
Access to information has never been easier, quicker, or more fragile. As language models increasingly dominate search and question answering, the boundary between retrieving information and generating it has blurred. Systems designed to find relevant documents now routinely produce complete answers and explanations, stitched together from model memory and retrieved evidence. While retrieval-augmented models make it easy to answer complex questions, this convenience often hides an underlying fragility: these systems work best when test conditions resemble their training data, and they often fail when they do not.
Modern information retrieval systems typically follow a pipeline architecture in which a retriever selects candidate documents and a generator produces an answer conditioned on them, so retrieval and generation are tightly coupled. Two requirements are critical for reliable performance: generalizability, where retrievers remain effective across new datasets, domains, and languages; and grounded answers, where generators base their outputs on retrieved evidence and abstain from answering when that evidence is missing.

This thesis studies these requirements together. It examines how training data augmentation and negative sampling shape dense retrievers under distribution shift, proposes methods that improve robustness across domains and languages, and investigates how small, open-source language models can be trained to reason over retrieved evidence and refuse to answer when evidence is insufficient. Finally, it emphasizes accessibility through an open-source library, Simple Transformers, which lowers the barrier to building and reproducing transformer-based retrieval and question answering systems.
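As a rough illustration of the retrieve-then-generate pipeline with refusal described above, here is a minimal, self-contained Python sketch. Everything in it is hypothetical: the bag-of-words scorer stands in for a dense encoder, the fixed score threshold stands in for the evidence checks the thesis studies, and the corpus is toy data.

```python
import math
import re
from collections import Counter

# Toy corpus standing in for a retrieval index; all data here is illustrative.
CORPUS = [
    "Dense retrievers encode queries and documents into a shared vector space.",
    "Negative sampling picks hard non-relevant documents during training.",
    "Retrieval-augmented generation conditions an answer on retrieved evidence.",
]

def embed(text):
    """Crude bag-of-words vector; a real system would use a dense encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, k=2):
    """Score every document against the query and return the top k."""
    q = embed(query)
    return sorted(((cosine(q, embed(doc)), doc) for doc in CORPUS), reverse=True)[:k]

def answer(query, evidence_threshold=0.25):
    """Refuse unless the best retrieval score clears an evidence threshold."""
    best_score, best_doc = retrieve(query)[0]
    if best_score < evidence_threshold:
        return "Insufficient evidence; refusing to answer."
    # A generator would condition on the retrieved evidence here; we just cite it.
    return f"Answer grounded in: {best_doc!r}"

print(answer("How does negative sampling pick documents?"))  # grounded answer
print(answer("Who won the 1998 World Cup?"))                 # refusal
```

The Simple Transformers library named in the abstract (assuming it refers to the simpletransformers Python package) exposes task-level model classes. A hedged usage sketch follows, with made-up data rather than anything from the thesis: fine-tuning a SQuAD-2.0-style question answering model, where `is_impossible` marks unanswerable questions and thus expresses abstention at the data level.

```python
from simpletransformers.question_answering import QuestionAnsweringModel

context = "Dense retrievers encode queries and documents into a shared vector space."

# SQuAD-2.0-style training data; is_impossible=True marks unanswerable questions.
# All examples here are illustrative.
train_data = [
    {
        "context": context,
        "qas": [
            {
                "id": "0",
                "question": "What do dense retrievers encode?",
                "answers": [{"text": "queries and documents", "answer_start": 24}],
                "is_impossible": False,
            },
            {
                "id": "1",
                "question": "Who won the 1998 World Cup?",
                "answers": [],
                "is_impossible": True,
            },
        ],
    }
]

# A small pretrained checkpoint; weights are downloaded on first run.
model = QuestionAnsweringModel("bert", "bert-base-cased", use_cuda=False)
model.train_model(train_data)

answers, probabilities = model.predict(
    [{"context": context, "qas": [{"id": "q1", "question": "What do dense retrievers encode?"}]}]
)
print(answers)
```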
