From Text to Knowledge: Leveraging LLMs and RAG for Relationship Extraction in Ontologies and Thesauri
| Authors | |
|---|---|
| Publication date | 2025 |
| Host editors | |
| Book title | Joint Proceedings of Posters, Demos, Workshops, and Tutorials of the 24th International Conference on Knowledge Engineering and Knowledge Management (EKAW-PDWT 2024) |
| Book subtitle | co-located with 24th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2024) : Amsterdam, Netherlands, November 26-28, 2024 |
| Series | CEUR Workshop Proceedings |
| Event | Posters, Demos, Workshops, and Tutorials of the 24th International Conference on Knowledge Engineering and Knowledge Management |
| Number of pages | 16 |
| Publisher | Aachen: CEUR-WS |
| Organisations | |
| Abstract | Ontologies, vocabularies, and thesauri provide a shared conceptualisation for a domain. Manually maintaining and updating such knowledge systems as knowledge changes does not scale for large domains, such as biomedicine. Recently, large language models (LLMs) have been increasingly used as tools in knowledge engineering processes, offering new possibilities for the automatic creation and maintenance of knowledge systems. This work explores how LLMs can be leveraged for the automated extension of such knowledge systems. Specifically, we build on the DRAGON-AI framework, which integrates Retrieval-Augmented Generation (RAG) to give LLMs access to external knowledge sources for more faithful outputs. We investigate the framework's ability to predict relationships between a given knowledge system and a novel concept. We do so for both an ontology and a thesaurus, and analyse the impact of (i) enriching prompts with contextual information and clearer instructions, (ii) an alternative retrieval approach, and (iii) using a conversational model versus an instruction-following model. For all models and approaches, generations for the ontology were of higher quality than those for the thesaurus. The two models show varied performance across the different experiment configurations, with only the conversational model showing notably improved F1 for the ontology under the custom retrieval approach. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://ceur-ws.org/Vol-3967/ELMKE_2024_paper_4.pdf |
| Other links | https://ceur-ws.org/Vol-3967/ |
| Downloads | ELMKE_2024_paper_4 (Final published version) |
| Permalink to this page | |