The CASTLE 2024 Dataset: Advancing the Art of Multimodal Understanding

Open Access
Authors
  • Luca Rossetto
  • Werner Bailer
  • Duc-Tien Dang-Nguyen
  • Graham Healy
  • Björn Þór Jónsson
  • Onanong Kongmeesub
  • Hoang-Bao Le
  • Stevan Rudinac
  • Klaus Schöffmann
  • Florian Spiess
  • Allie Tran
  • Minh-Triet Tran
  • Quang-Linh Tran
  • Cathal Gurrin
Publication date 2025
Book title MM '25
Book subtitle Proceedings of the 33rd ACM International Conference on Multimedia: October 27-31, 2025, Dublin, Ireland
ISBN (electronic)
  • 9798400720352
Event 33rd ACM International Conference on Multimedia
Pages (from-to) 12629–12635
Number of pages 7
Publisher New York, NY: Association for Computing Machinery
Organisations
  • Faculty of Economics and Business (FEB) - Amsterdam Business School Research Institute (ABS-RI)
Abstract
Egocentric video has seen increased interest in recent years, as it is used in a range of areas. However, most existing datasets are limited to a single perspective. In this paper, we present the CASTLE 2024 dataset, a multimodal collection containing ego- and exo-centric (i.e., first- and third-person perspective) video and audio from 15 time-aligned sources, as well as other sensor streams and auxiliary data. The dataset was recorded by volunteer participants over four days in a common location and includes the point of view of 10 participants, with an additional 5 fixed cameras providing an exocentric perspective. The entire dataset contains over 600 hours of UHD video recorded at 50 frames per second. In contrast to other datasets, CASTLE 2024 does not contain any partial censoring, such as blurred faces or distorted audio. The dataset is available via https://castle-dataset.github.io/.
Document type Conference contribution
Language English
Published at https://doi.org/10.1145/3746027.3758199
Other links https://castle-dataset.github.io/