Information theory for representation learning
| Award date | 23-06-2025 |
|---|---|
| Number of pages | 200 |

**Abstract**
This thesis explores the application of information theory to deep representation learning, with the aim of improving the understanding and generalization capabilities of deep learning models. Central to this work is the challenge of identifying and discarding unnecessary information, allowing models to capture the most relevant aspects of the input data without requiring explicit label supervision.

We develop new theoretical frameworks and practical learning objectives that extend classical information-theoretic principles to a range of machine learning scenarios, including self-supervised learning, multi-view learning, and temporal modeling. These methods address key challenges in mutual information estimation, redundancy reduction, and efficient representation extraction, yielding more robust and interpretable representations. In particular, we demonstrate how these approaches improve the handling of complex, high-dimensional data, capture the essential dynamics of time-dependent systems, and describe and mitigate the effects of distribution shift. Through extensive empirical validation, we show that information-theoretic methods can effectively balance compression and prediction, supporting more adaptable and data-efficient machine learning systems. Collectively, this work advances the theoretical understanding and practical application of information theory in deep learning, offering new insights into the nature of effective representations and the challenges of learning from complex data.
| Document type | PhD thesis |
|---|---|
| Language | English |
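The balance between compression and prediction mentioned in the abstract is classically formalized by the information bottleneck objective of Tishby, Pereira, and Bialek; the record does not spell out which objective the thesis uses, so the following is only the standard reference formulation, not necessarily the one developed in the work:

```latex
% Information bottleneck Lagrangian: compress the input X into a
% representation Z while preserving information about the target Y.
\min_{p(z \mid x)} \; \mathcal{L}_{\mathrm{IB}}
  = I(X; Z) \;-\; \beta \, I(Z; Y)
```

Here $I(\cdot;\cdot)$ denotes mutual information and $\beta \ge 0$ controls the trade-off: larger $\beta$ favors predictive power over compression.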