Information theory for representation learning
| Award date | 23-06-2025 |
|---|---|
| Number of pages | 200 |

**Abstract**
This thesis explores the application of information theory to deep representation learning, with the aim of improving the understanding and generalization capabilities of deep learning models. Central to this work is the challenge of identifying and discarding unnecessary information, allowing models to capture the most relevant aspects of the input data without requiring explicit label supervision.

We develop new theoretical frameworks and practical learning objectives that extend classical information-theoretic principles to a range of machine learning scenarios, including self-supervised learning, multi-view learning, and temporal modeling. These methods address key challenges in mutual information estimation, redundancy reduction, and efficient representation extraction, yielding more robust and interpretable representations. In particular, we demonstrate how these approaches improve the handling of complex, high-dimensional data, capture the essential dynamics of time-dependent systems, and describe and mitigate the effects of distribution shift. Through extensive empirical validation, we show that information-theoretic methods can effectively balance compression and prediction, supporting more adaptable and data-efficient machine learning systems. Collectively, this work advances the theoretical understanding and practical application of information theory in deep learning, offering new insights into the nature of effective representations and the challenges of learning from complex data.
| Document type | PhD thesis |
|---|---|
| Language | English |
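The balance between compression and prediction mentioned in the abstract is classically formalized by the information bottleneck objective of Tishby, Pereira, and Bialek; the record does not spell out which objective the thesis uses, so the following is only the standard reference formulation, not necessarily the one developed in the work:

```latex
% Information bottleneck Lagrangian: compress the input X into a
% representation Z while preserving information about the target Y.
\min_{p(z \mid x)} \; \mathcal{L}_{\mathrm{IB}}
  = I(X; Z) \;-\; \beta \, I(Z; Y)
```

Here $I(\cdot;\cdot)$ denotes mutual information and $\beta \ge 0$ controls the trade-off: larger $\beta$ favors predictive power over compression.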