Lyu, Zhaoyan; Aminian, Gholamali; Rodrigues, Miguel RD (2023) On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches. Entropy, 25(7), Article 1063. DOI: 10.3390/e25071063.
Abstract
It is well known that the neural network learning process, along with its connections to fitting, compression, and generalization, is not yet well understood. In this paper, we propose a novel approach to capturing such neural network dynamics using information-bottleneck-type techniques, replacing mutual information measures (which are notoriously difficult to estimate in high-dimensional spaces) with more tractable ones: (1) the minimum mean-squared error (MMSE) associated with the reconstruction of the network input data from some intermediate network representation, and (2) the cross-entropy associated with a certain class label given some network representation. We then conduct an empirical study to ascertain how different network models, network learning algorithms, and datasets affect the learning dynamics. Our experiments show that our proposed approach is more reliable than classical information bottleneck ones in capturing network dynamics during both the training and testing phases. Our experiments also reveal that the fitting and compression phases exist regardless of the choice of activation function. Additionally, our findings suggest that model architectures, training algorithms, and datasets that lead to better generalization tend to exhibit more pronounced fitting and compression phases.
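The two surrogate measures described above can be sketched with simple linear stand-ins. The snippet below is illustrative only: it uses synthetic data, an affine least-squares decoder as a proxy for the MMSE term, and a softmax (linear) probe for the cross-entropy term. All names and the data-generating setup are assumptions for the sketch, not the estimators actually used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: inputs X, an "intermediate representation" T (here an
# invertible linear map of X), and class labels Y in {0, 1, 2}.
n, d_x, d_t, n_classes = 500, 4, 6, 3
X = rng.normal(size=(n, d_x))
A = rng.normal(size=(d_x, d_t))
T = X @ A                                       # plays the role of a hidden layer
Y = (X[:, 0] > 0).astype(int) + (X[:, 1] > 0).astype(int)

def linear_mmse(T, X):
    """MSE of the best affine reconstruction of X from T.
    This upper-bounds the true MMSE, the paper's tractable stand-in for I(X;T)."""
    T1 = np.hstack([T, np.ones((len(T), 1))])   # append bias column
    W, *_ = np.linalg.lstsq(T1, X, rcond=None)
    return float(np.mean((X - T1 @ W) ** 2))

def probe_cross_entropy(T, Y, n_classes, steps=800, lr=0.05):
    """Cross-entropy of a softmax probe predicting Y from T,
    a tractable stand-in for the label-relevance term I(Y;T)."""
    T1 = np.hstack([T, np.ones((len(T), 1))])
    W = np.zeros((T1.shape[1], n_classes))
    onehot = np.eye(n_classes)[Y]
    for _ in range(steps):                      # plain gradient descent (convex loss)
        logits = T1 @ W
        logits -= logits.max(axis=1, keepdims=True)
        P = np.exp(logits); P /= P.sum(axis=1, keepdims=True)
        W -= lr * T1.T @ (P - onehot) / len(T)
    logits = T1 @ W
    logits -= logits.max(axis=1, keepdims=True)
    P = np.exp(logits); P /= P.sum(axis=1, keepdims=True)
    return float(-np.mean(np.log(P[np.arange(len(Y)), Y] + 1e-12)))

mmse = linear_mmse(T, X)        # near 0 here: T is an invertible map of X
ce = probe_cross_entropy(T, Y, n_classes)
```

In the paper's setting these two quantities would be tracked for each layer across training epochs, so that a falling reconstruction error signals fitting and a rising one signals compression of input information.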
| Type | Article |
| --- | --- |
| Title | On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches |
| Open access status | An open access version is available from UCL Discovery |
| DOI | 10.3390/e25071063 |
| Publisher version | https://doi.org/10.3390/e25071063 |
| Language | English |
| Additional information | © 2023 by the Authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
| Keywords | deep learning; information theory; information bottleneck; generalization; fitting; compression |
| UCL classification | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng |
| URI | https://discovery-pp.ucl.ac.uk/id/eprint/10173945 |