UCL Discovery Stage

On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches

Lyu, Zhaoyan; Aminian, Gholamali; Rodrigues, Miguel RD (2023) On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches. Entropy, 25 (7), Article 1063. 10.3390/e25071063. Green open access

entropy-25-01063.pdf - Published Version

Abstract

The learning process of neural networks, along with its connections to fitting, compression, and generalization, is not yet well understood. In this paper, we propose a novel approach to capturing such neural network dynamics using information-bottleneck-type techniques, which replace mutual information measures (notoriously difficult to estimate in high-dimensional spaces) with more tractable ones, including (1) the minimum mean-squared error associated with the reconstruction of the network input data from some intermediate network representation and (2) the cross-entropy associated with a certain class label given some network representation. We then conduct an empirical study to ascertain how different network models, network learning algorithms, and datasets may affect the learning dynamics. Our experiments show that the proposed approach appears to be more reliable than classical information bottleneck approaches in capturing network dynamics during both the training and testing phases. Our experiments also reveal that the fitting and compression phases exist regardless of the choice of activation function. Additionally, our findings suggest that model architectures, training algorithms, and datasets that lead to better generalization tend to exhibit more pronounced fitting and compression phases.
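
To make the two quantities named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' estimators (the abstract does not specify them). It assumes an affine least-squares reconstruction as a stand-in for the minimum mean-squared error, which in general can only over-estimate the true MMSE, and it computes cross-entropy from already-available class probabilities. The function names `linear_mmse_proxy` and `cross_entropy`, and the synthetic data, are hypothetical.

```python
import numpy as np


def linear_mmse_proxy(x, t):
    """Mean squared error of the best affine reconstruction of x from t.

    Assumption: an affine estimator stands in for the optimal (possibly
    nonlinear) one, so this is only a crude proxy for the true MMSE.
    """
    # Append a column of ones so the least-squares fit includes a bias term.
    t_aug = np.hstack([t, np.ones((t.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(t_aug, x, rcond=None)
    residual = x - t_aug @ coef
    # Squared reconstruction error summed over input dimensions, averaged over samples.
    return float(np.mean(np.sum(residual ** 2, axis=1)))


def cross_entropy(probs, labels):
    """Average negative log-probability (in nats) assigned to the true labels."""
    picked = probs[np.arange(labels.shape[0]), labels]
    return float(-np.mean(np.log(np.clip(picked, 1e-12, 1.0))))


# Synthetic stand-ins for network inputs and intermediate representations.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 8))      # hypothetical input data
w = rng.normal(size=(8, 8))        # hypothetical layer weights
t_full = x @ w                     # representation that keeps all 8 dimensions
t_compressed = t_full[:, :2]       # representation that keeps only 2 dimensions

mmse_full = linear_mmse_proxy(x, t_full)
mmse_compressed = linear_mmse_proxy(x, t_compressed)

# Cross-entropy of an uninformative 4-class prediction equals log(4) nats.
uniform_probs = np.full((10, 4), 0.25)
labels = np.zeros(10, dtype=int)
ce_uniform = cross_entropy(uniform_probs, labels)
```

In this toy setting the representation that discards input dimensions yields a larger reconstruction error than the one that keeps them all, which is the sense in which a reconstruction error can serve as a tractable indicator of compression.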

Type: Article
Title: On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches
Open access status: An open access version is available from UCL Discovery
DOI: 10.3390/e25071063
Publisher version: https://doi.org/10.3390/e25071063
Language: English
Additional information: © 2023 by the Authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Keywords: deep learning; information theory; information bottleneck; generalization; fitting; compression
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10173945