Aminian, Gholamali;
Bu, Yuheng;
Toni, Laura;
Rodrigues, Miguel RD;
Wornell, Gregory W;
(2023)
Information-Theoretic Characterizations of Generalization Error for the Gibbs Algorithm.
IEEE Transactions on Information Theory
10.1109/tit.2023.3329617.
(In press).
Preview |
Text
Final-Information-Theoretic_Characterizations_of_Generalization_Error_for_the_Gibbs_Algorithm copy.pdf - Accepted Version Download (684kB) | Preview |
Abstract
Various approaches have been developed to upper bound the generalization error of a supervised learning algorithm. However, existing bounds are often loose and even vacuous when evaluated in practice. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contributions are exact characterizations of the expected generalization error of the well-known Gibbs algorithm (a.k.a. Gibbs posterior) using different information measures, in particular, the symmetrized KL information between the input training samples and the output hypothesis. Our result can be applied to tighten existing expected generalization errors and PAC-Bayesian bounds. Our information-theoretic approach is versatile, as it also characterizes the generalization error of the Gibbs algorithm with a data-dependent regularizer and that of the Gibbs algorithm in the asymptotic regime, where it converges to the standard empirical risk minimization algorithm. Of particular relevance, our results highlight the role the symmetrized KL information plays in controlling the generalization error of the Gibbs algorithm.
Type: | Article |
---|---|
Title: | Information-Theoretic Characterizations of Generalization Error for the Gibbs Algorithm |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1109/tit.2023.3329617 |
Publisher version: | https://doi.org/10.1109/TIT.2023.3329617 |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
Keywords: | Empirical risk minimization, generalization error, Gibbs algorithm, PAC-Bayesian learning, symmetrized KL information. |
UCL classification: | UCL UCL > Provost and Vice Provost Offices > UCL BEAMS UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10180593 |
Archive Staff Only
![]() |
View Item |