Yap, Pau Ching (2023). Approximate Bayesian Methods for Sequential Few-Shot Problems. Doctoral thesis (Ph.D), UCL (University College London).
Text: PhD_Thesis_final.pdf - Accepted Version (18MB)
Abstract
Neural networks are known to suffer from catastrophic forgetting when trained on sequential datasets. While there have been numerous attempts to solve this problem in large-scale supervised classification, little has been done to overcome catastrophic forgetting in few-shot classification problems. We demonstrate that the popular gradient-based model-agnostic meta-learning (MAML) algorithm does suffer from catastrophic forgetting. In this thesis, we introduce the Bayesian online meta-learning framework to tackle catastrophic forgetting in sequential few-shot classification problems. Our framework combines Bayesian online learning and meta-learning with the Laplace approximation and variational inference to achieve this goal. The experimental evaluations demonstrate that our framework attains this objective effectively in comparison with various baselines. As an additional utility, we demonstrate empirically that our framework is capable of meta-learning on sequentially arriving few-shot tasks from a stationary task distribution. The Laplace approximation requires computing a Hessian for the precision matrix of its Gaussian. We extend the Kronecker-factored Hessian approximation method from the large-scale classification setting to the gradient-based meta-learning setting. The experiments illustrate the importance of this extension for a principled framework when dealing with a long sequence of few-shot problems. The final part of this thesis enhances the Bayesian online meta-learning framework for automation and flexibility in handling a wider range of sequential few-shot problems. We use long short-term memory networks (LSTMs) to automate the quick-adaptation step of meta-learning. The enhancement involves separating the neural network structure of a model, allowing the framework to cope with different types of few-shot problems. We also incorporate a generative classifier into the enhancement to act as a pointer that informs the model about the few-shot problems it encounters.
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Approximate Bayesian Methods for Sequential Few-Shot Problems |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Copyright © The Author 2023. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10183563 |