Liu, Ziwen; Grau-Bove, Josep; Orr, Scott; (2022) BERT-Flow-VAE: A Weakly-supervised Model for Multi-Label Text Classification. In: (Proceedings) COLING2022: The 29th International Conference on Computational Linguistics. Gyeongju, Republic of Korea.
Abstract
Multi-label Text Classification (MLTC) is the task of categorizing documents into one or more topics. Given the large volumes of data and varying domains of such tasks, fully-supervised learning requires fully manually annotated datasets, which are costly and time-consuming to produce. In this paper, we propose BERT-Flow-VAE (BFV), a Weakly-Supervised Multi-Label Text Classification (WSMLTC) model that reduces the need for full supervision. This new model: (1) produces BERT sentence embeddings and calibrates them using a flow model; (2) generates an initial topic-document matrix by averaging the results of a seeded sparse topic model and a textual entailment model, which require only the surface names of topics and 4-6 seed words per topic; (3) adopts a VAE framework to reconstruct the embeddings under the guidance of the topic-document matrix; and (4) uses the means produced by the encoder model in the VAE architecture as predictions for MLTC. Experimental results on 6 multi-label datasets show that BFV can substantially outperform other baseline WSMLTC models in key metrics and achieve approximately 84% of the performance of a fully-supervised model.
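
Step (2) of the abstract can be illustrated with a minimal sketch, assuming both weak signals have already been computed as document-by-topic score matrices on a comparable scale. The function name `initial_topic_document_matrix` and the toy scores are hypothetical and are not taken from the paper; only the element-wise averaging reflects what the abstract describes.

```python
import numpy as np

def initial_topic_document_matrix(seeded_scores, entailment_scores):
    """Average two (n_docs, n_topics) score matrices element-wise.

    Hypothetical helper: the abstract states only that the two models'
    results are averaged, not how each score matrix is produced or scaled.
    """
    seeded = np.asarray(seeded_scores, dtype=float)
    entail = np.asarray(entailment_scores, dtype=float)
    if seeded.shape != entail.shape:
        raise ValueError("score matrices must have the same shape")
    return (seeded + entail) / 2.0

# Made-up scores for 2 documents and 3 topics (illustration only).
seeded = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.6]]   # e.g. from a seeded topic model
entail = [[0.7, 0.3, 0.2], [0.0, 0.6, 0.8]]   # e.g. from an entailment model
prior = initial_topic_document_matrix(seeded, entail)
print(prior)
```

In the model described above, a matrix of this kind guides the VAE's reconstruction of the embeddings; it is not itself the final prediction, which comes from the encoder means.
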
Type: | Proceedings paper |
---|---|
Title: | BERT-Flow-VAE: A Weakly-supervised Model for Multi-Label Text Classification |
Event: | COLING2022: The 29th International Conference on Computational Linguistics |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | https://coling2022.org/ |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of the Built Environment; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of the Built Environment > Bartlett School Env, Energy and Resources; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10156145 |