Yu, Jialin; Cristea, Alexandra I; Harit, Anoushka; Sun, Zhongtian; Aduragba, Olanrewaju Tahir; Shi, Lei; Moubayed, Noura Al (2023) Language as a latent sequence: Deep latent variable models for semi-supervised paraphrase generation. AI Open, 4, pp. 19-32. DOI: 10.1016/j.aiopen.2023.05.001.
Text: 1-s2.0-S2666651023000025-main.pdf (Published Version, 997kB)
Abstract
This paper explores deep latent variable models for semi-supervised paraphrase generation, where the missing target pair for unlabelled data is modelled as a latent paraphrase sequence. We present a novel unsupervised model named variational sequence auto-encoding reconstruction (VSAR), which performs latent sequence inference given an observed text. To leverage information from text pairs, we additionally introduce a novel supervised model we call dual directional learning (DDL), which is designed to integrate with our proposed VSAR model. Combining VSAR with DDL (DDL+VSAR) enables us to conduct semi-supervised learning. Still, the combined model suffers from a cold-start problem. To further combat this issue, we propose an improved weight initialisation solution, leading to a novel two-stage training scheme we call knowledge-reinforced-learning (KRL). Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data. Furthermore, in scenarios where only a fraction of the labelled pairs are available, our combined model consistently outperforms the strong supervised model baseline (DDL) by a significant margin (Wilcoxon test). Our code is publicly available at https://github.com/jialin-yu/latent-sequence-paraphrase.
Type: | Article |
---|---|
Title: | Language as a latent sequence: Deep latent variable models for semi-supervised paraphrase generation |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1016/j.aiopen.2023.05.001 |
Publisher version: | https://doi.org/10.1016/j.aiopen.2023.05.001 |
Language: | English |
Additional information: | © 2023 The Authors. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co. Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
Keywords: | Deep latent variable models, Paraphrase generation, Semi-supervised learning, Natural language processing, Deep learning |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10172090 |