UCL Discovery Stage

Improving Language Plasticity via Pretraining with Active Forgetting

Chen, Y; Marchisio, K; Raileanu, R; Adelani, DI; Stenetorp, P; Riedel, S; Artetxe, M; (2023) Improving Language Plasticity via Pretraining with Active Forgetting. In: Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023). NeurIPS. Green open access

9064_improving_language_plasticity_.pdf - Published Version

Abstract

Pretrained language models (PLMs) are today the primary model for natural language processing. Despite their impressive downstream performance, it can be difficult to apply PLMs to new languages, a barrier to making their capabilities universally accessible. While prior work has shown it possible to address this issue by learning a new embedding layer for the new language, doing so is both data- and compute-inefficient. We propose to use an active forgetting mechanism during pretraining as a simple way of creating PLMs that can quickly adapt to new languages. Concretely, by resetting the embedding layer every K updates during pretraining, we encourage the PLM to improve its ability to learn new embeddings within a limited number of updates, similar to a meta-learning effect. Experiments with RoBERTa show that models pretrained with our forgetting mechanism not only demonstrate faster convergence during language adaptation, but also outperform standard ones in a low-data regime, particularly for languages that are distant from English. Code will be available at https://github.com/facebookresearch/language-model-plasticity.
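
The reset schedule described in the abstract (re-initialise the embedding layer every K updates) can be illustrated with a toy loop. This is only a minimal sketch and not the authors' implementation: the function names, the Gaussian initialisation with scale 0.02, and the `update_fn` callback standing in for an optimiser step are all assumptions made for illustration. How the released code treats optimiser state or the remaining layers at a reset is not stated here and is not modelled.

```python
import random


def init_embeddings(vocab_size, dim, rng, scale=0.02):
    """Return a fresh vocab_size x dim table of small Gaussian values.

    The Gaussian init and the 0.02 scale are assumptions for this sketch.
    """
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(vocab_size)]


def pretrain_with_forgetting(num_updates, k, vocab_size=8, dim=4, seed=0, update_fn=None):
    """Toy pretraining loop that resets the embedding table every k updates.

    update_fn(embeddings, step) is a hypothetical stand-in for one optimiser
    step; it may modify the table in place. The rest of the model (the
    "body") is not represented here and would be left untouched by a reset.
    """
    rng = random.Random(seed)
    embeddings = init_embeddings(vocab_size, dim, rng)
    reset_steps = []
    for step in range(1, num_updates + 1):
        if update_fn is not None:
            update_fn(embeddings, step)
        if step % k == 0:
            # Active forgetting: discard the learned embeddings and start over.
            embeddings = init_embeddings(vocab_size, dim, rng)
            reset_steps.append(step)
    return embeddings, reset_steps
```

Calling `pretrain_with_forgetting(10, 3)` in this sketch resets the table after updates 3, 6, and 9, so whatever the embeddings had learned before each of those points is discarded while the loop carries on.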

Type: Proceedings paper
Title: Improving Language Plasticity via Pretraining with Active Forgetting
Open access status: An open access version is available from UCL Discovery
Publisher version: https://proceedings.neurips.cc/paper_files/paper/2...
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10193856