Chen, Y; Marchisio, K; Raileanu, R; Adelani, DI; Stenetorp, P; Riedel, S; Artetxe, M; (2023) Improving Language Plasticity via Pretraining with Active Forgetting. In: Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023). NeurIPS
Abstract
Pretrained language models (PLMs) are today the primary model for natural language processing. Despite their impressive downstream performance, it can be difficult to apply PLMs to new languages, a barrier to making their capabilities universally accessible. While prior work has shown it possible to address this issue by learning a new embedding layer for the new language, doing so is both data and compute inefficient. We propose to use an active forgetting mechanism during pretraining, as a simple way of creating PLMs that can quickly adapt to new languages. Concretely, by resetting the embedding layer every K updates during pretraining, we encourage the PLM to improve its ability to learn new embeddings within a limited number of updates, similar to a meta-learning effect. Experiments with RoBERTa show that models pretrained with our forgetting mechanism not only demonstrate faster convergence during language adaptation, but also outperform standard ones in a low-data regime, particularly for languages that are distant from English. Code will be available at https://github.com/facebookresearch/language-model-plasticity.
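The reset schedule described in the abstract (re-initialising the embedding layer every K updates while the rest of the model keeps training) can be illustrated with a toy loop. This is a minimal sketch and not the authors' implementation: the function names, the Gaussian initialisation scale, the constant placeholder "update", and the choice to reset exactly when the step count is a multiple of K are all assumptions made for illustration.

```python
import random


def init_embeddings(vocab_size, dim, rng, scale=0.02):
    """Return a freshly initialised embedding table (hypothetical init scheme)."""
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(vocab_size)]


def pretrain_with_active_forgetting(num_updates, k, vocab_size=8, dim=4, seed=0):
    """Toy pretraining loop that resets the embedding table every k updates.

    `body` stands in for all non-embedding parameters; it is never reset.
    """
    rng = random.Random(seed)
    embeddings = init_embeddings(vocab_size, dim, rng)
    body = [0.0] * dim
    reset_steps = []

    for step in range(1, num_updates + 1):
        # Placeholder update: a real run would apply gradients of the
        # pretraining loss to both the embeddings and the body.
        for row in embeddings:
            for j in range(dim):
                row[j] += 0.01
        for j in range(dim):
            body[j] += 0.01

        # Active forgetting (assumed schedule): discard the learned
        # embeddings every k updates and start them again from scratch.
        if step % k == 0:
            embeddings = init_embeddings(vocab_size, dim, rng)
            reset_steps.append(step)

    return embeddings, body, reset_steps


if __name__ == "__main__":
    _, body, resets = pretrain_with_active_forgetting(num_updates=10, k=4)
    print("embedding resets at steps:", resets)
    print("body parameters (never reset):", body)
```

In this sketch only the embedding table is periodically discarded, so the body parameters accumulate all updates while the embeddings never train for more than K consecutive steps.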
Type: | Proceedings paper |
---|---|
Title: | Improving Language Plasticity via Pretraining with Active Forgetting |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | https://proceedings.neurips.cc/paper_files/paper/2... |
Language: | English |
Additional information: | This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10193856 |