REDUCR: Robust Data Downsampling Using Class Priority Reweighting

Bankes, William; Hughes, George; Bogunovic, Ilija; Wang, Zi (2024) REDUCR: Robust Data Downsampling Using Class Priority Reweighting. In: Proceedings of the 12th International Conference on Learning Representations. (pp. 1-23). ICLR (International Conference on Learning Representations) (In press). Green open access

Abstract

Modern machine learning models are becoming increasingly expensive to train for real-world image and text classification tasks, where massive web-scale data is collected in a streaming fashion. To reduce the training cost, online batch selection techniques have been developed to choose the most informative datapoints. However, these techniques can suffer from poor worst-class generalization performance due to class imbalance and distributional shifts. This work introduces REDUCR, a robust and efficient data downsampling method that uses class priority reweighting. REDUCR reduces the training data while preserving worst-class generalization performance. REDUCR assigns priority weights to datapoints in a class-aware manner using an online learning algorithm. We demonstrate the data efficiency and robust performance of REDUCR on vision and text classification tasks. On web-scraped datasets with imbalanced class distributions, REDUCR significantly improves worst-class test accuracy (and average accuracy), surpassing state-of-the-art methods by around 15%.
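
The abstract does not specify the online learning algorithm or the selection rule, so the following is only an illustrative sketch of what class-aware priority reweighting could look like, not the authors' implementation. It assumes a multiplicative-weights (Hedge-style) update over classes and a simple "class weight times loss" score; the function names, the step size `eta`, and the toy losses are all hypothetical.

```python
import math

# Illustrative sketch only. The update rule and scoring function below are
# assumptions; the abstract says only that priority weights are assigned in a
# class-aware manner by an online learning algorithm.

def mean_loss_per_class(losses, labels, num_classes):
    """Average the per-example losses within each class (0.0 for absent classes)."""
    sums = [0.0] * num_classes
    counts = [0] * num_classes
    for loss, y in zip(losses, labels):
        sums[y] += loss
        counts[y] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def update_class_weights(weights, class_losses, eta=1.0):
    """Hedge-style step (assumed): classes with higher loss gain priority."""
    scaled = [w * math.exp(eta * l) for w, l in zip(weights, class_losses)]
    total = sum(scaled)
    return [s / total for s in scaled]  # renormalise onto the simplex

def select_batch(losses, labels, weights, k):
    """Keep the k candidates with the largest class-weighted loss (assumed score)."""
    scores = [weights[y] * loss for loss, y in zip(losses, labels)]
    ranked = sorted(range(len(losses)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

# Toy candidate batch: class 1 is rarer and currently has much higher loss.
losses = [0.2, 0.3, 2.0, 0.25, 1.8, 0.1]
labels = [0, 0, 1, 0, 1, 0]
weights = [0.5, 0.5]  # start from uniform class priorities

class_losses = mean_loss_per_class(losses, labels, num_classes=2)
weights = update_class_weights(weights, class_losses)
selected = select_batch(losses, labels, weights, k=2)
```

In this toy run the priority weight shifts toward the poorly performing class, so its datapoints are the ones retained for training. A real system would obtain the losses from the model being trained and repeat the update for every incoming batch.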

Type: Proceedings paper
Title: REDUCR: Robust Data Downsampling Using Class Priority Reweighting
Event: 12th International Conference on Learning Representations
Open access status: An open access version is available from UCL Discovery
Publisher version: https://openreview.net/pdf?id=nKYTiJhhAu
Language: English
Additional information: © The Author 2024. Original content in this paper is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/).
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Electronic and Electrical Eng
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10202582