UCL Discovery Stage

Understanding Unimodal Bias in Multimodal Deep Linear Networks

Zhang, Yedi; Latham, Peter E; Saxe, Andrew M; (2024) Understanding Unimodal Bias in Multimodal Deep Linear Networks. In: Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix (eds.) Proceedings of the 41st International Conference on Machine Learning. (pp. 59100-59125). Proceedings of Machine Learning Research (PMLR): Vienna, Austria. Green open access

Text: zhang24aa.pdf - Published Version (1MB)

Abstract

Using multiple input streams simultaneously to train multimodal neural networks is intuitively advantageous but practically challenging. A key challenge is unimodal bias, where a network overly relies on one modality and ignores others during joint training. We develop a theory of unimodal bias with multimodal deep linear networks to understand how architecture and data statistics influence this bias. This is the first work to calculate the duration of the unimodal phase in learning as a function of the depth at which modalities are fused within the network, dataset statistics, and initialization. We show that the deeper the layer at which fusion occurs, the longer the unimodal phase. A long unimodal phase can lead to a generalization deficit and permanent unimodal bias in the overparametrized regime. Our results, derived for multimodal linear networks, extend to nonlinear networks in certain settings. Taken together, this work illuminates pathologies of multimodal learning under joint training, showing that late and intermediate fusion architectures can give rise to long unimodal phases and permanent unimodal bias. Our code is available at: https://yedizhang.github.io/unimodal-bias.html.
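
The abstract's central claim, that fusing modalities deeper in the network lengthens the unimodal phase, can be illustrated with a toy simulation. The sketch below is not the authors' code (that is at the URL above). It runs gradient descent on the population squared loss of two small two-layer linear networks, one fusing the modalities at the first layer ("early") and one only at the output ("late"). The input covariance, the target map, the initialization scale, the hidden width, and the 0.1 threshold used to mark when a modality starts being learned are all illustrative assumptions rather than values from the paper.

```python
import numpy as np


def first_crossing(traj, thresh):
    """Index of the first step at which traj exceeds thresh, or -1 if never."""
    hits = np.nonzero(traj > thresh)[0]
    return int(hits[0]) if hits.size else -1


def train(fusion, steps=2000, lr=0.01, init=1e-3, hidden=4, seed=0):
    """Gradient descent on the population loss 0.5 (w - w*)^T Sigma (w - w*).

    Returns the trajectory of the effective input-output weights,
    one row per step, columns = (modality A, modality B).
    """
    # Assumed statistics: uncorrelated scalar modalities, A has larger variance.
    sigma = np.diag([4.0, 1.0])
    w_star = np.array([1.0, 1.0])  # assumed target: y = x_A + x_B
    rng = np.random.default_rng(seed)

    if fusion == "late":
        # Separate pathway per modality; pathways only meet at the output.
        first = np.full(2, init)
        second = np.full(2, init)
    else:
        # Early fusion: both modalities feed one shared hidden layer.
        W1 = init * rng.standard_normal((hidden, 2))
        w2 = init * rng.standard_normal(hidden)

    history = np.empty((steps, 2))
    for t in range(steps):
        w = first * second if fusion == "late" else W1.T @ w2
        history[t] = w
        g = sigma @ (w - w_star)  # gradient w.r.t. the effective weights
        if fusion == "late":
            first, second = first - lr * g * second, second - lr * g * first
        else:
            W1, w2 = W1 - lr * np.outer(w2, g), w2 - lr * (W1 @ g)
    return history


def unimodal_gap(fusion, thresh=0.1):
    """Steps at which each modality's weight first exceeds thresh, and their gap."""
    hist = train(fusion)
    t_a = first_crossing(hist[:, 0], thresh)
    t_b = first_crossing(hist[:, 1], thresh)
    return t_a, t_b, t_b - t_a


if __name__ == "__main__":
    for fusion in ("early", "late"):
        t_a, t_b, gap = unimodal_gap(fusion)
        print(f"{fusion:>5} fusion: A at step {t_a}, B at step {t_b}, gap {gap}")
```

Under these assumptions, the gap between the two crossing times serves as a rough proxy for the unimodal phase: in the late-fusion network each pathway escapes its small initialization at a rate set by its own modality's variance, whereas the shared hidden layer in the early-fusion network pulls both modalities away from zero together. The paper's analysis covers correlated inputs, arbitrary fusion depth, and the overparametrized regime, none of which this sketch attempts.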

Type: Proceedings paper
Title: Understanding Unimodal Bias in Multimodal Deep Linear Networks
Event: 41st International Conference on Machine Learning
Location: Vienna, Austria
Dates: 21 Jul 2024 - 27 Jul 2024
Open access status: An open access version is available from UCL Discovery
Publisher version: https://proceedings.mlr.press/v235/zhang24aa.html
Language: English
Additional information: This is an Open Access paper published under a Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/).
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Gatsby Computational Neurosci Unit
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10195199
