UCL Discovery Stage

A Complete Characterisation of Structured Missingness

Jackson, James; Mitra, Robin; Hagenbuch, Niels; Mcgough, Sarah; Harbron, Chris; (2023) A Complete Characterisation of Structured Missingness. ArXiv: Ithaca, NY, USA. Green open access

SMcharacterisationpaper.pdf - Submitted Version

Abstract

Our capacity to process large complex data sources is ever-increasing, providing us with new, important applied research questions to address, such as how to handle missing values in large-scale databases. Mitra et al. (2023) noted the phenomenon of Structured Missingness (SM), which is where missingness has an underlying structure. Existing taxonomies for defining missingness mechanisms typically assume that variables' missingness indicator vectors M_1, M_2, ..., M_p are independent after conditioning on the relevant portion of the data matrix X. As this is often unsuitable for characterising SM in multivariate settings, we introduce a taxonomy for SM, where each M_j can depend on M_{-j} (i.e., all missingness indicator vectors except M_j), in addition to X. We embed this new framework within the well-established decomposition of mechanisms into MCAR, MAR, and MNAR (Rubin, 1976), allowing us to recast mechanisms into a broader setting, where we can consider the combined effect of X and M_{-j} on M_j. We also demonstrate, via simulations, the impact of SM on inference and prediction, and consider contextual instances of SM arising in a de-identified nationwide (US-based) clinico-genomic database (CGDB). We hope to stimulate interest in SM, and encourage timely research into this phenomenon.
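
The central idea of the abstract, that one variable's missingness indicator can depend on another variable's missingness indicator, can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the function name, the two-variable setup, and the probabilities 0.3, 0.8 and 0.1 are all assumptions chosen purely for illustration. It draws the first indicator completely at random and then makes the second indicator's probability depend on the first, so the two indicators are dependent even though neither depends on X.

```python
import numpy as np


def simulate_structured_missingness(n=10_000, seed=0):
    """Hypothetical two-variable example where M2 depends on M1 but not on X."""
    rng = np.random.default_rng(seed)

    # Data matrix X with two columns; it plays no role in the missingness here.
    X = rng.normal(size=(n, 2))

    # M1: missing completely at random with (assumed) probability 0.3.
    M1 = rng.random(n) < 0.3

    # M2: probability of being missing depends on M1 (assumed 0.8 vs 0.1).
    # This dependence between indicators is the "structure" being illustrated.
    p_m2 = np.where(M1, 0.8, 0.1)
    M2 = rng.random(n) < p_m2

    return X, M1, M2


X, M1, M2 = simulate_structured_missingness()

# Empirical missingness rate of variable 2, split by whether variable 1 is missing.
rate_given_m1 = M2[M1].mean()
rate_given_not_m1 = M2[~M1].mean()
print(f"P(M2=1 | M1=1) ~ {rate_given_m1:.2f}")
print(f"P(M2=1 | M1=0) ~ {rate_given_not_m1:.2f}")
```

Under a taxonomy that assumes the indicators are independent given X, these two conditional rates would be expected to coincide; here they differ by construction, which is the kind of dependence the abstract describes.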

Type: Working / discussion paper
Title: A Complete Characterisation of Structured Missingness
Open access status: An open access version is available from UCL Discovery
Publisher version: https://doi.org/10.48550/arXiv.2307.02650
Language: English
Additional information: This is an Open Access article published under a Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/).
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10201614