Jackson, J;
Mitra, R;
Francis, B;
Dove, I;
(2022)
On Integrating the Number of Synthetic Data Sets m into the a priori Synthesis Approach.
In:
Privacy in Statistical Databases. PSD 2022.
(pp. 205-219).
Springer International Publishing: Cham, Switzerland.
Abstract
The synthesis mechanism given in [4] uses saturated models, along with overdispersed count distributions, to generate synthetic categorical data. The mechanism is controlled by tuning parameters, which can be set according to a specific risk or utility metric. Thus, expected properties of synthetic data sets can be determined analytically a priori, that is, before they are generated. While [4] considered the case of generating a single data set (m=1), this paper considers generating multiple data sets (m>1). In effect, m becomes a tuning parameter, and the role of m in the risk-utility trade-off can be shown analytically. The paper introduces a pair of risk metrics, τ3(k,d) and τ4(k,d), that are suited to m>1 data sets. It also considers the more general issue of how best to analyse m>1 categorical data sets: average the data sets pre-analysis, or average the results post-analysis. Finally, the methods are demonstrated empirically through the synthesis of a constructed data set that is used to represent the English School Census.
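The two analysis options named in the abstract (averaging the m data sets before analysis, or averaging the m results after analysis) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the three small count tables, and the choice of statistic (the proportion of records in the first cell) are all hypothetical, chosen only to show that the two routes can disagree when the statistic is not linear in the cell counts.

```python
from statistics import mean


def average_pre_analysis(tables, analyse):
    """Average the m tables cell by cell, then run the analysis once."""
    averaged = [mean(cells) for cells in zip(*tables)]
    return analyse(averaged)


def average_post_analysis(tables, analyse):
    """Run the analysis on each of the m tables, then average the results."""
    return mean(analyse(table) for table in tables)


def first_cell_proportion(counts):
    """Hypothetical analysis: share of all records that fall in cell 0."""
    return counts[0] / sum(counts)


# Hypothetical m = 3 synthetic tables over the same three cells.
# The totals differ across tables, as they can when cell counts
# are drawn independently from a count distribution.
synthetic_tables = [
    [4, 0, 6],
    [2, 3, 15],
    [0, 1, 9],
]

pre = average_pre_analysis(synthetic_tables, first_cell_proportion)
post = average_post_analysis(synthetic_tables, first_cell_proportion)
print(f"pre-analysis averaging:  {pre:.4f}")
print(f"post-analysis averaging: {post:.4f}")
```

With these illustrative counts the two estimates differ (0.15 against roughly 0.167), because a proportion is a ratio of counts. For a statistic that is linear in the counts, such as a single cell count, the two routes would coincide.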
Type: | Proceedings paper |
---|---|
Title: | On Integrating the Number of Synthetic Data Sets m into the a priori Synthesis Approach |
Event: | International Conference on Privacy in Statistical Databases - PSD 2022 |
ISBN-13: | 9783031139444 |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1007/978-3-031-13945-1_15 |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
Keywords: | Synthetic data, privacy, categorical data, risk metrics, contingency tables |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10159958 |