UCL Discovery Stage

You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information

Perez, B; Musolesi, M; Stringhini, G; (2018) You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information. In: Proceedings of the Twelfth International AAAI Conference on Web and Social Media. (pp. 241-250). The AAAI Press: Palo Alto, CA, USA. Green open access

perez_icwsm.pdf - Accepted Version
Available under License: See the attached licence file.


Abstract

Metadata are associated with most of the information we produce in our daily interactions and communication in the digital world. Yet, surprisingly, metadata are often still categorized as non-sensitive. Indeed, in the past, researchers and practitioners have mainly focused on the problem of the identification of a user from the content of a message. In this paper, we use Twitter as a case study to quantify the uniqueness of the association between metadata and user identity and to understand the effectiveness of potential obfuscation strategies. More specifically, we analyze atomic fields in the metadata and systematically combine them in an effort to classify new tweets as belonging to an account using different machine learning algorithms of increasing complexity. We demonstrate that, through the application of a supervised learning algorithm, we are able to identify any user in a group of 10,000 with approximately 96.7% accuracy. Moreover, if we broaden the scope of our search and consider the 10 most likely candidates we increase the accuracy of the model to 99.22%. We also found that data obfuscation is hard and ineffective for this type of data: even after perturbing 60% of the training data, it is still possible to classify users with an accuracy higher than 95%. These results have strong implications in terms of the design of metadata obfuscation strategies, for example for data set release, not only for Twitter, but, more generally, for most social media platforms.
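
The abstract reports two kinds of accuracy: the share of tweets assigned to the correct account, and the share for which the correct account is among the 10 most likely candidates. The sketch below illustrates how such a top-k accuracy could be computed from per-candidate classifier scores. It is not the authors' code: the function name `top_k_accuracy`, the score dictionaries, and the three toy account names are all assumptions made for illustration, and the numbers are invented rather than taken from the paper.

```python
from typing import Dict, Sequence


def top_k_accuracy(
    scores: Sequence[Dict[str, float]],
    true_users: Sequence[str],
    k: int,
) -> float:
    """Fraction of samples whose true user is among the k highest-scored candidates.

    `scores[i]` maps each candidate account to a (hypothetical) classifier
    score for sample i; `true_users[i]` is the account that produced it.
    """
    if len(scores) != len(true_users):
        raise ValueError("scores and true_users must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    if k < 1:
        raise ValueError("k must be at least 1")

    hits = 0
    for candidate_scores, truth in zip(scores, true_users):
        # Rank candidate accounts from most to least likely.
        ranked = sorted(candidate_scores, key=candidate_scores.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(scores)


# Toy data (invented): four samples scored against three candidate accounts.
TOY_SCORES = [
    {"alice": 0.70, "bob": 0.20, "carol": 0.10},
    {"alice": 0.50, "bob": 0.30, "carol": 0.20},
    {"alice": 0.60, "bob": 0.30, "carol": 0.10},
    {"alice": 0.15, "bob": 0.80, "carol": 0.05},
]
TOY_TRUTH = ["alice", "bob", "carol", "bob"]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"top-{k} accuracy: {top_k_accuracy(TOY_SCORES, TOY_TRUTH, k):.2f}")
```

With k = 1 this reduces to ordinary classification accuracy. Increasing k can only keep the value the same or raise it, which is consistent with the abstract's top-10 figure being higher than its single-candidate figure.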

Type: Proceedings paper
Title: You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information
Event: AAAI Conference on Web and Social Media (ICWSM)
Location: Palo Alto, CA
Dates: 25 June 2018 - 28 June 2018
ISBN-13: 978-1-57735-798-8
Open access status: An open access version is available from UCL Discovery
Publisher version: https://aaai.org/ocs/index.php/ICWSM/ICWSM18/paper...
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Twitter; social networks; metadata; classification; re-identification
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL SLASH
UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of S&HS
UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of S&HS > Dept of Geography
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10046465