Metric Learning for Categorical and Ambiguous Features: An Adversarial Method

Yang, X; Dong, M; Guo, Y; Xue, J; (2021) Metric Learning for Categorical and Ambiguous Features: An Adversarial Method. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases ECML PKDD 2020: Machine Learning and Knowledge Discovery in Databases. (pp. 223-238). Springer, Cham. Green open access

Abstract

Metric learning learns a distance metric from data and has significantly improved the classification accuracy of distance-based classifiers such as k-nearest neighbors. However, metric learning has rarely been applied to categorical data, which are prevalent in health and social sciences but inherently difficult to classify due to high feature ambiguity and small sample size. More specifically, ambiguity arises because the boundaries between ordinal or nominal levels are not always sharply defined. In this paper, we mitigate the impact of feature ambiguity by considering the worst-case perturbation of each instance and propose to learn the Mahalanobis distance through adversarial training. The geometric interpretation shows that our method dynamically divides the instance space into three regions and exploits the information in the “adversarially vulnerable” region. This information, which has not been considered in previous methods, makes our method better suited than those methods to small datasets. Moreover, we establish the generalization bound for a general form of adversarial training. It suggests that the sample complexity rate remains of the same order as that of standard training only if the Mahalanobis distance is regularized with the elementwise 1-norm. Experiments on ordinal and mixed ordinal-and-nominal datasets demonstrate the effectiveness of the proposed method under high feature ambiguity and small sample size.
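
For readers unfamiliar with the two quantities the abstract names, the following is a minimal sketch of a Mahalanobis distance and of the elementwise 1-norm of its matrix. It is not the authors' implementation and does not include the adversarial training step. The function names `mahalanobis_distance` and `elementwise_l1_norm` and the example matrix `M` are assumptions chosen for illustration.

```python
import math


def mahalanobis_distance(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M.

    x and y are equal-length lists of floats; M is a square list of lists.
    """
    diff = [a - b for a, b in zip(x, y)]
    n = len(diff)
    # Quadratic form diff^T M diff, written out as a double sum.
    quad = sum(diff[i] * M[i][j] * diff[j] for i in range(n) for j in range(n))
    return math.sqrt(quad)


def elementwise_l1_norm(M):
    """Return the sum of absolute values of all entries of M."""
    return sum(abs(v) for row in M for v in row)


# Hypothetical positive semidefinite matrix (illustrative values only).
M = [[1.0, 0.5],
     [0.5, 4.25]]

distance = mahalanobis_distance([1.0, 2.0], [0.0, 0.0], M)
penalty = elementwise_l1_norm(M)
```

With the identity matrix in place of `M`, `mahalanobis_distance` reduces to the ordinary Euclidean distance, which is one way to see that a learned `M` reweights and correlates the features.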

Type:	Proceedings paper
Title:	Metric Learning for Categorical and Ambiguous Features: An Adversarial Method
Event:	European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
Dates:	14 September 2020 - 18 September 2020
Open access status:	An open access version is available from UCL Discovery
DOI:	10.1007/978-3-030-67661-2_14
Publisher version:	https://doi.org/10.1007/978-3-030-67661-2_14
Language:	English
Additional information:	This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords:	Metric learning, Categorical data, Adversarial training
UCL classification:	UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science
URI:	https://discovery-pp.ucl.ac.uk/id/eprint/10110886
