3D-TDC: A 3D temporal dilation convolution framework for video action recognition

Ming, Y; Feng, F; Li, C; Xue, J-H; (2021) 3D-TDC: A 3D temporal dilation convolution framework for video action recognition. Neurocomputing, 450, pp. 362-371. 10.1016/j.neucom.2021.03.120. Green open access

Abstract

Video action recognition is a vital area of computer vision. By adding a temporal dimension to the convolution structure, 3D convolutional neural networks can extract spatio-temporal features from videos. However, computing constraints make it hard to feed a whole video into the network at once, which limits the network's temporal receptive field. To address this issue, we propose a novel 3D temporal dilation convolution (3D-TDC) framework for extracting spatio-temporal features of actions from videos. First, we deploy the 3D temporal dilation convolution as a shallow temporal compression layer, which captures spatio-temporal information over a larger time span at a reduced computational load. Then, we construct an action recognition framework by integrating two networks with different temporal receptive fields, balancing long-range and short-range temporal information. We conduct extensive experiments on three widely used public datasets (UCF-101, HMDB-51, and Kinetics-400), and the results demonstrate the effectiveness of the proposed framework for video action recognition at a low computational load.
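
The effect of temporal dilation on the receptive field can be illustrated with a minimal sketch. This is not the authors' implementation: it reduces the idea to a 1D convolution along the time axis, and the function names, the 3-tap kernel, the dilation rate of 4, and the 16-frame clip are all illustrative assumptions rather than values taken from the paper. It relies only on the standard fact that a kernel with k taps and dilation d spans d(k - 1) + 1 consecutive frames.

```python
def effective_kernel_extent(kernel_size: int, dilation: int) -> int:
    """Number of consecutive frames spanned by one dilated temporal kernel."""
    return dilation * (kernel_size - 1) + 1


def temporal_dilated_conv(frames, weights, dilation=1, stride=1):
    """Valid (unpadded) 1D convolution along the time axis.

    frames  : list of per-frame scalar features (a stand-in for a video clip)
    weights : temporal kernel taps
    """
    extent = effective_kernel_extent(len(weights), dilation)
    outputs = []
    for start in range(0, len(frames) - extent + 1, stride):
        # Sample every `dilation`-th frame instead of adjacent frames.
        taps = [frames[start + i * dilation] for i in range(len(weights))]
        outputs.append(sum(w * x for w, x in zip(weights, taps)))
    return outputs


# Hypothetical clip: 16 frames whose feature value is the frame index.
clip = list(range(16))
kernel = [1.0, 1.0, 1.0]  # illustrative 3-tap temporal kernel

dense = temporal_dilated_conv(clip, kernel, dilation=1)
dilated = temporal_dilated_conv(clip, kernel, dilation=4)

print("frames seen per output, dilation 1:", effective_kernel_extent(3, 1))
print("frames seen per output, dilation 4:", effective_kernel_extent(3, 4))
print("output lengths:", len(dense), len(dilated))
```

With the same three weights, the dilated kernel covers a wider window of frames per output and yields a shorter output sequence, which is the sense in which such a layer can act as temporal compression. In a 3D network the same idea would be applied only along the temporal axis of the convolution, leaving the spatial axes undilated.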

Type:	Article
Title:	3D-TDC: A 3D temporal dilation convolution framework for video action recognition
Open access status:	An open access version is available from UCL Discovery
DOI:	10.1016/j.neucom.2021.03.120
Publisher version:	https://doi.org/10.1016/j.neucom.2021.03.120
Language:	English
Additional information:	This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
Keywords:	3D convolution, temporal dilation, action recognition, temporal compression
UCL classification:	UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science
URI:	https://discovery-pp.ucl.ac.uk/id/eprint/10125669
