Semantic Cues Enhanced Multimodality Multistream CNN for Action Recognition

Publication type:

Journal article

Authors:

Tu, Zhigang*; Xie, Wei (谢伟); Dauwels, Justin; Li, Baoxin; Yuan, Junsong

Corresponding author:

Tu, Zhigang

Author affiliations:

[Tu, Zhigang] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & R, Wuhan 430079, Hubei, Peoples R China.

[Xie, Wei] Cent China Normal Univ, Sch Comp, Wuhan 430079, Hubei, Peoples R China.

[Dauwels, Justin] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 637553, Singapore.

[Li, Baoxin] Arizona State Univ, Sch Comp, Decis Syst Engn, Informat, Tempe, AZ 85287 USA.

[Yuan, Junsong] SUNY Buffalo, Dept Comp Sci & Engn, Buffalo, NY 14260 USA.

Corresponding institution:

[Tu, Zhigang] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & R, Wuhan 430079, Hubei, Peoples R China.

Language:

English

Keywords:

Action recognition; multi-modalities; multi-stream CNN; semantic cues; spatiotemporal saliency estimation; video object detection

Journal:

IEEE Transactions on Circuits and Systems for Video Technology

ISSN:

1051-8215

Year:

2019

Volume:

Issue:

Pages:

1423-1437

DOI:

10.1109/TCSVT.2018.2830102

Funding information:

Manuscript received January 17, 2018; revised April 1, 2018; accepted April 21, 2018. Date of publication April 25, 2018; date of current version May 3, 2019. This work is supported in part by the Singapore Ministry of Education Academic Research Fund Tier 2 under Grant MOE2015-T2-2-114, in part by the National Natural Science Foundation of China under Grant 61501198, in part by the Natural Science Foundation of Hubei Province under Grant 2014CFB461, and in part by the University at Buffalo. This paper was recommended by Associate Editor G.-J. Qi. (Corresponding author: Zhigang Tu.) Z. Tu is with the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China (e-mail: tuzhigang1986@gmail.com).

Institutional attribution:

This university is listed as an other (non-primary) institution

Department:

School of Computer

Abstract:

This paper addresses the issue of video-based action recognition by exploiting an advanced multi-stream Convolutional Neural Network (CNN) to fully use semantics-derived multiple modalities in both spatial (appearance) and temporal (motion) domains, since the performance of the CNN-based action recognition methods heavily relate to two factors: semantic visual cues and the network architecture. Our work consists of two major parts. First, to extract useful human-related semantics accurately, we propose a novel spatiotemporal saliency based video object segmentation (STS-VOS) model. By fusing d...
