self-determined research funds of CCNU from the colleges' basic research and operation of MOE; National Natural Science Foundation of China (NSFC) [62002130]; Hubei Province Technological Innovation Major Project [2019AAA049]
Institutional affiliation:
This university is the first and corresponding institution
Department affiliation:
School of Computer Science
Abstract:
High redundancy among keyframes is a critical issue for existing summarization methods when dealing with user-created videos. To address this issue, we present an unsupervised learning method, the Spatial Attention Model guided Bi-directional Long Short-term Memory network (Bi-LSTM), built on a combination of visual and semantic features. For the visual feature, we design a Salient-Area-Size-based spatial attention model, based on the observation that humans tend to focus on sizable and moving objects in videos. Moreover, the Bi-LSTM network is leveraged to exploit the semantic feature. Afterw...