Journal:
Journal of Computing in Higher Education, 2023, 35(3):487-520 ISSN:1042-1726
Corresponding author:
Lingyun Kang
Author affiliations:
[Yang, Zongkai; Liu, Sannyuya; Liu, Zhi; Zhao, Liang; Kang, Lingyun; Su, Zhu] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Luoyu Rd, Wuhan 430079, Hubei, Peoples R China.;[Yang, Zongkai; Liu, Sannyuya; Liu, Zhi; Zhao, Liang; Su, Zhu] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan, Hubei, Peoples R China.
Corresponding affiliation:
[Lingyun Kang] National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, People’s Republic of China
Abstract:
Understanding the relationship between interactive behaviours and discourse content has critical implications for instructors' design and facilitation of collaborative discussion activities in the online discussion forum (ODF). This paper adopts social network analysis (SNA) and epistemic network analysis (ENA) methods to jointly investigate the relationships between students' network characteristics, discussion topics, and learning outcomes in a course discussion forum. Discourse data from 207 participants were included in this study. The findings indicated that (1) the interactive network generated in the collaborative discussion activities was sparsely connected, and there was limited information exchange between instructors and students; (2) students' discussion topics were mainly related to the learning content; (3) compared with the isolated group, students in the leader, mediator, and animator groups were more concerned about topics related to the learning content; and (4) students who discussed more topics related to the learning content performed better than the students who discussed more topics related to learning methods and social interactions. The learning outcomes of the influencer and leader groups were significantly higher than those of the peripheral and isolated groups. However, there was no significant correlation between students' individual centrality and their learning outcomes. The findings enrich the ODF research on the comprehensive identification of interactive behaviours and discourse content in the process of collaborative discussion activities and on the discussion topic differences between different role groups. The study findings also have practical implications for instructors to design effective instructional interventions aimed at improving the quality of collaboration in the ODF.
Authors:
Hai Liu;Cheng Zhang;Yongjian Deng;Bochen Xie;Tingting Liu;...
Journal:
IEEE Transactions on Multimedia, 2023:1-14 ISSN:1520-9210
Author affiliations:
[Hai Liu; Cheng Zhang; Zhaoli Zhang] National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China;[Yongjian Deng] College of Computer Science, Beijing University of Technology, Beijing, China;Department of Mechanical Engineering, City University of Hong Kong, Kowloon, Hong Kong;City University of Hong Kong Shenzhen Research Institute, Shenzhen, China;School of Education, Hubei University, Wuhan, Hubei, China
Abstract:
Fine-grained bird image classification (FBIC) is not only meaningful for endangered bird observation and protection but also a prevalent task for image classification in multimedia processing and computer vision. However, FBIC suffers from several challenges, such as bird molting, complex background, and arbitrary bird posture. To effectively tackle these challenges, we present a novel invariant cues-aware feature concentration Transformer (TransIFC), which learns invariant and core information in bird images. To this end, two novel modules are proposed to leverage the characteristics of bird images, namely, the hierarchy stage feature aggregation (HSFA) module and the feature in feature abstraction (FFA) module. The HSFA module aggregates the multiscale information of bird images by concatenating multilayer features. The FFA module extracts the invariant cues of birds through feature selection based on discrimination scores. Transformer is employed as the backbone to reveal the long-dependent semantic relationships in bird images. Moreover, abundant visualizations are provided to prove the interpretability of the HSFA and FFA modules in TransIFC. Comprehensive experiments demonstrate that TransIFC can achieve state-of-the-art performance on the CUB-200-2011 dataset (91.0%) and the NABirds dataset (90.9%). Finally, extended experiments have been conducted on the Stanford Cars dataset to suggest the potential of generalizing our method on other fine-grained visual classification tasks.
Authors:
Yang, Shuoqiu;Du, Xu;Tang, Hengtao;Hung, Jui-Long;Tang, Yeye
Journal:
Education and Information Technologies, 2023:1-28 ISSN:1360-2357
Corresponding author:
Tang, HT
Author affiliations:
[Yang, Shuoqiu; Du, Xu; Tang, Yeye] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan, Peoples R China.;[Tang, Hengtao; Tang, HT] Univ South Carolina, Dept Leadership Learning Design & Inquiry, Columbia, SC 29208 USA.;[Hung, Jui-Long] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan, Peoples R China.;[Hung, Jui-Long] Boise State Univ, Dept Educ Technol, Boise, ID USA.
Corresponding affiliation:
[Tang, HT] Univ South Carolina, Dept Leadership Learning Design & Inquiry, Columbia, SC 29208 USA.
Keywords:
Collaborative problem solving;Group interaction;Group interaction density;Collaborative performance
Abstract:
Collaborative Problem Solving (CPS) has received increasing attention for its role in promoting learners' cognitive and social development in STEM education. However, little is known about how learners interact dynamically within a group at different time granularities. This gap mainly resulted from overlooking the time dimension of interactions, leading to a lack of nuanced understanding of moment-to-moment interaction in CPS. In this study, we demonstrated the potential of temporal group interaction density in modeling online CPS interactions and investigated the impact of temporal interaction density on CPS processes and outcomes. Specifically, we proposed using cumulative weighted density to measure the holistic state of group interactions and explained the differences in group interactions with different collaborative performance and interaction densities by modeling the transition and evolution of interaction sequences through Apriori and cumulative relative centrality. Results indicated that group interaction density cannot directly predict their collaborative performance, but notable differences in interaction patterns existed in the high-performance groups with different interaction densities, while low-performance groups showed interactive commonalities towards the completion of CPS. The findings of this study guided the design of CPS interventions and supported the process mining of CPS interactions, with vital practical implications for CPS assessment and skills development.
Author affiliations:
[Yang, Zongkai; Liu, Sannyuya; Liu, Zhi; Peng, Xian] Cent China Normal Univ, Fac Artificial Intelligence Educ, Natl Engn Res Ctr Educ Big Data, Wuhan, Peoples R China.;[Yang, Zongkai; Liu, Sannyuya; Zhang, Ning] Cent China Normal Univ, Fac Artificial Intelligence Educ, Natl Engn Res Ctr Elearning, Wuhan, Peoples R China.
Corresponding affiliation:
[Peng, X.] National Engineering Research Center for Educational Big Data, China
Journal:
User Modeling and User-Adapted Interaction, 2023:1-33 ISSN:0924-1868
Corresponding author:
Liang, RX
Author affiliations:
[Shen, Xiaoxuan; Yang, Zongkai; Liu, Sannyuya; Li, Qing; Liang, Ruxia; Du, Shangheng; Sun, Jianwen] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;[Shen, Xiaoxuan; Yang, Zongkai; Liu, Sannyuya; Li, Qing; Liang, Ruxia; Du, Shangheng; Sun, Jianwen] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.;[Yang, Zongkai; Liu, Sannyuya] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.
Corresponding affiliation:
[Liang, RX] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.
Keywords:
Recommender systems;Group recommender systems;Adversarial learning;Knowledge transfer
Abstract:
Many online services allow users to participate in group activities such as online meetings or group buying, and thus need to provide user groups with services that interest them. Group recommender systems have emerged to meet this need and provide personalized services for various online user groups. Data sparsity is an important issue in group recommender systems, since even fewer group-item interactions are observed. Transfer learning has been an efficient tool to alleviate the data sparsity issue in recommender systems for individual users, but has not been utilized for group recommendation. Moreover, a group and its members have complex and mutual relationships with each other, which exacerbates the difficulty of modelling the preferences of both a group and its members for recommendation. Therefore, group recommender systems face three main challenges that may significantly impact their quality and accuracy: (1) taking into consideration group member relationships and their interactions in modelling user and group preferences; (2) ensuring latent feature spaces between the users and groups are maximally matched; and (3) constructing a deep group recommendation method in which both the individual user and group domains can benefit from a knowledge exchange. Hence, in this paper, we propose a deep adversarial group recommendation method, called DA-GR. User features are separated into two subspaces to ensure that only consistent group members’ feature knowledge can be extracted and shared with group preference modelling. Adversarial learning is used to effectively transfer consistent knowledge from individual user interactions to the group interaction domain through the bridge of group-user relationships. Extensive experiments conducted on public datasets demonstrate the effectiveness and superiority of our proposal, which provides accurate recommendations for both individual users and groups. The source code of DA-GR is available at https://github.com/ccnu-mathits/DA-GR.
Abstract:
Computer-supported collaborative concept mapping (CSCCM) integrates technology and concept mapping to support students’ knowledge understanding, and much research on the behavioral patterns involved in CSCCM activities has been conducted. However, there is limited understanding of the differences in knowledge understanding and behavioral patterns between students with different levels of collaboration perception. This study examined the impact of students’ perceptions of collaboration on their knowledge understanding and behavioral patterns in the CSCCM activity. A total of 36 individuals from the same university participated in this study. The findings suggested that, compared with students with a low level of collaboration perception, students with a high level of collaboration perception could obtain better conceptual knowledge understanding. However, there was no significant difference in factual knowledge understanding between students with different levels of collaboration perception. For behavioral patterns, students with a high level of collaboration perception demonstrated more diverse behavioral transition sequences, students with a middle level of collaboration perception demonstrated more repetitive behavioral sequences, and students with a low level of collaboration perception demonstrated fewer behavioral transition sequences. The findings of this research can provide a reference for teachers to design CSCCM activities in the classroom.
Journal:
LIBRARY HI TECH, 2023, 41(4):1039-1062 ISSN:0737-8831
Author affiliations:
[Li, Yating] National Engineering Laboratory for Educational Big Data, Central China Normal University, Wuhan, China;[Zhou, Chi; Wu, Di; Chen, Min] National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China
Keywords:
Teachers’ information literacy;Information literacy evaluation;Online information behavior;Online learning and teaching platform;Process evaluation;Supervised learning models
Abstract:
Purpose
Advances in information technology now permit the recording of massive and diverse process data, thereby making data-driven evaluations possible. This study discusses whether teachers’ information literacy can be evaluated based on their online information behaviors on online learning and teaching platforms (OLTPs).
Design/methodology/approach
First, to evaluate teachers’ information literacy, the process data were combined from teachers on OLTP to describe nine third-level indicators from the richness, diversity, usefulness and timeliness analysis dimensions. Second, propensity score matching (PSM) and difference tests were used to analyze the differences between the performance groups with reduced selection bias. Third, to effectively predict the information literacy score of each teacher, four sets of input variables were used for prediction using supervised learning models.
Findings
The results show that the high-performance group performs better than the low-performance group in 6 indicators. In addition, information-based teaching and behavioral research data can best reflect the level of information literacy. In the future, greater in-depth explorations are needed with richer online information behavioral data and a more effective evaluation model to increase evaluation accuracy.
Originality/value
The evaluation based on online information behaviors has concrete application scenarios, positively correlated results and prediction interpretability. Therefore, information literacy evaluations based on behaviors have great potential and favorable prospects.
Journal:
Journal of Autism and Developmental Disorders, 2023, 53(6):2314-2327 ISSN:0162-3257
Corresponding author:
Jingying Chen
Author affiliations:
[Chen, Xianke; Chen, Jingying] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Hubei, Peoples R China.;[Chen, Xianke; Chen, Jingying] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.;[Liao, Mengyi] Pingdingshan Univ, Coll Comp Sci & Technol, Pingdingshan 467000, Henan, Peoples R China.;[Wang, Guangshuai] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Hubei, Peoples R China.
Corresponding affiliation:
[Jingying Chen] National Engineering Laboratory for Educational Big Data, Central China Normal University, Wuhan, People’s Republic of China;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, People’s Republic of China
Journal:
IEEE Transactions on Medical Imaging, 2023, 42(3):762-773 ISSN:0278-0062
Corresponding author:
Cai, C;Wu, W
Author affiliations:
[Cai, Chang; Cai, C] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Kang, Huicong] Huazhong Univ Sci & Technol, Tongji Hosp, Tongji Med Coll, Dept Neurol, Wuhan 430079, Hubei, Peoples R China.;[Hashemi, Ali] Tech Univ Berlin, Uncertainty Inverse Modeling & Machine Learning Gr, D-10587 Berlin, Germany.;[Hashemi, Ali] Tech Univ Berlin, Inst Software Engn & Theoret Comp Sci, Fac Elect Engn & Comp Sci 4, Machine Learning Grp, D-10587 Berlin, Germany.;[Chen, Dan] Wuhan Univ, Sch Comp Sci, Wuhan 430079, Peoples R China.
Corresponding affiliation:
[Cai, C] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Wu, W] Alto Neurosci Inc, Los Altos, CA 94022 USA.
Abstract:
Simultaneously estimating brain source activity and noise has long been a challenging task in electromagnetic brain imaging using magneto- and electroencephalography. The problem is challenging not only in terms of solving the NP-hard inverse problem of reconstructing unknown brain activity across thousands of voxels from a limited number of sensors, but also for the need to simultaneously estimate the noise and interference. We present a generative model with an augmented leadfield matrix to simultaneously estimate brain source activity and sensor noise statistics in electromagnetic brain imaging (EBI). We then derive three Bayesian inference algorithms for this generative model (expectation-maximization (EBI-EM), convex bounding (EBI-Convex) and fixed-point (EBI-Mackay)) to simultaneously estimate the hyperparameters of the prior distribution for brain source activity and sensor noise. A comprehensive performance evaluation for these three algorithms is performed. Simulations consistently show that the performance of EBI-Convex and EBI-Mackay updates is superior to that of EBI-EM. In contrast to the EBI-EM algorithm, both EBI-Convex and EBI-Mackay updates are quite robust to initialization, and are computationally efficient with fast convergence in the presence of both Gaussian and real brain noise. We also demonstrate that EBI-Convex and EBI-Mackay update algorithms can reconstruct complex brain activity with only a few trials of sensor data, and for resting-state data, achieving significant improvement in source reconstruction and noise learning for electromagnetic brain imaging.
Abstract:
We present a keyphrase extraction algorithm named TopicLPRank, an improved version of the TopicRank algorithm. Unlike TopicRank, which uses only the relative distance information of the text, we argue that the length and absolute position of candidate keyphrases also influence the extraction results. The proposed TopicLPRank therefore incorporates these two factors on top of TopicRank. The experimental results show that adding the location information and the length information of candidate keyphrases increases the F-Score of the model by around 2.7 and 1.7 percentage points, respectively, which is equivalent to relative increases of 19.6% and 12.3% over TopicRank. Fusing the length and location information of the candidate keyphrases increases the F-Score by around 3.5 percentage points, which is equivalent to a relative increase of 25.21% over TopicRank on the NUS dataset.
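The abstract above reports each gain twice, once in percentage points and once as a relative increase. As a rough consistency check, the sketch below derives the TopicRank baseline F-Score implied by each pair of figures. The dictionary labels and the function name are illustrative and do not come from the paper; only the six numbers are taken from the abstract.

```python
# Reported F-Score gains over TopicRank on the NUS dataset, from the abstract:
# (absolute gain in percentage points, relative gain in percent).
# The labels are illustrative names, not terms used by the authors.
reported_gains = {
    "location only": (2.7, 19.6),
    "length only": (1.7, 12.3),
    "location + length": (3.5, 25.21),
}


def implied_baseline(points: float, relative_percent: float) -> float:
    """Baseline F-Score (in points) implied by an absolute and a relative gain."""
    return points / (relative_percent / 100.0)


baselines = {
    name: implied_baseline(points, relative)
    for name, (points, relative) in reported_gains.items()
}

for name, value in baselines.items():
    print(f"{name}: implied TopicRank F-Score of about {value:.1f}")
```

All three pairs point to a baseline F-Score of roughly 13.8 to 13.9, so the absolute and relative figures are mutually consistent.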
Authors:
Chen, Min;Liu, Yanqiu;Yang, Harrison Hao;Li, Yating;Zhou, Chi
Journal:
Education and Information Technologies, 2023, 28(11):15011-15030 ISSN:1360-2357
Corresponding author:
Chi Zhou
Author affiliations:
[Zhou, Chi; Chen, Min] Cent China Normal Univ, Educ Informatizat Strategy Res Base Minist Educ, Wuhan 430079, Hubei, Peoples R China.;[Li, Yating; Chen, Min] Cent China Normal Univ, Technol Comm Minist Educ, Res Ctr Sci & Technol Promoting Educ Innovat & Dev, Ctr Strateg Studies Sci, Wuhan 430079, Hubei, Peoples R China.;[Liu, Yanqiu] Cent China Normal Univ, Key Res Inst Humanities & Social Sci Hubei Prov, Hubei Res Ctr Educ Informatizat Dev, Wuhan 430079, Hubei, Peoples R China.;[Li, Yating; Zhou, Chi; Liu, Yanqiu] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Hubei, Peoples R China.;[Yang, Harrison Hao] SUNY Coll Oswego, Sch Educ, Oswego, NY 60543 USA.
Corresponding affiliation:
[Chi Zhou] Educational Informatization Strategy Research Base of Ministry of Education, Central China Normal University, Wuhan, China;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China
Keywords:
Online teacher professional development;Teacher participation;Participation frequency;Participation quality;Lag sequential analysis
Journal:
Information Processing & Management, 2023, 60(4):103350 ISSN:0306-4573
Corresponding author:
Li, DTC;Shi, FB
Author affiliations:
[Zheng, Chao; Wang, Jian; Li, Duantengchuan; Wang, Jingxiong; Li, Bing] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China.;[Zhang, Qi] Cent China Normal Univ, Sch Informat Management, Wuhan, Peoples R China.;[Shi, Fobo; Shi, FB] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.;[Cai, Yuefeng] ZTE Corp, Wuhan 430223, Peoples R China.;[Wang, Xiaoguang; Zhang, Zhen] Wuhan Univ, Sch Informat Management, Wuhan, Peoples R China.
Corresponding affiliation:
[Li, DTC] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China.;[Shi, FB] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.
Abstract:
Knowledge graphs are sizeable graph-structured knowledge with both abstract and concrete concepts in the form of entities and relations. Recently, convolutional neural networks have achieved outstanding results for more expressive representations of knowledge graphs. However, existing deep learning-based models exploit semantic information from single-level feature interaction, potentially limiting expressiveness. We propose a knowledge graph embedding model with an attention-based high-low level features interaction convolutional network called ConvHLE to alleviate this issue. This model effectively harvests richer semantic information and generates more expressive representations. Concretely, the multilayer convolutional neural network is utilized to fuse high-low level features. Then, features in fused feature maps interact with other informative neighbors through the criss-cross attention mechanism, which expands the receptive fields and boosts the quality of interactions. Finally, a plausibility score function is proposed for the evaluation of our model. The performance of ConvHLE is experimentally investigated on six benchmark datasets with individual characteristics. Extensive experimental results prove that ConvHLE learns more expressive and discriminative feature representations and has outperformed other state-of-the-art baselines over most metrics when addressing link prediction tasks. Comparing MRR and Hits@1 on FB15K-237, our model outperforms the baseline ConvE by 13.5% and 16.0%, respectively.
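The link prediction metrics named in the abstract above, MRR and Hits@k, are both computed from the rank that a model assigns to the correct entity for each test triple. The sketch below shows the standard definitions only; the ranks are made-up illustrative values, not results from ConvHLE or ConvE, and the function names are hypothetical.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all test triples (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of test triples whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct entity for five test triples.
example_ranks = [1, 3, 2, 10, 1]

mrr = mean_reciprocal_rank(example_ranks)
hits1 = hits_at_k(example_ranks, 1)
hits3 = hits_at_k(example_ranks, 3)
print(f"MRR={mrr:.3f} Hits@1={hits1:.1f} Hits@3={hits3:.1f}")
```

Both metrics lie between 0 and 1, and higher is better. MRR rewards placing the correct entity near the top of the ranking, while Hits@1 counts only exact top-ranked answers.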
Authors:
He, Xiuling;Fang, Jing;Cheng, Hercy N. H.;Men, Qibin;Li, Yangyang
Journal:
Education and Information Technologies, 2023, 28(9):11401-11422 ISSN:1360-2357
Corresponding author:
Jing Fang
Author affiliations:
[He, Xiuling; Men, Qibin] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan, Hubei, Peoples R China.;[Fang, Jing; Li, Yangyang] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan, Hubei, Peoples R China.;[Cheng, Hercy N. H.] Taipei Med Univ, Ctr Gen Educ, Taipei City, Taiwan.
Corresponding affiliation:
[Jing Fang] National Engineering Research Center for E-learning, Central China Normal University, Wuhan City, China
Journal:
Measurement Science And Technology, 2023, 34(1) ISSN:0957-0233
Author affiliations:
[He, Lamei; Li, Xiaonian] Longdong Univ, Sch Informat Engn, Qingyang, Gansu, Peoples R China.;[Dai, Zhicheng] Cent China Normal Univ, Fac Artificial Intelligence Educ, Natl Engn Res Ctr E Learning, Wuhan, Hubei, Peoples R China.
Abstract:
There are two problems with traditional indoor fingerprint location methods. First, irrelevant fingerprints in a fingerprint database interfere with the matching phase, which leads to poor positioning accuracy and stability of positioning results, and second, there is a large amount of computational overhead in the matching phase. Therefore, this paper proposes a K-nearest neighbor indoor fingerprint location method based on coarse positioning circular domain and the highest similarity threshold. In this method, a circular domain is formed in a coarse positioning process to narrow the positioning range. It solves the problem of the interference of irrelevant fingerprints. At the same time, a fault-tolerant mechanism is introduced to adjust the circular domain dynamically to ensure that the coarse positioning circular domain contains high similarity reference points and improve the fault tolerance of the coarse positioning. This method consists of offline and online phases. In the offline phase, the values of the received signal strength from Bluetooth low energy are preprocessed using a Gaussian filter to construct a fingerprint database. In the online phase, irrelevant fingerprints are filtered out by using the coarse positioning method. The filtered fingerprints are then matched with a testing point by the K-nearest neighbor algorithm, and the weighted centroids of the nearest reference points are solved. Finally, the coordinate of the testing point is obtained. The experimental results show that this method can effectively improve indoor positioning accuracy when compared with the traditional K-nearest neighbor. The average positioning error of the proposed method is 0.844 m.
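The online phase described in the abstract above (coarse circular filtering, then weighted K-nearest-neighbor matching) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the coarse position estimate is taken as a given input, the fault-tolerant adjustment is approximated by simply enlarging the circle until it holds at least k reference points, inverse signal distance is used as the weight, and the paper's similarity threshold and Gaussian filtering are omitted. All names and values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
# One database entry: (reference point position in metres, RSS vector in dBm).
Fingerprint = Tuple[Point, Sequence[float]]


def signal_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two RSS vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def locate(database: List[Fingerprint], rss: Sequence[float], coarse_xy: Point,
           radius: float, k: int = 3, grow: float = 1.5) -> Point:
    """Estimate a position by coarse circular filtering, then weighted KNN."""
    if not database or k < 1 or radius <= 0 or grow <= 1:
        raise ValueError("need a non-empty database, k >= 1, radius > 0, grow > 1")
    needed = min(k, len(database))
    # Coarse step: keep only reference points inside the circle around the
    # coarse estimate; enlarge the circle until it holds enough candidates
    # (an assumed stand-in for the paper's fault-tolerant mechanism).
    r = radius
    while True:
        candidates = [fp for fp in database if math.dist(fp[0], coarse_xy) <= r]
        if len(candidates) >= needed:
            break
        r *= grow
    # Fine step: take the k candidates closest in signal space and return
    # their centroid, weighted by inverse signal distance.
    nearest = sorted(candidates, key=lambda fp: signal_distance(fp[1], rss))[:k]
    weights = [1.0 / (signal_distance(fp[1], rss) + 1e-9) for fp in nearest]
    total = sum(weights)
    x = sum(w * fp[0][0] for w, fp in zip(weights, nearest)) / total
    y = sum(w * fp[0][1] for w, fp in zip(weights, nearest)) / total
    return (x, y)


# Hypothetical database: the last entry has the same RSS vector as the first
# but lies far away, so the coarse circle should exclude it.
example_db: List[Fingerprint] = [
    ((0.0, 0.0), [-40.0, -70.0]),
    ((4.0, 0.0), [-70.0, -40.0]),
    ((0.0, 4.0), [-50.0, -60.0]),
    ((10.0, 10.0), [-40.0, -70.0]),
]
estimate = locate(example_db, [-40.0, -70.0], coarse_xy=(1.0, 1.0), radius=5.0, k=1)
print(estimate)
```

Filtering before matching serves both goals named in the abstract: distant reference points whose signals happen to look similar can no longer distort the estimate, and fewer signal distances have to be evaluated in the matching step.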
Authors:
Du, Xu;Dai, Miao;Tang, Hengtao;Hung, Jui-Long;Li, Hao;...
Journal:
Journal of Computing in Higher Education, 2023, 35(2):272-295 ISSN:1042-1726
Corresponding author:
Hengtao Tang
Author affiliations:
[Dai, Miao; Hung, Jui-Long; Li, Hao; Du, Xu] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.;[Tang, Hengtao] Univ South Carolina, Dept Educ Studies, Columbia, SC 29208 USA.;[Hung, Jui-Long] Boise State Univ, Dept Educ Technol, Boise, ID USA.;[Zheng, Jinqiu] Guangdong Med Univ, Dongguan 523808, Guangdong, Peoples R China.
Corresponding affiliation:
[Hengtao Tang] Department of Educational Studies, University of South Carolina, Columbia, SC, United States
Keywords:
Cognitive load;Collaborative problem solving;Computer networking;Virtual experimentation;Online learning
Journal:
Artificial Intelligence in Medicine, 2023, 145:102677 ISSN:0933-3657
Corresponding author:
Jiang, XP
Author affiliations:
[Fu, Chengcheng] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.;[Jiang, Xingpeng; Fu, Chengcheng; He, Tingting] Cent China Normal Univ, Sch Comp Sci, Wuhan, Peoples R China.;[Fu, Chengcheng; van Harmelen, Frank; Huang, Zhisheng] Vrije Univ Amsterdam, Dept Comp Sci, Amsterdam, Netherlands.;[Fu, Chengcheng; He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Natl Language Resources Monitor Res Ctr Network Me, Wuhan, Peoples R China.;[Huang, Zhisheng] Tongji Univ, Sch Med, Clin Res Ctr Mental Disorders, Shanghai Pudong New Area Mental Hlth Ctr, Shanghai, Peoples R China.
Corresponding affiliation:
[Jiang, XP] Cent China Normal Univ, Sch Comp Sci, Wuhan, Peoples R China.
Keywords:
Food;Gut microbiota;Knowledge graph;Mental health
Journal:
AI COMMUNICATIONS, 2023, 36(3):219-233 ISSN:0921-7126
Corresponding author:
Liao, SB
Author affiliations:
[Liao, Shengbin; Liao, SB] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.;[Wang, Xiaofeng; Yang, ZongKai] Cent China Normal Univ, Natl Engn Lab Educ Big Data Technol, Wuhan, Peoples R China.
Corresponding affiliation:
[Liao, SB] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.
Keywords:
Human action recognition;mixed convolution;BN-Inception;two-stream network architecture
Abstract:
The most widely used two-stream architectures and building blocks for human action recognition in videos generally consist of 2D or 3D convolutional neural networks. 3D convolution can abstract motion information between video frames, which is essential for video classification. 3D convolutional neural networks usually perform better than their 2D counterparts, but at a higher computational cost. In this paper, we propose a heterogeneous two-stream architecture that incorporates two convolutional networks. One is a mixed convolution network (MCN), which inserts some 3D convolutions in the middle of 2D convolutions and is trained on RGB frames; the other adopts the BN-Inception network and is trained on optical flow frames. Considering the redundancy of neighboring video frames, we adopt a sparse sampling strategy to decrease the computational cost. Our architecture is trained and evaluated on the standard video action benchmarks HMDB51 and UCF101. Experimental results show that our approach obtains state-of-the-art performance on HMDB51 (73.04%) and UCF101 (95.27%).