Authors:
Hai Liu; Cheng Zhang; Yongjian Deng; Bochen Xie; Tingting Liu; ...
Journal:
IEEE Transactions on Multimedia, 2023: 1-14. ISSN: 1520-9210
Author affiliations:
[Hai Liu; Cheng Zhang; Zhaoli Zhang] National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China; [Yongjian Deng] College of Computer Science, Beijing University of Technology, Beijing, China; Department of Mechanical Engineering, City University of Hong Kong, Kowloon, Hong Kong; City University of Hong Kong Shenzhen Research Institute, Shenzhen, China; School of Education, Hubei University, Wuhan, Hubei, China
Abstract:
Fine-grained bird image classification (FBIC) is not only meaningful for endangered-bird observation and protection but also a prevalent image classification task in multimedia processing and computer vision. However, FBIC suffers from several challenges, such as bird molting, complex backgrounds, and arbitrary bird postures. To tackle these challenges effectively, we present a novel invariant cues-aware feature concentration Transformer (TransIFC), which learns invariant and core information in bird images. To this end, two novel modules are proposed to leverage the characteristics of bird images: the hierarchy stage feature aggregation (HSFA) module and the feature in feature abstraction (FFA) module. The HSFA module aggregates the multiscale information of bird images by concatenating multilayer features. The FFA module extracts the invariant cues of birds through feature selection based on discrimination scores. A Transformer is employed as the backbone to capture long-range semantic dependencies in bird images. Moreover, abundant visualizations are provided to demonstrate the interpretability of the HSFA and FFA modules in TransIFC. Comprehensive experiments demonstrate that TransIFC achieves state-of-the-art performance on the CUB-200-2011 dataset (91.0%) and the NABirds dataset (90.9%). Finally, extended experiments on the Stanford Cars dataset suggest the potential of generalizing our method to other fine-grained visual classification tasks.
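The score-based token selection performed by the FFA module can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation; the discrimination scores and the number of kept tokens `k` are placeholder inputs.

```python
import numpy as np

def select_discriminative_tokens(tokens, scores, k):
    """Keep the k tokens with the highest discrimination scores.

    tokens: (N, D) array of patch-token features.
    scores: (N,) array of per-token discrimination scores.
    Returns the selected (k, D) tokens, ordered by descending score.
    """
    order = np.argsort(scores)[::-1][:k]  # indices of the top-k scores
    return tokens[order]

# Toy example: 6 tokens with 4-dim features, keep the 2 most discriminative.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 4))
scores = np.array([0.1, 0.9, 0.3, 0.8, 0.2, 0.05])
selected = select_discriminative_tokens(tokens, scores, k=2)
print(selected.shape)  # (2, 4)
```

The selected tokens could then be aggregated (e.g., averaged or re-fed to attention) as a compact, invariant representation.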
Abstract:
Traditional Generative Adversarial Network (GAN)-based Generalized Zero-Shot Learning (GZSL) methods usually suffer from the problem that they ignore the differences between classes when using the standard normal distribution to fit the true distribution of each category, and the incompleteness of a single round of adversarial training leaves the model unable to capture all the characteristics of the samples. To address this problem, a data-driven recurrent adversarial generative network is proposed in this paper. We first synthesize visual prototypes for unseen classes using the transformation from semantic attributes to visual prototypes learned on seen classes. Then, noise is generated from these prototypes to synthesize unseen samples according to the corresponding semantic attributes. During the sample generation process, a recurrent generative adversarial network is designed to make the generated visual features more representative. Extensive experiments on five popular datasets, as well as detailed ablation studies, demonstrate the effectiveness and superiority of the proposed method. © 2023 Elsevier Inc. All rights reserved.
Abstract:
Computer-supported collaborative concept mapping (CSCCM) integrates technology and concept mapping to support students' knowledge understanding, and much research has been conducted on the behavioral patterns involved in CSCCM activities. However, there is limited understanding of the differences in knowledge understanding and behavioral patterns between students with different levels of collaboration perception. This study examined the impact of students' perceptions of collaboration on their knowledge understanding and behavioral patterns in a CSCCM activity. A total of 36 individuals from the same university participated in this study. The findings suggested that, compared with students with a low level of collaboration perception, students with a high level of collaboration perception obtained better conceptual knowledge understanding. However, there was no significant difference in factual knowledge understanding between students with different levels of collaboration perception. Regarding behavioral patterns, students with a high level of collaboration perception demonstrated more diverse behavioral transition sequences, students with a middle level of collaboration perception demonstrated more repetitive behavioral sequences, and students with a low level of collaboration perception demonstrated fewer behavioral transition sequences. The findings of this research can provide a reference for teachers designing CSCCM activities in the classroom.
Journal:
User Modeling and User-Adapted Interaction, 2023: 1-33. ISSN: 0924-1868
Corresponding author:
Liang, RX
Author affiliations:
[Shen, Xiaoxuan; Yang, Zongkai; Liu, Sannyuya; Li, Qing; Liang, Ruxia; Du, Shangheng; Sun, Jianwen] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.; [Shen, Xiaoxuan; Yang, Zongkai; Liu, Sannyuya; Li, Qing; Liang, Ruxia; Du, Shangheng; Sun, Jianwen] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.; [Yang, Zongkai; Liu, Sannyuya] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.
Corresponding institution:
[Liang, RX] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.; Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.
Keywords:
Recommender systems; Group recommender systems; Adversarial learning; Knowledge transfer
Abstract:
Many online services allow users to participate in group activities such as online meetings or group buying, and thus need to provide user groups with services they are interested in. Group recommender systems have emerged to meet this need and provide personalized services for various online user groups. Data sparsity is an important issue in group recommender systems, since even fewer group-item interactions are observed than user-item interactions. Transfer learning has been an efficient tool to alleviate the data sparsity issue in recommender systems for individual users, but it has not been utilized for group recommendation. Moreover, a group and its members have complex, mutual relationships, which exacerbates the difficulty of modelling the preferences of both a group and its members for recommendation. Therefore, group recommender systems face three main challenges that may significantly impact their quality and accuracy: (1) taking group member relationships and their interactions into consideration when modelling user and group preferences; (2) ensuring the latent feature spaces of users and groups are maximally matched; and (3) constructing a deep group recommendation method in which both the individual-user and group domains benefit from knowledge exchange. Hence, in this paper, we propose a deep adversarial group recommendation method, called DA-GR. User features are separated into two subspaces so that only consistent group members' feature knowledge is extracted and shared with group preference modelling. Adversarial learning is used to effectively transfer consistent knowledge from individual user interactions to the group interaction domain through the bridge of group-user relationships. Extensive experiments on public datasets demonstrate the effectiveness and superiority of our proposal, which provides accurate recommendations for both individual users and groups. The source code of DA-GR is available at https://github.com/ccnu-mathits/DA-GR.
Journal:
AI Communications, 2023, 36(3): 219-233. ISSN: 0921-7126
Corresponding author:
Liao, SB
Author affiliations:
[Liao, Shengbin] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.; [Wang, Xiaofeng; Yang, ZongKai] Cent China Normal Univ, Natl Engn Lab Educ Big Data Technol, Wuhan, Peoples R China.
Corresponding institution:
[Liao, SB] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.
Keywords:
Human action recognition; mixed convolution; BN-Inception; two-stream network architecture
Abstract:
The most widely used two-stream architectures and building blocks for human action recognition in videos generally consist of 2D or 3D convolutional neural networks. 3D convolution can capture motion information between video frames, which is essential for video classification. 3D convolutional neural networks usually obtain better performance than 2D ones; however, they also increase the computational cost. In this paper, we propose a heterogeneous two-stream architecture that incorporates two convolutional networks. One uses a mixed convolution network (MCN), which inserts 3D convolutions in the middle of 2D convolutions, to train on RGB frames; the other adopts a BN-Inception network to train on optical-flow frames. Considering the redundancy of neighboring video frames, we adopt a sparse sampling strategy to decrease the computational cost. Our architecture is trained and evaluated on the standard video action benchmarks HMDB51 and UCF101. Experimental results show our approach achieves state-of-the-art performance on HMDB51 (73.04%) and UCF101 (95.27%).
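The sparse sampling strategy mentioned above can be sketched as follows, assuming TSN-style segment sampling (one representative frame per equal-length segment); the segment count is an illustrative choice, not necessarily the paper's setting.

```python
def sparse_sample(num_frames, num_segments):
    """Divide a video into equal temporal segments and pick the middle
    frame of each, so only num_segments frames are processed per clip.

    Returns a sorted list of frame indices, one per segment.
    """
    seg_len = num_frames / num_segments
    return [int(seg_len * i + seg_len / 2) for i in range(num_segments)]

# A 250-frame clip reduced to 8 representative frames.
print(sparse_sample(250, 8))
```

This keeps temporal coverage of the whole clip while cutting the per-clip compute by roughly the ratio of frames to segments.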
Authors:
He, Xiuling; Fang, Jing; Cheng, Hercy N. H.; Men, Qibin; Li, Yangyang
Journal:
Education and Information Technologies, 2023, 28(9): 11401-11422. ISSN: 1360-2357
Corresponding author:
Jing Fang
Author affiliations:
[He, Xiuling; Men, Qibin] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan, Hubei, Peoples R China.; [Fang, Jing; Li, Yangyang] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan, Hubei, Peoples R China.; [Cheng, Hercy N. H.] Taipei Med Univ, Ctr Gen Educ, Taipei City, Taiwan.
Corresponding institution:
[Jing Fang] National Engineering Research Center for E-learning, Central China Normal University, Wuhan City, China
Journal:
Library Hi Tech, 2023, 41(4): 1039-1062. ISSN: 0737-8831
Author affiliations:
[Li, Yating] National Engineering Laboratory for Educational Big Data, Central China Normal University, Wuhan, China; [Zhou, Chi; Wu, Di; Chen, Min] National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China
Keywords:
Teachers' information literacy; Information literacy evaluation; Online information behavior; Online learning and teaching platform; Process evaluation; Supervised learning models
Abstract:
Purpose
Advances in information technology now permit the recording of massive and diverse process data, thereby making data-driven evaluations possible. This study discusses whether teachers’ information literacy can be evaluated based on their online information behaviors on online learning and teaching platforms (OLTPs).
Design/methodology/approach
First, to evaluate teachers' information literacy, process data from teachers on the OLTP were combined to describe nine third-level indicators across the richness, diversity, usefulness, and timeliness dimensions. Second, propensity score matching (PSM) and difference tests were used to analyze the differences between the performance groups with reduced selection bias. Third, to effectively predict each teacher's information literacy score, four sets of input variables were fed into supervised learning models for prediction.
Findings
The results show that the high-performance group performs better than the low-performance group on 6 indicators. In addition, information-based teaching and behavioral research data best reflect the level of information literacy. In the future, more in-depth exploration is needed with richer online information behavioral data and a more effective evaluation model to increase evaluation accuracy.
Originality/value
The evaluation based on online information behaviors has concrete application scenarios, positively correlated results, and interpretable predictions. Therefore, information literacy evaluations based on behaviors have great potential and favorable prospects.
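Propensity score matching, as used in the design above, can be sketched in a few lines. This is a generic illustration with a hand-rolled logistic propensity model and greedy 1:1 nearest-neighbour matching, not the study's analysis pipeline; the covariates and group labels below are synthetic.

```python
import numpy as np

def propensity_scores(X, treated, lr=0.1, steps=500):
    """Fit a logistic-regression propensity model by gradient descent
    and return P(treated | X) for every unit."""
    X1 = np.hstack([np.ones((len(X), 1)), X])  # add intercept column
    w = np.zeros(X1.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X1 @ w))
        w -= lr * X1.T @ (p - treated) / len(X)  # log-loss gradient step
    return 1.0 / (1.0 + np.exp(-X1 @ w))

def match(ps, treated):
    """Greedy 1:1 matching without replacement on propensity score.

    Returns (treated_index, control_index) pairs.
    """
    t_idx = np.where(treated == 1)[0]
    c_idx = list(np.where(treated == 0)[0])
    pairs = []
    for i in t_idx:
        j = min(c_idx, key=lambda c: abs(ps[i] - ps[c]))  # closest control
        pairs.append((i, j))
        c_idx.remove(j)  # each control is used at most once
    return pairs

# Synthetic example: 6 "high-performance" and 14 "low-performance" units
# with two covariates.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))
treated = np.array([1] * 6 + [0] * 14, dtype=float)
ps = propensity_scores(X, treated)
pairs = match(ps, treated)
print(len(pairs))  # one matched control per treated unit
```

Difference tests between the matched groups then compare outcomes with the covariate-driven selection bias reduced.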
Journal:
Multimedia Tools and Applications, 2023, 82(9): 14091-14105. ISSN: 1380-7501
Corresponding author:
Shixin Peng
Author affiliations:
[Peng, Shixin; Tan, Lei; Chen, Chang; Chen, Jingying] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.
Corresponding institution:
[Shixin Peng] National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China
Keywords:
Person re-identification; Cross modality; Channel decoupling
Abstract:
Cross-modality person re-identification (CM-ReID) is a very challenging problem due to the discrepancy in data distributions between the visible and near-infrared modalities. To obtain a robust shared feature representation, existing methods mainly focus on image generation or feature constraints to decrease the modality discrepancy, which ignores the large gap between mixed-spectral visible images and single-spectral near-infrared images. In this paper, we address the problem by decoupling the mixed-spectral visible images into three single-spectral subspaces: R, G, and B. By aligning the spectrum, we observed that even using a single spectral image instead of the VIS images could result in better performance. Based on this observation, we further introduce a clear and effective three-path channel decoupling network (CDNet) for combining the three spectral images. Extensive experiments on the benchmark CM-ReID datasets SYSU-MM01 and RegDB indicate that our method achieves state-of-the-art performance and outperforms existing approaches by a large margin. On the RegDB dataset, the absolute gains of our method in rank-1 accuracy and mAP are well over 15.4% and 8.5%, respectively, compared with the state-of-the-art methods.
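The channel-decoupling idea (splitting a mixed-spectral visible image into R, G, and B single-spectral inputs, each processed by its own path) can be illustrated minimally. The per-path feature extractor below is a stand-in, not the CDNet backbone.

```python
import numpy as np

def decouple_channels(img):
    """Split an (H, W, 3) visible image into three (H, W, 1)
    single-spectral maps, one per R/G/B channel."""
    return [img[:, :, c:c + 1] for c in range(3)]

def three_path_features(img, extract):
    """Run each single-spectral map through its own feature path
    and concatenate the resulting feature vectors."""
    return np.concatenate([extract(x) for x in decouple_channels(img)])

# Stand-in feature extractor: the global average of each channel.
img = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)
feats = three_path_features(img, lambda x: np.array([x.mean()]))
print(feats)  # [4.5 5.5 6.5]
```

Each single-spectral path thus sees an input whose spectral width is comparable to a near-infrared image, which is the alignment the abstract describes.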
Journal:
Artificial Intelligence in Medicine, 2023, 145: 102677. ISSN: 0933-3657
Corresponding author:
Jiang, XP
Author affiliations:
[Fu, Chengcheng] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.; [Jiang, Xingpeng; Fu, Chengcheng; He, Tingting] Cent China Normal Univ, Sch Comp Sci, Wuhan, Peoples R China.; [Fu, Chengcheng; van Harmelen, Frank; Huang, Zhisheng] Vrije Univ Amsterdam, Dept Comp Sci, Amsterdam, Netherlands.; [Fu, Chengcheng; He, Tingting; Jiang, Xingpeng] Cent China Normal Univ, Natl Language Resources Monitor Res Ctr Network Me, Wuhan, Peoples R China.; [Huang, Zhisheng] Tongji Univ, Sch Med, Clin Res Ctr Mental Disorders, Shanghai Pudong New Area Mental Hlth Ctr, Shanghai, Peoples R China.
Corresponding institution:
[Jiang, XP] Cent China Normal Univ, Sch Comp Sci, Wuhan, Peoples R China.
Keywords:
Food; Gut microbiota; Knowledge graph; Mental health
Abstract:
In recent years, cross-modal hashing has attracted increasing attention due to its fast retrieval speed and low storage requirements. However, labeled datasets are limited in real applications, and existing unsupervised cross-modal hashing algorithms usually employ heuristic geometric priors as semantics, which introduces serious deviations because similarity scores computed from the original features cannot reasonably represent the relationships among instances. In this paper, we study unsupervised deep cross-modal hash retrieval and propose a novel Semantic Graph Evolutionary Hashing (SGEH) method to solve the above problem. The key novelty of SGEH is its evolutionary affinity graph construction method. Concretely, we explore a sparse similarity graph built from clustering results, which evolves by fusing affinity information from a code-driven graph on the intrinsic data and subsequently extends to a dense hybrid semantic graph that constrains the hash code learning process to produce more discriminative results. Moreover, batch inputs are chosen from the edge set rather than from vertices to better exploit the original spatial information in the sparse graph. Experiments on four benchmark datasets demonstrate the superiority of our framework over state-of-the-art unsupervised cross-modal retrieval methods. Code is available at: https://github.com/theusernamealreadyexists/SGEH.
Abstract:
We present a keyphrase extraction algorithm named TopicLPRank in this paper, which improves the TopicRank algorithm. Unlike TopicRank, which uses only the relative distance information of the text, we argue that the length and absolute position of candidate keyphrases also influence the extraction results. TopicLPRank therefore incorporates these two factors on top of TopicRank. The experimental results show that adding the position information and the length information of candidate keyphrases increases the F-score of the model by around 2.7 and 1.7 percentage points, respectively, equivalent to relative improvements of 19.6% and 12.3% over TopicRank. At the same time, fusing the length and position information of the candidate keyphrases increases the F-score by around 3.5 percentage points, a relative improvement of 25.21% over TopicRank on the NUS dataset.
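The kind of position and length weighting that TopicLPRank layers on top of TopicRank-style graph scores can be sketched as follows; the specific weighting functions here are illustrative assumptions, not the paper's exact formulas.

```python
def weighted_score(base_score, first_pos, phrase_len, doc_len):
    """Adjust a candidate keyphrase's graph score with position and length cues.

    base_score: the TopicRank-style graph score of the candidate.
    first_pos:  word offset of the phrase's first occurrence (earlier is better).
    phrase_len: number of words in the phrase (longer, up to a cap, is better).
    """
    position_w = 1.0 / (1.0 + first_pos / doc_len)  # decays with absolute position
    length_w = min(phrase_len, 3) / 3.0             # saturates at 3-word phrases
    return base_score * position_w * length_w

# Two candidates with equal graph scores: the earlier, longer phrase wins.
early = weighted_score(0.5, first_pos=0, phrase_len=2, doc_len=100)
late = weighted_score(0.5, first_pos=80, phrase_len=1, doc_len=100)
print(early > late)  # True
```

Ranking candidates by the adjusted score rather than the raw graph score is what lets the two extra signals change which phrases are extracted.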