Journal:
Current Psychology, 2023, 42(27): 23687-23697. ISSN: 1046-1310
Corresponding author:
Zhongling Pi
Author affiliations:
[Yang, Jiumin; Liu, Caixia; Zhang, Yi] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.;[Wu, Changcheng] Cent China Normal Univ, Natl Engn Res Ctr Learning, Artificial Intelligence Educ Div, Wuhan 430079, Peoples R China.;[Wu, Changcheng] Sichuan Normal Univ, Coll Comp Sci, Chengdu 610101, Peoples R China.;[Pi, Zhongling] Shaanxi Normal Univ, Minist Educ, Key Lab Modern Teaching Technol, 199 South Changan Rd, Xian 710062, Shaanxi, Peoples R China.
Corresponding institution:
[Pi, Z.] K;Key Laboratory of Modern Teaching Technology (Ministry of Education), Shaanxi Normal University, No. 199 South Chang’an Road, Yanta District, Shaanxi Province, Xi’an, China
Author affiliations:
[Ullah, Anwar; Yu, Xinguo; Yu, XG] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Numan, Muhammad] Cent China Normal Univ, Wollongong Joint Inst, Wuhan 430079, Peoples R China.
Corresponding institution:
[Yu, XG ] C;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.
Abstract:
Generating realistic synthetic video from text is highly challenging because of issues such as digit deformation, noise interference between frames, blurred output, and the need for temporal coherence across frames. In this paper, we propose a novel approach for generating coherent videos of moving digits from textual input using a Deep Deconvolutional Generative Adversarial Network (DD-GAN). The DD-GAN comprises a Deep Deconvolutional Neural Network (DDNN) as the Generator (G) and a modified Deep Convolutional Neural Network (DCNN) as the Discriminator (D) to ensure temporal coherence between adjacent frames. The proposed approach involves several steps. First, the input text is fed into a Long Short-Term Memory (LSTM) based text encoder and then smoothed using Conditioning Augmentation (CA) to enhance the effectiveness of the Generator. Next, the DDNN generates video frames from the enhanced text and random noise, while the modified DCNN acts as the Discriminator, distinguishing between generated and real videos. This research evaluates the quality of the generated videos using standard metrics such as Inception Score (IS), Frechet Inception Distance (FID), Frechet Inception Distance for video (FID2vid), and Generative Adversarial Metric (GAM), along with a human study based on realism, coherence, and relevance. Experiments on Single-Digit Bouncing MNIST GIFs (SBMG), Two-Digit Bouncing MNIST GIFs (TBMG), and a custom dataset of essential mathematics videos with related text show significant improvements in both the metrics and the human study results, confirming the effectiveness of DD-GAN. This research also took on the challenge of generating preschool math videos from text, handling complex structures, digits, and symbols, and achieved successful results. The proposed research demonstrates promising results for generating coherent videos from textual input.
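To make the Conditioning Augmentation step in the abstract more concrete, the sketch below samples a smoothed conditioning vector from a Gaussian. In a real model the mean and log-variance would be predicted from the LSTM text embedding. The function names, the toy values, and the KL regularizer are assumptions for illustration only, not the authors' implementation.

```python
import math
import random


def conditioning_augmentation(mu, log_var, rng):
    """Sample c = mu + sigma * eps with eps ~ N(0, 1) (reparameterization)."""
    if len(mu) != len(log_var):
        raise ValueError("mu and log_var must have the same length")
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), a regularizer commonly paired with CA."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


rng = random.Random(0)
mu = [0.2, -0.1, 0.5]         # hypothetical mean derived from the text embedding
log_var = [-2.0, -2.0, -2.0]  # hypothetical log-variance
c = conditioning_augmentation(mu, log_var, rng)
```

The sampled vector `c` would then be concatenated with random noise and passed to the generator. Sampling around the mean, rather than using the embedding directly, is what smooths the conditioning input.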
Journal:
Neural Computing and Applications, 2023, 35(11): 8343-8356. ISSN: 0941-0643
Corresponding author:
Baolin Yi
Author affiliations:
[Shen, Xiaoxuan; Wang, Wei; Zhang, Huanyu; Li, Zhifei; Yi, Baolin] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.
Corresponding institution:
[Baolin Yi] N;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China
Keywords:
Link prediction;Knowledge graph embedding;Convolution neural network;Feature interaction;Complex relations
Abstract:
Most knowledge graphs (KGs) are large, incomplete graph-structured databases, which can be completed by predicting missing links from existing knowledge. The mainstream method is knowledge graph embedding (KGE), which is designed to learn low-dimensional embeddings of entities and relations. However, knowledge graph embedding still faces two major issues: (1) How to generate more expressive embeddings? (2) How to resolve the semantic polysemy of entities across different relations? In this paper, we propose a novel KG embedding model, RIECN (Relation-based Interactive Embedding Convolutional Network), which achieves high-quality performance and shows some advancements in modeling complex relations. In RIECN, the FIR (Feature Interaction Reshaping) method is introduced to increase the feature interactions between entity and relation embeddings and so generate more expressive feature maps. In addition, a new method for generating relation-based dynamic convolution filters, RDCF, is proposed. RDCF generates relation-specific, hybrid-size convolution filters, which enrich the feature maps of each entity and improve the accuracy of the link prediction task, especially in complex-relation scenarios. We tested the performance of our model on five benchmark datasets. The experimental results show that the RIECN model significantly outperforms recent state-of-the-art models by 0.1–3.2% and 1.1–3.7% in terms of the MRR and Hits@1 metrics, respectively.
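The ranking metrics reported in this abstract are conventionally computed from the rank that a model assigns to the correct entity for each test triple. The sketch below shows mean reciprocal rank and hits-at-k on a hypothetical list of ranks. The ranks are invented for illustration and are not results from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples (ranks start at 1)."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test triples whose correct entity is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


ranks = [1, 2, 4, 1]  # hypothetical ranks of the correct entity
mrr = mean_reciprocal_rank(ranks)
hits1 = hits_at_k(ranks, 1)
```

Both metrics lie in (0, 1], and higher is better. Hits at 1 counts only exact top-ranked predictions, while the reciprocal rank also gives partial credit for near misses.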
Journal:
Journal of King Saud University - Computer and Information Sciences, 2023, 35(7): 101605. ISSN: 1319-1578
Corresponding author:
Zhang, Q
Author affiliations:
[Wang, Xiaoguang; Wang, Shutong] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China.;[Wang, Shutong] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.;[Zhao, Anran] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China.;[Lai, Chenghang] Fudan Univ, Sch Comp Sci, Shanghai 200438, Peoples R China.;[Zhang, Qi; Zhang, Q] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China.
Corresponding institution:
[Zhang, Q ] C;Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China.
Keywords:
Facial expression recognition;Graph convolutional network;Geometry cue;Uncertainty;Emotion label distribution learning
Abstract:
The facial expression recognition (FER) task in the wild is challenging because of uncertainties such as the ambiguity of facial expressions, subjective annotations, and low-quality facial images. This study proposes a novel model for in-the-wild FER datasets to address these uncertainties. The proposed method works as follows. First, a pre-trained network groups the facial images into high- and low-uncertainty sets. A graph convolutional network (GCN) framework is then applied to the low-uncertainty images to obtain geometry cues, including the relationships among action units (AUs) and the implicit connection between AUs and expressions, which help predict the probability of the underlying emotion label. The emotion label distribution is produced by combining the predicted latent label probability with the given label. For the high-uncertainty images, k-nearest neighbor graphs are built to find the k images in the low-uncertainty group that are most similar to the given image. The emotion label distribution of the given image is then replaced by a fusion of its neighbors' emotion label distributions, based on the distances between the given image and its adjacent images. Finally, the constructed emotion label distributions are used to train a convolutional neural network framework to identify facial expressions in a straightforward manner. Experimental results on the RAF-DB, FERPlus, AffectNet, and SFEW2.0 datasets demonstrate that the proposed method achieves superior performance compared with state-of-the-art approaches. (c) 2023 The Author(s). Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
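The neighbor-based step for high-uncertainty images can be sketched as follows. The abstract says only that the fusion is based on distances. The inverse-distance weighting, the Euclidean metric, the function names, and the toy features below are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuse_label_distribution(query, feats, dists, k, eps=1e-8):
    """Fuse the label distributions of the k nearest low-uncertainty samples.

    Neighbors are weighted by inverse distance (an assumed weighting scheme).
    """
    pairs = sorted(zip(feats, dists), key=lambda p: euclidean(query, p[0]))[:k]
    weights = [1.0 / (euclidean(query, f) + eps) for f, _ in pairs]
    total = sum(weights)
    n_classes = len(pairs[0][1])
    return [sum(w * d[c] for w, (_, d) in zip(weights, pairs)) / total
            for c in range(n_classes)]


# Hypothetical low-uncertainty features and their emotion label distributions.
feats = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
dists = [[0.8, 0.2], [0.4, 0.6], [0.0, 1.0]]
fused = fuse_label_distribution([0.5, 0.0], feats, dists, k=2)
```

In this toy case the query is equidistant from its two nearest neighbors, so the fused distribution is their plain average. The weights are normalized, so the result remains a valid probability distribution.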
Author affiliations:
[Yu, Shengquan] Beijing Normal Univ, Adv Innovat Ctr Future Educ, Beijing, Peoples R China;[Zhang, Lishan] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Hubei, Peoples R China;[Huang, Yuwei; Yang, Xi] 17Zuoye, Beijing, Peoples R China;[Zhuang, Fuzhen] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
Corresponding institution:
[Yu, Shengquan] B;Beijing Normal Univ, Adv Innovat Ctr Future Educ, Beijing, Peoples R China.
Abstract:
Automatic short-answer grading has been studied for more than a decade. The technique has been used for implementing auto assessment as well as building the assessor module for intelligent tutoring systems. Many early works automatically grade mainly based on the similarity between a student answer and the reference answer to the question. This method performs well for closed-ended questions that have single or very limited numbers of correct answers. However, some short-answer questions ask students to express their own thoughts based on various facts; hence, they have no reference answers. Such questions are called semi-open-ended short-answer questions. Questions of this type often appear in reading comprehension assessments. In this paper, we developed an automatic semi-open-ended short-answer grading model that integrates both domain-general and domain-specific information. The model also utilizes a long-short-term-memory recurrent neural network to learn the representation in the classifier so that word sequence information is considered. In experiments on 7 reading comprehension questions and over 16,000 short-answer samples, our proposed automatic grading model demonstrates its advantage over existing models.
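For contrast with the proposed model, the similarity-based grading that the abstract attributes to early work can be illustrated with a simple token-overlap score. This is a baseline sketch only. The Jaccard measure, the whitespace tokenization, and the threshold value are assumptions, and this is not the authors' grading model.

```python
def jaccard_similarity(answer, reference):
    """Token-set overlap between a student answer and a reference answer."""
    a = set(answer.lower().split())
    b = set(reference.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def grade_by_similarity(answer, reference, threshold=0.5):
    """Mark an answer correct when its similarity reaches the threshold."""
    return jaccard_similarity(answer, reference) >= threshold


score = jaccard_similarity("the cat sat", "the cat slept")
```

A scheme like this depends entirely on having a reference answer. Semi-open-ended questions have none, which is why the abstract argues that similarity-based grading does not suit them.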
Abstract:
Dialogue state tracking (DST) is a core component of task-oriented dialogue systems. Recent works focus mainly on end-to-end DST models that omit the spoken language understanding (SLU) module to directly obtain the dialogue state based on a user’s dialogue. However, the slot information detected by slot filling in SLU is closely tied to the slot–value pair that needs to be updated in DST. Efficient use of the key slot semantic knowledge obtained by slot filling contributes to improving the performance of DST. Based on this idea, we introduce slot filling as a subtask and build an end-to-end joint model to explicitly integrate the slot information detected by slot filling, which further guides DST. In this article, a novel stack-propagation framework with slot filling for multidomain DST is proposed. The stack-propagation framework is introduced to jointly model slot filling and DST. The framework directly feeds the key slot semantic knowledge detected by slot filling into the DST module. In addition, a slot-masked attention mechanism is designed to enable DST to focus on the key slot information obtained by slot filling. When the slot value is updated, a slot–value softcopy mechanism is designed to enhance the influence of the words marked by key slots. Experiments show that our approach outperforms previous methods and performs outstandingly on two benchmark datasets.
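One plausible reading of the slot-masked attention mentioned in the abstract is a softmax restricted to the tokens that slot filling has tagged as slots. The sketch below shows that general masking technique. The abstract does not specify how the mask is constructed or applied, so the function name, the scores, and the mask are hypothetical.

```python
import math


def masked_attention(scores, mask):
    """Softmax over positions where mask is True; masked positions get weight 0."""
    kept = [s for s, m in zip(scores, mask) if m]
    if not kept:
        raise ValueError("mask must keep at least one position")
    peak = max(kept)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 1.0, 5.0]     # hypothetical attention scores per token
mask = [True, True, False]   # hypothetical slot tags from slot filling
weights = masked_attention(scores, mask)
```

The third token has the highest raw score but receives zero weight because it is not tagged as a slot, so the attention mass is split between the two slot tokens.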
Author affiliations:
[Yang, Zongkai; Liu, Sannyuya; Liu, Zhi; Peng, Xian] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan, Peoples R China.;[Yang, Zongkai; Liu, Sannyuya; Mu, Rui] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.;[Chen, Jia] Cent China Normal Univ, Sch Educ Informat Technol, Wuhan, Peoples R China.
Corresponding institution:
[Xian Peng] N;National Engineering Laboratory for Educational Big Data, Central China Normal University, Wuhan, People’s Republic of China
Journal:
IEEE Transactions on Multimedia, 2022, 24: 2449-2460. ISSN: 1520-9210
Corresponding author:
Fang, S.
Author affiliations:
[Li, Duantengchuan; Zhang, Zhaoli; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Fang, Shuai] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.;[Lin, Ke] Harbin Inst Technol, Control Sci & Engn, Shenzhen 150001, Peoples R China.;[Wang, Jiazhang] Northwestern Univ, Evanston, IL 60208 USA.
Corresponding institution:
[Fang, S.] C;Central China Normal University, National Engineering Laboratory For Educational Big Data, Wuhan, China
Author affiliations:
[Yang, Bing; Liu, Tingting] Hubei Univ, Sch Educ, 368 Youyi Rd, Wuhan 430062, Hubei, Peoples R China.;[Subramanian, Sriram; Liu, Tingting; Zhang, Zhaoli; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Ju, Jianping] Hubei Business Coll, Sch Artificial Intelligence, Wuhan 430079, Peoples R China.;[Tang, Jianyin] Changchun Univ Sci & Technol, Sch Electromech Engn, Changchun 130022, Peoples R China.;[Liu, Hai] UCL, UCL Interact Ctr, London, England.
Corresponding institution:
[Ju, JP ] H;[Liu, H ] C;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;Hubei Business Coll, Sch Artificial Intelligence, Wuhan 430070, Peoples R China.
Author affiliations:
[Tang, Hengtao] Univ South Carolina, Dept Educ Studies, Columbia, SC 29208 USA.;[Dai, Miao; Yang, Shuoqiu; Li, Hao; Du, Xu] Cent China Normal Univ, Natl Engn Res Ctr Learning, Wuhan, Peoples R China.;[Hung, Jui-Long] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan, Peoples R China.;[Hung, Jui-Long] Boise State Univ, Dept Educ Technol, Boise, ID 83725 USA.
Corresponding institution:
[Hengtao Tang] D;Department of Educational Studies, University of South Carolina, Columbia, United States
Keywords:
collaborative problem-solving (CPS);attention;multimodal learning analytics;online;hidden Markov model (HMM)
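Abstract:
Early warning is an important topic in online learning: identifying students at risk of failing helps teachers deliver timely, personalized instructional interventions. This study uses deep learning models to analyze students' micro-level behavior patterns in order to improve early warning, and proposes an early warning model for students at risk of failing that combines LSTM-autoencoder feature processing with attention weight computation (LSTM-autoencoder and attention based early warning model, LAA). The method uses an LSTM-autoencoder to process features from students' behavioral time-series data and an attention mechanism to compute the key predictors. Experimental results show that LAA achieves higher recall than the baseline models, identifies low-interaction and non-persistent students more effectively, and allows instructional interventions to begin earlier. In addition, the method can identify the key weeks and behaviors that affect grades, and can be used to support teachers in guiding online teaching.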
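The abstract evaluates the early warning model by its recall on at-risk students. As a reminder of what that metric measures, the sketch below computes recall for the at-risk class on hypothetical labels. The label encoding (1 for at risk) and the data are assumptions and do not come from the study.

```python
def recall(y_true, y_pred, positive=1):
    """Share of truly at-risk students that the model flags: TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no positive samples in y_true")
    return tp / (tp + fn)


y_true = [1, 1, 1, 0, 0]  # hypothetical ground truth, 1 = at risk of failing
y_pred = [1, 0, 1, 0, 1]  # hypothetical model predictions
r = recall(y_true, y_pred)
```

Recall is a natural focus for early warning because a missed at-risk student (a false negative) receives no intervention, whereas a false alarm costs the teacher only some extra attention.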