Authors:
Li, Jiayuan;Bai, Jie;Zhu, Sha;Yang, Harrison Hao
Journal:
Electronics, 2024, 13(2):385- ISSN:2079-9292
Corresponding authors:
Zhu, S;Yang, HH
Author affiliations:
[Zhu, Sha; Zhu, S; Bai, Jie; Li, Jiayuan] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Yang, HH; Yang, Harrison Hao] SUNY Coll Oswego, Sch Educ, Oswego, NY 13126 USA.
Corresponding affiliations:
[Yang, HH ] S;[Zhu, S ] C;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;SUNY Coll Oswego, Sch Educ, Oswego, NY 13126 USA.
Keywords:
digital literacy;digital game-based assessment;ECGD;AHP;assessment model
Abstract:
This study measured secondary students' digital literacy using a digital game-based assessment system that was designed and developed based on the Evidence-Centered Game Design (ECGD) approach. A total of 188 secondary students constituted the valid cases in this study. Fine-grained behavioral data generated from students' gameplay processes were collected and recorded with the assessment system. The Delphi method was used to extract feature variables related to digital literacy from the process data, and the Analytic Hierarchy Process (AHP) method was used to construct the measurement model. The assessment results of the ECGD-based assessment had a high correlation with standardized test scores, which have been shown to be reliable and valid in prior large-scale assessment studies.
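As an illustration of the AHP step mentioned in this abstract, the sketch below derives priority weights from a pairwise comparison matrix using the row geometric-mean approximation. The matrix values and the function name `ahp_weights` are hypothetical and are not taken from the study; the study's actual feature variables and judgments are not given here.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    # Geometric mean of each row of the reciprocal comparison matrix.
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    # Normalise so the weights sum to 1.
    return [g / total for g in geo_means]

# Hypothetical 3x3 reciprocal matrix: entry [i][j] states how much more
# important criterion i is judged to be than criterion j.
matrix = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(matrix)
```

The resulting weights could then be used to combine feature variables into a single score; a full AHP workflow would also check the consistency ratio of the matrix, which this sketch omits.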
Keywords:
Key-frame extraction;Audiovisual interaction and fusion;Attention mechanism;Emotion recognition;Intra-modality interaction;Cross-modality interaction
Abstract:
Purpose - Although numerous signal modalities are available for emotion recognition, audio and visual modalities are the most common and predominant forms for human beings to express their emotional states in daily communication. Therefore, achieving automatic and accurate audiovisual emotion recognition is important for developing engaging and empathetic human-computer interaction environments. However, two major challenges exist in the field of audiovisual emotion recognition: (1) how to effectively capture representations of each single modality and eliminate redundant features and (2) how to efficiently integrate information from these two modalities to generate discriminative representations. Design/methodology/approach - A novel key-frame extraction-based attention fusion network (KE-AFN) is proposed for audiovisual emotion recognition. KE-AFN attempts to integrate key-frame extraction with multimodal interaction and fusion to enhance audiovisual representations and reduce redundant computation, filling the research gaps of existing approaches. Specifically, a local maximum-based content analysis is designed to extract key-frames from videos for the purpose of eliminating data redundancy. Two modules, the "Multi-head Attention-based Intra-modality Interaction Module" and the "Multi-head Attention-based Cross-modality Interaction Module", are proposed to mine and capture intra- and cross-modality interactions, further reducing data redundancy and producing more powerful multimodal representations. Findings - Extensive experiments on two benchmark datasets (i.e. RAVDESS and CMU-MOSEI) demonstrate the effectiveness and rationality of KE-AFN. Specifically, (1) KE-AFN is superior to state-of-the-art baselines for audiovisual emotion recognition. (2) Exploring the supplementary and complementary information of different modalities can provide more emotional clues for better emotion recognition. (3) The proposed key-frame extraction strategy can improve accuracy by more than 2.79 per cent. (4) Both exploring intra- and cross-modality interactions and employing attention-based audiovisual fusion can lead to better prediction performance. Originality/value - The proposed KE-AFN can support the development of engaging and empathetic human-computer interaction environments.
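The idea of local maximum-based key-frame selection can be sketched as follows, assuming each frame already has a scalar content-change score. The scores, the function name `local_max_keyframes`, and the strict-neighbour rule are illustrative assumptions; the abstract does not specify how KE-AFN computes or thresholds its scores.

```python
def local_max_keyframes(scores):
    """Return indices whose score is strictly greater than both neighbours."""
    return [
        i
        for i in range(1, len(scores) - 1)
        if scores[i] > scores[i - 1] and scores[i] > scores[i + 1]
    ]

# Hypothetical per-frame content-change scores for an 8-frame clip.
frame_scores = [0.1, 0.4, 0.2, 0.2, 0.9, 0.3, 0.5, 0.1]
keyframes = local_max_keyframes(frame_scores)  # frames 1, 4 and 6 are local peaks
```

Keeping only the peak frames discards the near-duplicate frames between them, which is the redundancy reduction the abstract refers to.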
Abstract:
Interictal epileptiform discharges (IEDs), large intermittent electrophysiological events, are associated with various severe brain disorders. Automated IED detection has long been a challenging task, and mainstream methods largely focus on singling out IEDs from the background from the perspective of waveform, leaving normal sharp transients/artifacts with similar waveforms almost unattended. An open issue remains: accurately detecting IED events that directly reflect abnormalities in brain electrophysiological activity while minimizing interference from irrelevant sharp transients that merely have similar waveforms. This study proposes a dual-view learning framework (namely V2IED) to detect IED events from multi-channel EEG by aggregating features from two phases: (1) Morphological Feature Learning: directly treating the EEG as a sequence with multiple channels, a 1D-CNN (Convolutional Neural Network) is applied to explicitly learn the deep morphological features; and (2) Spatial Feature Learning: viewing the EEG as a 3D tensor embedding channel topology, a CNN captures the spatial features at each sampling point, followed by an LSTM (Long Short-Term Memory) network to learn the evolution of these features. Experimental results on a public EEG dataset against state-of-the-art counterparts indicate that: (1) compared with the existing optimal models, V2IED achieves a larger area under the receiver operating characteristic (ROC) curve in detecting IEDs from normal sharp transients with a 5.25% improvement in accuracy; (2) the introduction of spatial features improves performance by 2.4% in accuracy; and (3) V2IED also performs excellently in distinguishing IEDs from background signals, especially benign variants.
Author affiliations:
[Liu, Sannyuya; Yuan, Xin; Yue, Jieyu; Li, Zhen; Li, Qing; Liu, SNYY; Hu, Tianhui; Chen, Sijing; Sun, Jianwen] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;[Liu, Sannyuya; Liu, SNYY] Cent China Normal Univ, Natl Engn Res Ctr E Elearning, Wuhan 430079, Peoples R China.
Corresponding affiliations:
[Liu, SNYY ; Chen, SJ] C;Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Natl Engn Res Ctr E Elearning, Wuhan 430079, Peoples R China.
Abstract:
The purpose of this study was to investigate the frontier, science, and public engagement of educational science research. This paper conducted a systematic literature review of 101 educational science research articles published in Nature and Science in 1982-2021 based on the Web of Science database and analyzed the current status of research in terms of basic publication characteristics, research themes, and research processes. Five research topics were recognized, namely, education policy evaluation and reform, learning mechanisms and learning interventions, science education, educational technology, and education equity. Content of each topic had a distinctive emphasis. Findings revealed that most studies were dominated by empirical research, involving causal relationships between various educational phenomena, diverse range of research subjects, rigorous scientific randomized experiments, and quantitative analysis. We encourage more research on educational science in the future from four feasible directions, namely, developing active learning approaches to promoting effective learning, extending the research subjects and objectives of science education, conducting long-term, large-scale and practice-oriented research, and introducing new research methods into educational research.
Author affiliations:
[Xu, Ruyi; Chen, Jingying] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, 152 Luoyu Rd, Wuhan 430079, Hubei, Peoples R China.;[Chen, Jingying; Han, Jiaxu] Cent China Normal Univ, Natl Engn Res Ctr Elearning, 152 Luoyu Rd, Wuhan 430079, Hubei, Peoples R China.;[Chen, Jingying] Ningbo Yuxing Educ Technol Co Ltd, Ningbo 315200, Zhejiang, Peoples R China.
Corresponding affiliations:
[Jingying Chen] N;National Engineering Research Center of Educational Big Data, Central China Normal University, Wuhan, China;National Engineering Research Center for E-learning, Central China Normal University, Wuhan, China;Ningbo Yuxing Educational Technology Co., Ltd, Ningbo, China
Journal:
Education and Information Technologies, 2024:1-33 ISSN:1360-2357
Corresponding authors:
He, XL;Jiang, CL
Author affiliations:
[He, Xiuling; Fang, Jing; Li, Yangyang; Li, Yue; He, XL; Zhou, Ruijie] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;[He, Xiuling; Fang, Jing; Li, Yangyang; Li, Yue; He, XL; Zhou, Ruijie] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.;[Jiang, CL; Jiang, Chunlian] Univ Macau, Fac Educ, Macau, Peoples R China.
Corresponding affiliations:
[Jiang, CL ] U;[He, XL ] C;Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.;Univ Macau, Fac Educ, Macau, Peoples R China.
Abstract:
Computational thinking (CT), as one of the key skills in the twenty-first century, has been integrated into educational programming as an important learning goal. This study aims to explore CT processes involved in pair programming with the support of visual flow design. Thirty freshmen participated, working in pairs to solve two programming problems. Their discourse was recorded, transcribed, and coded based on a CT framework encompassing cognitive, practical, and social perspectives. Both quantitative and qualitative methods were applied to analyze the data. In particular, Epistemic Network Analysis (ENA) was applied to explore the patterns of their CT processes. The findings revealed that social perspectives emerged most frequently in all pairs’ discourse. The high-level groups (HLGs) focused more on practical and social perspectives, whereas the low-level groups (LLGs) placed more emphasis on cognitive perspectives. The ENA networks revealed that social perspectives mostly centered around cognitive perspectives for all pairs, with CT process patterns in HLGs crossing the three perspectives more frequently. In addition, HLGs exhibited a more complicated and developmental trend in solving the two problems, while LLGs displayed a relatively similar CT pattern across both. The current study provides insights into the design and implementation of collaborative learning activities in educational programming.
Journal:
Education and Information Technologies, 2024:1-32 ISSN:1360-2357
Corresponding authors:
Du, X;Hung, JL
Author affiliations:
[Du, Xu; Li, Hao; Tang, Yeye; Xie, Yiqian] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.;[Hung, Jui-Long; Hung, JL] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.;[Hung, Jui-Long; Hung, JL] Boise State Univ, Dept Educ Technol, 1910 Univ Dr, Boise, ID 83725 USA.;[Tang, Hengtao] Univ South Carolina, Dept Leadership Learning Design & Inquiry, Columbia, SC USA.
Corresponding affiliations:
[Hung, JL ; Du, X ] C;Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.;Boise State Univ, Dept Educ Technol, 1910 Univ Dr, Boise, ID 83725 USA.
Abstract:
Collaborative problem-solving (CPS) involves the interaction and interdependence of students’ social and cognitive skills, making it a complex learning process. To delve into the complex dynamics of CPS, previous research has categorized socio-cognitive roles, providing insights into socio-cognitive frameworks. However, although roles employ specific cognitive and social interaction structures to engage in CPS interactions, most existing research focuses on individual roles and neglects inter-role interactions. To fill this gap, twelve triad groups were formed by engaging 36 undergraduate students in online CPS activities to examine differences in social and cognitive interaction structures across different roles and group compositions. Additionally, the study analyzed the differences in CPS processes among the various group compositions. The analyses identified five roles (Lurkers, Followers, Drivers, Influential Actors, and Innovators) and three group compositions (Balanced groups, Decentralized groups, and Power Struggle groups). The socio-cognitive structure of Balanced groups, along with other evidence, indicates effective information sharing and negotiation interactions. In contrast, Decentralized and Power Struggle groups exhibited various deficiencies in their socio-cognitive structures, negatively impacting group collaboration processes. These insights provide educators with a comprehensive guide to fostering effective group compositions and role dynamics in online CPS settings, thereby enhancing the overall success of CPS. Possible activity design considerations and scaffolding strategies are also discussed.
Author affiliations:
[Zhang, Miao] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Zhang, Miao; He, Tingting; Dong, Ming; Dong, M] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China.;[Zhang, Miao; He, Tingting; Dong, Ming; Dong, M] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Peoples R China.;[He, Tingting; Dong, Ming; Dong, M] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
Corresponding affiliations:
[He, TT; Dong, M ] C;Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
Abstract:
Commonsense question answering (CQA) requires understanding and reasoning over QA context and related commonsense knowledge, such as a structured Knowledge Graph (KG). Existing studies combine language models and graph neural networks to model inference. However, traditional knowledge graphs are mostly concept-based, ignoring the direct path evidence necessary for accurate reasoning. In this paper, we propose MRGNN (Meta-path Reasoning Graph Neural Network), a novel model that comprehensively captures sequential semantic information from concepts and paths. In MRGNN, meta-paths are introduced as direct inference evidence, and an original graph neural network is adopted to aggregate features from both concepts and paths simultaneously. We conduct extensive experiments on the CommonsenseQA and OpenBookQA datasets, showing the effectiveness of MRGNN. We also conduct further ablation experiments and explain the reasoning behavior through a case study.
Journal:
Journal of King Saud University - Computer and Information Sciences, 2024, 36(1):101869 ISSN:1319-1578
Corresponding author:
Wang, XG
Author affiliations:
[Wang, Xiaoguang; Zhao, Wanli; Wang, Shutong; Wang, XG] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China.;[Wang, Shutong] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Li, Duantengchuan] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China.;[Wang, Jing] Chongqing Univ Posts & Telecommun, Sch Automat, Chongqing 400065, Peoples R China.;[Wang, Xiaoguang; Wang, XG; Wang, Jing] Wuhan Univ, Intellectual Comp Lab Cultural Heritage, Wuhan 430072, Peoples R China.
Corresponding affiliations:
[Wang, XG ] W;Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China.;Wuhan Univ, Intellectual Comp Lab Cultural Heritage, Wuhan 430072, Peoples R China.
Keywords:
Head pose estimation;Label distribution learning;Gaussian distribution;Asymmetric;Feature similarity
Abstract:
Head pose estimation plays a pivotal role in various applications, including augmented reality and human–computer interaction within intelligent museum environments. Head pose estimation conventionally relies on hard labels. However, acquiring the “ground truth” through subjective means introduces an element of uncertainty into the labels for head pose estimation. The introduction of soft labels offers a potential remedy for this uncertainty. However, existing head pose estimation methods based on soft labels neglect the asymmetry of head pose. After careful observation, two types of asymmetry have been identified in human head pose: within-angle and between-angle asymmetry. Taking these two characteristics into account, we have devised a Double Asymmetric Distribution Learning (DADL) network model for the precise estimation of head pose angles. This model employs distinct soft label distribution mechanisms to capture within-angle and between-angle nuances in head pose variations, thereby enhancing the interpretability, generalization capability, and classification accuracy of head pose estimation models. Extensive experiments were conducted on various widely recognized benchmarks, including the AFLW2000 and BIWI datasets. The results substantiate substantial advantages of our model over conventional approaches.
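To make the notion of an asymmetric soft label concrete, the sketch below builds a label distribution over angle bins from a two-sided Gaussian whose spread differs on each side of the true angle. This is a generic illustration under assumed values; the bin spacing, the two sigmas, and the function name `asymmetric_soft_label` are hypothetical, and the exact distributions used by DADL are not described in the abstract.

```python
import math

def asymmetric_soft_label(true_angle, bins, sigma_left, sigma_right):
    """Soft label over angle bins using a two-sided (asymmetric) Gaussian."""
    raw = []
    for b in bins:
        # Use a different spread below and above the true angle.
        sigma = sigma_left if b < true_angle else sigma_right
        raw.append(math.exp(-((b - true_angle) ** 2) / (2 * sigma ** 2)))
    total = sum(raw)
    # Normalise so the label is a probability distribution.
    return [r / total for r in raw]

# Hypothetical yaw bins every 3 degrees from -90 to 90.
bins = list(range(-90, 91, 3))
label = asymmetric_soft_label(30, bins, sigma_left=3.0, sigma_right=6.0)
```

With a wider right-hand sigma, more probability mass falls on bins above the true angle than on the mirrored bins below it, while the peak stays at the true angle.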
Author affiliations:
[Li, Duantengchuan; Li, Bing; Xia, Tao] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China.;[Wang, Jing] Chongqing Univ Posts & Telecommun, Sch Automat, Chongqing 400065, Peoples R China.;[Shi, Fobo] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.;[Zhang, Qi; Zhang, Q] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China.;[Li, Bing] Hubei Luojia Lab, Wuhan 430079, Peoples R China.
Corresponding affiliations:
[Li, DTC; Li, B ] W;[Zhang, Q ] C;Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China.;Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China.;Hubei Luojia Lab, Wuhan 430079, Peoples R China.
Keywords:
Link prediction;Knowledge graph embedding;Shallow interaction;Deep interaction;Attention mechanism;Vector tokenization
Abstract:
Inferring missing information from current facts in a knowledge graph (KG) is the target of the link prediction task. Existing methods embed the entities and relations of a KG as a whole into a low-dimensional vector space. Nonetheless, they ignore the multi-level interactions (shallow interactions, deep interactions) among the finer-grained sub-features of entities and relations. To overcome these limitations, we present a shallow-to-deep feature interaction model for knowledge graph embedding (SDFormer). It takes into account the interpretability of sub-feature tokens of entities and relations and learns shallow-to-deep interaction information between entities and relations at a more fine-grained level. Specifically, entity and relation vectors are decomposed into sub-features to represent multi-dimensional information. Then, a shallow-to-deep feature interaction method is designed to capture multi-level interactions between entities and relations. This process enriches the feature representation by modeling the interaction between sub-features. Finally, a 1-X scoring function is utilized to calculate the score of each knowledge triplet. The experimental results on several benchmark datasets show that SDFormer achieves competitive performance and higher training efficiency than the compared models, owing to the shallow-to-deep feature interaction between entities and relations.
Authors:
Du, Xu;Zhang, Lizhao;Hung, Jui-Long;Li, Hao;Tang, Hengtao;...
Journal:
Journal of Computing in Higher Education, 2024, 36(1):29-56 ISSN:1042-1726
Corresponding author:
Jui-Long Hung
Author affiliations:
[Dai, Miao; Li, Hao; Du, Xu] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Hung, Jui-Long; Zhang, Lizhao] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.;[Hung, Jui-Long] Boise State Univ, Dept Educ Technol, 1910 Univ Dr, Boise, ID 83725 USA.;[Tang, Hengtao] Univ South Carolina, Dept Educ Studies, Charleston, SC USA.
Corresponding affiliations:
[Jui-Long Hung] N;National Engineering Laboratory for Educational Big Data, Central China Normal University, Wuhan, China;Department of Educational Technology, Boise State University, Boise, USA
Abstract:
This study aims to track college students’ on-task rate during the teaching process and to analyze the influence of instructional strategies on on-task rate through observable and internal engagement indicators. Thirty-six undergraduate students at a higher education institution in China participated in the study. Students’ behaviors and their EEG signals were recorded from fifty-one learning activities. The analyses focused on identifying the determinants of students’ engagement levels and revealing the impacts of behavioral sequences and cognitive sequences on students’ engagement levels. The results show that (1) instructional strategies, classroom behaviors, and cognitive states were significant predictors of students’ on-task rate; (2) the continuity of classroom behaviors improved the on-task rate; and (3) the standard deviations of attention and cognitive load were positively correlated with the on-task rate. This study describes a case of integrating multimodal data analysis in classroom teaching and discusses practical implications for improving classroom teaching.
Author affiliations:
[Hengtao Tang] Department of Leadership, Learning Design, and Inquiry, University of South Carolina, Columbia, USA;[Yeye Tang; Miao Dai; Xu Du; Hao Li] National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China;Department of Educational Technology, Boise State University, Boise, USA;National Engineering Laboratory for Educational Big Data, Central China Normal University, Wuhan, China;[Jui-Long Hung] Department of Educational Technology, Boise State University, Boise, USA;National Engineering Laboratory for Educational Big Data, Central China Normal University, Wuhan, China
Corresponding affiliations:
[Hengtao Tang] D;Department of Leadership, Learning Design, and Inquiry, University of South Carolina, Columbia, USA
Abstract:
Blended learning, integrating online and in-person components, has been increasingly adopted in higher education to enhance students’ learning experience and outcomes. While the advantages of blended learning are well-evidenced, research has primarily focused on the online pre-learning component, neglecting the significance of in-class activities. In-class activities play a crucial role in affording active learning opportunities (e.g., discussion, elaboration), necessitating a systemic understanding of their dynamics. The purpose of this study was thus to systemically investigate college students’ learning behaviors during in-class activities in a blended course. In-class activities were video-recorded and labelled manually following a coding scheme. By establishing a linear regression model, the study identified listening to the instructor’s lecture and taking notes as two predictors of students’ learning gains. Additionally, sequential patterns of learning behaviors during in-class activities were examined. The reciprocal interactions between students’ behavior of listening to the lecture and their note-taking actions were noted. The findings of this study contributed to a systemic view of blended learning by shedding light on students’ learning behaviors and their implications for instructional practice.
Abstract:
Intelligent tutoring systems (ITS) have received much attention recently as online learning has taken off and is replacing offline instruction in many cases. An ITS analyses user behavior and customizes personalized learning strategies for users through artificial intelligence technology. An ITS encompasses a variety of entities and multiple relations, making it suitable to be represented as a graph. This aligns well with the use of graph embedding (GE) for downstream ITS tasks. Existing GE methods cannot effectively model ITS data because the user evolution in ITS is discrete in time. The patterns of variation in user states are similar to each other but not correlated at the temporal level. Because of the hierarchical structure caused by the discrete evolution, encoding ITS data in a hyperbolic space is more sensible. We define a discrete evolution graph (DEG) to characterize ITS and propose a method called DEGE to embed it. The static nodes in a DEG are projected randomly and then transformed into hyperbolic space. Next, hyperbolic evolution networks are employed to generate the embeddings of dynamic nodes. The aggregated features of each node are then delivered by hyperbolic aggregation networks and are concatenated to generate the final higher-order features. To validate the method, we design a multi-objective loss function that preserves pairwise proximity and link types, and train the model on several real datasets. The experimental results demonstrate that our method outperforms other baselines on both question annotation and performance prediction in ITS.
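The step of transforming a Euclidean vector into hyperbolic space can be illustrated with the exponential map at the origin of the Poincaré ball. This is a minimal sketch that assumes the Poincaré ball model with curvature parameter `c`; the abstract does not say which hyperbolic model DEGE uses, and the function name `expmap0` is chosen here for illustration.

```python
import math

def expmap0(v, c=1.0):
    """Exponential map at the origin of the Poincare ball with curvature -c."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        # The origin maps to itself.
        return list(v)
    sqrt_c = math.sqrt(c)
    # tanh squashes the norm so the result lies strictly inside the ball.
    scale = math.tanh(sqrt_c * norm) / (sqrt_c * norm)
    return [scale * x for x in v]

# Hypothetical Euclidean embedding of a static node.
point = expmap0([3.0, 4.0])
```

The mapped point keeps the direction of the input vector, while its norm is bounded by 1/sqrt(c), so distant Euclidean points are compressed toward the boundary of the ball.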
Author affiliations:
[Liu, Leyuan; Wang, Guangshuai; Liu, Lili; Zhang, Kun; Chen, Jingying] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;[Liu, Leyuan; Wang, Guangshuai; Liu, Lili; Zhang, Kun; Chen, Jingying] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Liu, Leyuan; Yao, Xinyu; Wang, Guangshuai; Liu, Lili; Zhang, Kun; Chen, Jingying] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.;[Ling, Yutao; Ling, YT] Cent China Normal Univ, Coll Phys Sci & Technol, Wuhan 430079, Peoples R China.
Corresponding affiliations:
[Ling, YT ; Wang, GS ] C;Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Coll Phys Sci & Technol, Wuhan 430079, Peoples R China.
Keywords:
virtual reality;autism spectrum disorder;safety skills;skills training
Abstract:
In recent years, virtual reality technology, which is able to simulate real-life environments, has been widely used in the field of intervention for individuals with autism and has demonstrated distinct advantages. This review aimed to evaluate the impact of virtual reality technology on safety skills intervention for individuals with autism. After searching and screening three databases, a total of 20 pertinent articles were included. There were six articles dedicated to the VR training of street-crossing skills for individuals with autism, nine articles focusing on the training of driving skills for individuals with ASD, and three studies examining the training of bus riding for individuals with ASD. Furthermore, there were two studies on the training of air travel skills for individuals with ASD. First, we found that training in some complex skills (e.g., driving skills) should be reserved for older, high-functioning individuals with ASD, whose capacity to participate in the training should be determined using scales or questionnaires before the intervention; VR devices with higher levels of immersion are not suitable for younger individuals with ASD. Second, VR is effective in training safety skills for individuals with ASD, but there is not enough evidence to determine the relationship between the level of VR immersion and intervention effects. The degree of virtual reality involvement affects how well the skills of individuals with ASD generalize to the real world, so it is important to ensure that future virtual reality settings are realistic and lifelike. Third, adaptive models that provide personalized training to individuals with ASD in VR environments are very promising, and future research should continue in this direction. This paper also discusses the limitations of these studies, as well as potential future research directions.
Author affiliations:
[Zhuo Wang; Wenkai Huang] Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China;National Engineering Research Center of Big Data, Central China Normal University, Wuhan 430079, China;[Yuqun Wen] Faculty of Literature and Journalism, Xiangtan University, Xiangtan 411105, China
Corresponding affiliations:
[Shengming Wang] N;National Engineering Research Center of Big Data, Central China Normal University, Wuhan 430079, China
Abstract:
Teaching gesture recognition is a technique used to recognize the hand movements of teachers in classroom teaching scenarios. This technology is widely used in education, including for classroom teaching evaluation, enhancing online teaching, and assisting special education. However, current research on gesture recognition in teaching mainly focuses on detecting the static gestures of individual students and analyzing their classroom behavior. To analyze the teacher’s gestures and mitigate the difficulty of single-target dynamic gesture recognition in multi-person teaching scenarios, this paper proposes skeleton-based teaching gesture recognition (ST-TGR), which learns through spatio-temporal representation. This method mainly uses the human pose estimation technique RTMPose to extract the coordinates of the keypoints of the teacher’s skeleton and then inputs the recognized sequence of the teacher’s skeleton into the MoGRU action recognition network for classifying gesture actions. The MoGRU action recognition module mainly learns the spatio-temporal representation of target actions by stacking a multi-scale bidirectional gated recurrent unit (BiGRU) and using improved attention mechanism modules. To validate the generalization of the action recognition network model, we conducted comparative experiments on datasets including NTU RGB+D 60, UT-Kinect Action3D, SBU Kinect Interaction, and Florence 3D. The results indicate that, compared with most existing baseline models, the model proposed in this article exhibits better performance in recognition accuracy and speed.
Journal:
Information Sciences, 2024, 666:120438 ISSN:0020-0255
Corresponding author:
Bing Li
Author affiliations:
[Duantengchuan Li; Chao Zheng] School of Computer Science, Wuhan University, Wuhan 430072, China;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China;[Xiaoguang Wang] School of Information Management, Wuhan University, Wuhan 430072, China;[Yuefeng Cai] ZTE Corporation, Shenzhen, 518057, China;Hubei Luojia Laboratory, Wuhan 430079, China
Corresponding affiliations:
[Bing Li] S;School of Computer Science, Wuhan University, Wuhan 430072, China;Hubei Luojia Laboratory, Wuhan 430079, China
Abstract:
Knowledge graphs are multi-relation heterogeneous graphs. Thus, the existence of numerous multi-relation entities poses a tough challenge to the modelling of the knowledge graph. Some recent works represent the properties of entities and relations by generating embeddings and attempt to identify missing entities by translation operations or semantic matching. However, the expressiveness of these approaches depends on the entity (relation) embeddings. The heterogeneity of entities makes it difficult to balance a uniform embedding dimension across complex and sparse relational entities, as high-dimensional embeddings lead to overfitting of sparse relational entities, and low-dimensional embeddings lead to underfitting of complex relational entities. We introduce a multi-perspective knowledge graph embedding model with global and interaction features (MGIF) to alleviate these issues. MGIF achieves knowledge transfer from complex relational entities to sparse relational entities through multi-view features. In particular, to overcome the local limitations of convolutional neural networks, the global features shared between entities (relations) and entities (relations) are incorporated in MGIF. The performance of MGIF is experimentally evaluated on several datasets. The experimental results demonstrate that MGIF can efficiently model complicated entities and achieve state-of-the-art complex relationship prediction results on most evaluation metrics.