Abstract:
Interictal epileptiform discharges (IEDs), large intermittent electrophysiological events, are associated with various severe brain disorders. Automated IED detection has long been a challenging task, and mainstream methods largely focus on singling out IEDs from the background on the basis of waveform, leaving normal sharp transients and artifacts with similar waveforms almost unattended. It remains an open issue to accurately detect IED events that directly reflect abnormalities in brain electrophysiological activity while minimizing interference from irrelevant sharp transients that merely share similar waveforms. This study proposes a dual-view learning framework (namely V2IED) to detect IED events from multi-channel EEG by aggregating features from two phases: (1) Morphological Feature Learning: treating the EEG directly as a multi-channel sequence, a 1D-CNN (Convolutional Neural Network) is applied to explicitly learn deep morphological features; and (2) Spatial Feature Learning: viewing the EEG as a 3D tensor embedding channel topology, a CNN captures the spatial features at each sampling point, followed by an LSTM (Long Short-Term Memory) network that learns the evolution of these features. Experimental results on a public EEG dataset against state-of-the-art counterparts indicate that: (1) compared with the best existing models, V2IED achieves a larger area under the receiver operating characteristic (ROC) curve in distinguishing IEDs from normal sharp transients, with a 5.25% improvement in accuracy; (2) the introduction of spatial features improves accuracy by 2.4%; and (3) V2IED also performs well in distinguishing IEDs from background signals, especially benign variants.
Author affiliations:
[Liu, Sannyuya; Yuan, Xin; Yue, Jieyu; Li, Zhen; Li, Qing; Liu, SNYY; Hu, Tianhui; Chen, Sijing; Sun, Jianwen] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;[Liu, Sannyuya; Liu, SNYY] Cent China Normal Univ, Natl Engn Res Ctr E Elearning, Wuhan 430079, Peoples R China.
Corresponding affiliations:
[Liu, SNYY ; Chen, SJ] C;Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Natl Engn Res Ctr E Elearning, Wuhan 430079, Peoples R China.
Abstract:
The purpose of this study was to investigate the frontier, science, and public engagement of educational science research. This paper conducted a systematic literature review of 101 educational science research articles published in Nature and Science in 1982-2021 based on the Web of Science database and analyzed the current status of research in terms of basic publication characteristics, research themes, and research processes. Five research topics were recognized, namely, education policy evaluation and reform, learning mechanisms and learning interventions, science education, educational technology, and education equity. Content of each topic had a distinctive emphasis. Findings revealed that most studies were dominated by empirical research, involving causal relationships between various educational phenomena, diverse range of research subjects, rigorous scientific randomized experiments, and quantitative analysis. We encourage more research on educational science in the future from four feasible directions, namely, developing active learning approaches to promoting effective learning, extending the research subjects and objectives of science education, conducting long-term, large-scale and practice-oriented research, and introducing new research methods into educational research.
Authors:
Li, Jiayuan;Bai, Jie;Zhu, Sha;Yang, Harrison Hao
Journal:
Electronics, 2024, 13(2): 385- ISSN: 2079-9292
Corresponding author:
Zhu, S;Yang, HH
Author affiliations:
[Zhu, Sha; Zhu, S; Bai, Jie; Li, Jiayuan] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Yang, HH; Yang, Harrison Hao] SUNY Coll Oswego, Sch Educ, Oswego, NY 13126 USA.
Corresponding affiliations:
[Yang, HH ] S;[Zhu, S ] C;Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;SUNY Coll Oswego, Sch Educ, Oswego, NY 13126 USA.
Keywords:
digital literacy;digital game-based assessment;ECGD;AHP;assessment model
Abstract:
This study measured secondary students' digital literacy using a digital game-based assessment system that was designed and developed based on the Evidence-Centered Game Design (ECGD) approach. A total of 188 secondary students constituted the valid cases in this study. Fine-grained behavioral data generated from students' gameplay processes were collected and recorded with the assessment system. The Delphi method was used to extract feature variables related to digital literacy from the process data, and the Analytic Hierarchy Process (AHP) method was used to construct the measurement model. The assessment results of the ECGD-based assessment had a high correlation with standardized test scores, which have been shown to be reliable and valid in prior large-scale assessment studies.
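The abstract above names the Analytic Hierarchy Process (AHP) as the method used to build the measurement model but gives no detail. The following is a minimal sketch of how AHP weights are commonly derived from a pairwise comparison matrix, using the row geometric-mean approximation and Saaty's consistency ratio. The three indicators and the comparison values are hypothetical and are not taken from the study.

```python
import math

# Saaty's commonly tabulated random consistency indices for small matrices.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(pairwise, weights):
    """Return CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(pairwise)
    # (A w)_i / w_i averaged over rows estimates the principal eigenvalue.
    weighted = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(weighted[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgements: indicator A is twice as important as B and
# four times as important as C; B is twice as important as C.
matrix = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(matrix)
cr = consistency_ratio(matrix, weights)
print([round(w, 4) for w in weights], round(cr, 4))
```

A consistency ratio below roughly 0.1 is the usual rule of thumb for accepting the expert judgements; the example matrix is perfectly consistent by construction.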
Author affiliations:
[Zhang, Miao] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Zhang, Miao; He, Tingting; Dong, Ming; Dong, M] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China.;[Zhang, Miao; He, Tingting; Dong, Ming; Dong, M] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Peoples R China.;[He, Tingting; Dong, Ming; Dong, M] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
Corresponding affiliations:
[He, TT; Dong, M ] C;Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.
Abstract:
Commonsense question answering (CQA) requires understanding and reasoning over the QA context and related commonsense knowledge, such as a structured Knowledge Graph (KG). Existing studies combine language models and graph neural networks to model inference. However, traditional knowledge graphs are mostly concept-based, ignoring the direct path evidence necessary for accurate reasoning. In this paper, we propose MRGNN (Meta-path Reasoning Graph Neural Network), a novel model that comprehensively captures sequential semantic information from concepts and paths. In MRGNN, meta-paths are introduced as direct inference evidence, and an original graph neural network is adopted to aggregate features from both concepts and paths simultaneously. We conduct extensive experiments on the CommonsenseQA and OpenBookQA datasets, showing the effectiveness of MRGNN. We also conduct further ablation experiments and explain the reasoning behavior through a case study.
Author affiliations:
[Li, Duantengchuan; Li, Bing; Xia, Tao] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China.;[Wang, Jing] Chongqing Univ Posts & Telecommun, Sch Automat, Chongqing 400065, Peoples R China.;[Shi, Fobo] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.;[Zhang, Qi; Zhang, Q] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China.;[Li, Bing] Hubei Luojia Lab, Wuhan 430079, Peoples R China.
Corresponding affiliations:
[Li, DTC; Li, B ] W;[Zhang, Q ] C;Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China.;Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China.;Hubei Luojia Lab, Wuhan 430079, Peoples R China.
Keywords:
Link prediction;Knowledge graph embedding;Shallow interaction;Deep interaction;Attention mechanism;Vector tokenization
Abstract:
Inferring missing information from current facts in a knowledge graph (KG) is the target of the link prediction task. Existing methods embed the entities and relations of a KG as a whole into a low-dimensional vector space. Nonetheless, they ignore the multi-level interactions (shallow interactions and deep interactions) among the finer-grained sub-features of entities and relations. To overcome this limitation, we present a shallow-to-deep feature interaction model for knowledge graph embedding (SDFormer). It takes into account the interpretability of sub-feature tokens of entities and relations and learns shallow-to-deep interaction information between entities and relations at a more fine-grained level. Specifically, entity and relation vectors are decomposed into sub-features to represent multi-dimensional information. Then, a shallow-to-deep feature interaction method is designed to capture multi-level interactions between entities and relations. This process enriches the feature representation by modeling the interaction between sub-features. Finally, a 1-X scoring function is utilized to calculate the score of each knowledge triplet. The experimental results on several benchmark datasets show that SDFormer achieves competitive performance and trains more efficiently than the comparison models, owing to the shallow-to-deep feature interaction between entities and relations.
Journal:
Journal of King Saud University - Computer and Information Sciences, 2024, 36(1): 101869 ISSN: 1319-1578
Corresponding author:
Wang, XG
Author affiliations:
[Wang, Xiaoguang; Zhao, Wanli; Wang, Shutong; Wang, XG] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China.;[Wang, Shutong] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Li, Duantengchuan] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China.;[Wang, Jing] Chongqing Univ Posts & Telecommun, Sch Automat, Chongqing 400065, Peoples R China.;[Wang, Xiaoguang; Wang, XG; Wang, Jing] Wuhan Univ, Intellectual Comp Lab Cultural Heritage, Wuhan 430072, Peoples R China.
Corresponding affiliations:
[Wang, XG ] W;Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China.;Wuhan Univ, Intellectual Comp Lab Cultural Heritage, Wuhan 430072, Peoples R China.
Keywords:
Head pose estimation;Label distribution learning;Gaussian distribution;Asymmetric;Feature similarity
Abstract:
Head pose estimation plays a pivotal role in various applications, including augmented reality and human–computer interaction within intelligent museum environments. Head pose estimation conventionally relies on hard labels. However, acquiring the “ground truth” through subjective means introduces an element of uncertainty into the labels. The introduction of soft labels offers a potential remedy for this uncertainty, but existing head pose estimation methods based on soft labels neglect the asymmetry of head pose. After careful observation, two types of asymmetry have been identified in human head pose: within-angle and between-angle asymmetry. Taking these two characteristics into account, we have devised a Double Asymmetric Distribution Learning (DADL) network model for the precise estimation of head pose angles. This model employs distinct soft label distribution mechanisms to capture within-angle and between-angle nuances in head pose variations, thereby enhancing the interpretability, generalization capability, and classification accuracy of head pose estimation models. Extensive experiments were conducted on various widely recognized benchmarks, including the AFLW2000 and BIWI datasets. The results substantiate substantial advantages of our model over conventional approaches.
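To make the idea of an asymmetric soft label concrete, here is a minimal sketch that spreads a ground-truth angle over discrete angle bins with a two-sided Gaussian whose width differs on each side of the true angle. The functional form, the bin layout, and the parameter names `sigma_left` and `sigma_right` are assumptions for illustration; the abstract does not specify how DADL constructs its distributions.

```python
import math


def asymmetric_soft_label(true_angle, bin_centers, sigma_left, sigma_right):
    """Two-sided Gaussian soft label: a different spread on each side of the truth."""
    scores = []
    for center in bin_centers:
        # Bins below the true angle use sigma_left, the rest use sigma_right.
        sigma = sigma_left if center < true_angle else sigma_right
        scores.append(math.exp(-((center - true_angle) ** 2) / (2.0 * sigma ** 2)))
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical setup: yaw bins every 3 degrees from -90 to 90.
bins = list(range(-90, 91, 3))
label = asymmetric_soft_label(true_angle=30, bin_centers=bins,
                              sigma_left=3.0, sigma_right=9.0)

peak_bin = bins[label.index(max(label))]
left_mass = sum(p for c, p in zip(bins, label) if c < 30)
right_mass = sum(p for c, p in zip(bins, label) if c > 30)
print(peak_bin, round(left_mass, 3), round(right_mass, 3))
```

With a wider right-hand spread, more probability mass falls on bins above the true angle, which is one simple way to encode that errors in one direction are more plausible than in the other.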
Abstract:
Intelligent tutoring systems (ITS) have received much attention recently as online learning has taken off and is replacing offline instruction in many cases. An ITS analyses user behavior and customizes personalized learning strategies for users through artificial intelligence technology. An ITS encompasses a variety of entities and multiple relations, making it suitable to be represented as a graph, which aligns well with the use of graph embedding (GE) for downstream ITS tasks. Existing GE methods cannot effectively model ITS data because the user evolution in ITS is discrete in time: the patterns of variation in user states are similar to each other but not correlated at the temporal level. Because of the hierarchical structure caused by this discrete evolution, encoding ITS data in a hyperbolic space is more sensible. We define a discrete evolution graph (DEG) to characterize ITS and propose a method called DEGE to embed it. The static nodes in a DEG are projected randomly and then transformed into hyperbolic space. Next, hyperbolic evolution networks are employed to generate the embeddings of dynamic nodes. The aggregated features of each node are then delivered by hyperbolic aggregation networks and concatenated to generate the final higher-order features. To validate the method, we design a multi-objective loss function that preserves pairwise proximity and link types, and train the model on several real datasets. The experimental results demonstrate that our method outperforms the baselines on both question annotation and performance prediction in ITS.
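The step of transforming a Euclidean vector into hyperbolic space can be illustrated with the Poincaré ball model of curvature -1. This is a sketch under that assumption: the abstract does not say which hyperbolic model DEGE uses, and the function names below are illustrative. The exponential map at the origin sends a tangent vector into the unit ball, and the Poincaré distance measures how far apart two embedded points are.

```python
import math


def norm(vector):
    return math.sqrt(sum(x * x for x in vector))


def exp_map_origin(tangent):
    """Exponential map at the origin of the unit Poincare ball (curvature -1)."""
    length = norm(tangent)
    if length == 0.0:
        return list(tangent)
    scale = math.tanh(length) / length
    return [scale * x for x in tangent]


def poincare_distance(x, y):
    """Geodesic distance between two points strictly inside the unit ball."""
    diff_sq = sum((a - b) ** 2 for a, b in zip(x, y))
    denom = (1.0 - sum(a * a for a in x)) * (1.0 - sum(b * b for b in y))
    return math.acosh(1.0 + 2.0 * diff_sq / denom)


# A hypothetical 2-D Euclidean feature vector of norm 0.5.
tangent = [0.3, 0.4]
point = exp_map_origin(tangent)
origin = [0.0, 0.0]
print(point, poincare_distance(origin, point))
```

In this model the distance from the origin to the mapped point equals twice the norm of the tangent vector, so points far from the origin are pushed towards the boundary, which is what makes the space a natural fit for tree-like hierarchies.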
Keywords:
Key-frame extraction;Audiovisual interaction and fusion;Attention mechanism;Emotion recognition;Intra-modality interaction;Cross-modality interaction
Abstract:
Purpose - Although numerous signal modalities are available for emotion recognition, audio and visual modalities are the most common and predominant forms for human beings to express their emotional states in daily communication. Therefore, achieving automatic and accurate audiovisual emotion recognition is important for developing engaging and empathetic human-computer interaction environments. However, two major challenges exist in the field of audiovisual emotion recognition: (1) how to effectively capture representations of each single modality and eliminate redundant features and (2) how to efficiently integrate information from these two modalities to generate discriminative representations. Design/methodology/approach - A novel key-frame extraction-based attention fusion network (KE-AFN) is proposed for audiovisual emotion recognition. KE-AFN attempts to integrate key-frame extraction with multimodal interaction and fusion to enhance audiovisual representations and reduce redundant computation, filling the research gaps of existing approaches. Specifically, a local maximum-based content analysis is designed to extract key-frames from videos for the purpose of eliminating data redundancy. Two modules, the "Multi-head Attention-based Intra-modality Interaction Module" and the "Multi-head Attention-based Cross-modality Interaction Module", are proposed to mine and capture intra- and cross-modality interactions for further reducing data redundancy and producing more powerful multimodal representations. Findings - Extensive experiments on two benchmark datasets (i.e. RAVDESS and CMU-MOSEI) demonstrate the effectiveness and rationality of KE-AFN. Specifically, (1) KE-AFN is superior to state-of-the-art baselines for audiovisual emotion recognition. (2) Exploring the supplementary and complementary information of different modalities can provide more emotional clues for better emotion recognition. (3) The proposed key-frame extraction strategy can enhance the performance by more than 2.79 per cent on accuracy. (4) Both exploring intra- and cross-modality interactions and employing attention-based audiovisual fusion can lead to better prediction performance. Originality/value - The proposed KE-AFN can support the development of engaging and empathetic human-computer interaction environments.
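The local maximum-based key-frame extraction mentioned in the abstract can be sketched as follows: score each frame by how much it differs from its predecessor, then keep the frames whose score is a local maximum. The difference measure (mean absolute intensity difference) and the toy two-pixel frames are assumptions for illustration; the abstract does not describe KE-AFN's content analysis at this level of detail.

```python
def frame_differences(frames):
    """Mean absolute difference between each frame and the one before it."""
    return [
        sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)
        for prev, curr in zip(frames, frames[1:])
    ]


def local_maxima_keyframes(scores):
    """Return frame indices whose difference score is a local maximum.

    scores[i] compares frame i with frame i + 1, so a peak at position i
    marks frame i + 1 as a key-frame.
    """
    return [
        i + 1
        for i in range(1, len(scores) - 1)
        if scores[i] > scores[i - 1] and scores[i] >= scores[i + 1]
    ]


# Toy video: each frame is a flattened list of two pixel intensities.
frames = [[0, 0], [0, 0], [10, 10], [10, 10], [10, 10], [0, 5], [0, 5]]
scores = frame_differences(frames)
key_frames = local_maxima_keyframes(scores)
print(scores, key_frames)
```

Only the two frames where the content changes are kept, which illustrates how the strategy discards runs of near-identical frames before feature extraction.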
Journal:
Journal of Science Education and Technology, 2023, 32(3): 379-389 ISSN: 1059-0145
Corresponding author:
Xiao Yang
Author affiliations:
[Lu, Chun] Cent China Normal Univ, Minist Educ, Educ Informatizat Strategy Res Base, Wuhan, Hubei, Peoples R China.;[Yang, Wei; Yang, Xiao] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan, Hubei, Peoples R China.;[Wu, Longkai] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan, Hubei, Peoples R China.;[Lu, Chun; Wu, Longkai; Yang, Wei; Yang, Xiao] Cent China Normal Univ, Sci Hall, 152 Luoyu Rd, Wuhan 430079, Hubei, Peoples R China.
Corresponding affiliations:
[Xiao Yang] N;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China;Science Hall, Central China Normal University, Wuhan, China
Abstract:
Understanding factors that influence k-12 students’ Science, Technology, Engineering, and Mathematics (STEM) performance is essential to improving their problem-solving ability. Most studies have focused on the relationship between students’ psychological factors and STEM performance and have paid little attention to the relationship between behavioral factors and STEM performance. This study explored the impact of behavioral factors (i.e., information and communications technology (ICT) readiness and online interaction (OI)) and psychological factors (i.e., internet self-efficacy (ISE)) on k-12 students’ STEM performance. The sample included 851 fifth graders and 535 eighth graders from cities in central China. The results of structural equation modeling analysis showed that ISE and ICT readiness (IR) significantly impacted the STEM performance of eighth graders. More importantly, ISE, a psychological factor, had the greatest effect on STEM performance and played a mediating role in the relationship between IR, OI, and STEM performance. These findings have important implications for STEM teachers. To improve students’ STEM performance, teachers should intervene to improve ISE according to students’ grades and cognitive ability, guide students to use ICT correctly, and encourage them to actively engage in OI.
Abstract:
Decoding brain activity from non-invasive electroencephalography (EEG) is crucial for brain-computer interfaces (BCIs) and the study of brain disorders. Notably, end-to-end EEG decoding has gained widespread popularity in recent years owing to the remarkable advances in deep learning research. However, many EEG studies suffer from limited sample sizes, making it difficult for existing deep learning models to effectively generalize to highly noisy EEG data. To address this fundamental limitation, this paper proposes a novel end-to-end EEG decoding algorithm that utilizes a low-rank weight matrix to encode both spatio-temporal filters and the classifier, all optimized under a principled sparse Bayesian learning (SBL) framework. Importantly, this SBL framework also enables us to learn hyperparameters that optimally penalize the model in a Bayesian fashion. The proposed decoding algorithm is systematically benchmarked on five motor imagery BCI EEG datasets (N=192) and an emotion recognition EEG dataset (N=45), in comparison with several contemporary algorithms, including end-to-end deep-learning-based EEG decoding algorithms. The classification results demonstrate that our algorithm significantly outperforms the competing algorithms while yielding neurophysiologically meaningful spatio-temporal patterns. Our algorithm therefore advances the state-of-the-art by providing a novel EEG-tailored machine learning tool for decoding brain activity.
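The low-rank weight matrix described above can be read as a sum of outer products of spatial filters (one weight per channel) and temporal filters (one weight per time sample). The sketch below shows only that parameterisation and checks that scoring a trial through the factors matches scoring it with the full matrix. The trial data, the filter values, and the rank are hypothetical, and the sparse Bayesian learning of the factors is not shown.

```python
def lowrank_score(trial, spatial_filters, temporal_filters):
    """Decision score sum_r u_r^T X v_r for one trial X (channels x time)."""
    score = 0
    for u, v in zip(spatial_filters, temporal_filters):
        for channel, row in enumerate(trial):
            # Temporal filtering of one channel, weighted by the spatial filter.
            score += u[channel] * sum(x * w for x, w in zip(row, v))
    return score


def dense_weight(spatial_filters, temporal_filters):
    """Materialise W = sum_r u_r v_r^T, a matrix of rank at most R."""
    channels = len(spatial_filters[0])
    samples = len(temporal_filters[0])
    return [
        [sum(u[c] * v[t] for u, v in zip(spatial_filters, temporal_filters))
         for t in range(samples)]
        for c in range(channels)
    ]


# Hypothetical trial with 2 channels and 3 time samples, rank R = 2.
trial = [[1, 2, 3], [4, 5, 6]]
spatial = [[1, -1], [2, 1]]        # u_1, u_2
temporal = [[1, 0, 2], [0, 1, 1]]  # v_1, v_2

weight = dense_weight(spatial, temporal)
dense_score = sum(w * x for w_row, x_row in zip(weight, trial)
                  for w, x in zip(w_row, x_row))
factored_score = lowrank_score(trial, spatial, temporal)
print(weight, dense_score, factored_score)
```

For C channels, T samples, and rank R, the factored form needs R(C + T) parameters instead of C·T, which is the usual motivation for low-rank models when the number of trials is small.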
Journal:
Expert Systems with Applications, 2023, 214: 118943 ISSN: 0957-4174
Corresponding author:
Chen, Zengzhao(zzchen@ccnu.edu.cn)
Author affiliations:
[Wang, Hu; Chen, Zengzhao; Li, Jiawen; Liu, Hai; Zheng, Qiuyu] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.;[Li, Jiawen; Zheng, Qiuyu] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.;[Wang, Hu; Chen, Zengzhao; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Wang, Xuyang] Aviat Ind Corp, Luoyang Inst Electroopt Equipment, Luoyang 471023, Henan, Peoples R China.
Corresponding affiliations:
[Zengzhao Chen] F;Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China
Keywords:
Connection attention mechanism;Features fusion;Frame-level features;Speech emotion recognition;Utterance-level features
Author affiliations:
[Yang, Zongkai; Liu, Sannyuya; Liu, Zhi; Kong, Weizheng; Peng, Xian; Liu, Shiqi; Wen, Chaodong] Cent China Normal Univ, Fac Artificial Intelligence Educ, Natl Engn Res Ctr Educ Big Data, Wuhan, Peoples R China.;[Yang, Zongkai; Liu, Sannyuya] Cent China Normal Univ, Fac Artificial Intelligence Educ, Natl Engn Res Ctr Learning, Wuhan, Peoples R China.
Corresponding affiliations:
[Xian Peng; Zongkai Yang] N;National Engineering Research Center for Educational Big Data, Faculty of Artificial Intelligence in Education, Central China Normal University, PR China;National Engineering Research Center for E-Learning, Faculty of Artificial Intelligence in Education, Central China Normal University, PR China
Keywords:
Cognitive engagement classification;Semi-supervised learning;Dual feature embedding;Linguistic Inquiry and Word Count (LIWC);Course discussion
Abstract:
Online course discussions contain abundant cognitive information from learners. Previous models required a large amount of labeled data to classify cognitive engagement from the perspective of semantic features alone. Because these models rely only on semantic features, they cannot fully represent textual information and perform poorly when labeled data are scarce. Moreover, cognitive psychological features carry important information that cannot be captured by semantic features. Therefore, this paper proposes a dual feature embedding-based semi-supervised cognitive classification method that exploits the additional inductive biases introduced by implicit cognitive features to supplement generic semantic features. These additional inductive biases facilitate the propagation of labeled and unlabeled data and improve the consistency between unlabeled and augmented data. Unsupervised data augmentation (UDA) is used to obtain augmented data by inserting advanced noise into unlabeled data in semi-supervised learning. Furthermore, bidirectional encoder representations from transformers (BERT) are used to extract generic semantics, and linguistic inquiry and word count (LIWC) is adopted to extract implicit cognitive features from discussion texts. We therefore refer to the proposed method as B-LIWC-UDA, which sequentially fuses the dual features at the explicit and hidden levels to obtain dual feature embeddings. The cognitive engagement classification model was trained using supervised and consistency training methods. We conducted experiments using datasets obtained from two real-world online course discussions. The experimental results demonstrate that, in terms of major evaluation metrics, the proposed B-LIWC-UDA method performs better than state-of-the-art text classification methods used for identifying cognitive engagement.
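The consistency term that UDA-style semi-supervised training adds for unlabeled data can be sketched as a KL divergence between the model's predicted class distribution for an unlabeled text and for its augmented version. The logits and the three-class setup below are hypothetical stand-ins for classifier outputs; the sketch does not reproduce B-LIWC-UDA's feature fusion or its actual loss weighting.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def consistency_loss(logits_original, logits_augmented):
    """Mean KL between predictions on unlabeled texts and their augmentations.

    In UDA the prediction on the original text is treated as a fixed target;
    that detail only matters once gradients are involved and is not modelled here.
    """
    losses = [
        kl_divergence(softmax(orig), softmax(aug))
        for orig, aug in zip(logits_original, logits_augmented)
    ]
    return sum(losses) / len(losses)


# Hypothetical logits for two unlabeled posts over three engagement classes.
original = [[2.0, 0.5, -1.0], [0.1, 0.2, 1.5]]
augmented = [[1.5, 0.8, -0.5], [0.0, 0.4, 1.2]]

loss_same = consistency_loss(original, original)
loss_diff = consistency_loss(original, augmented)
print(loss_same, loss_diff)
```

The loss is zero when the two predictions agree and grows as augmentation changes the predicted distribution, so minimising it encourages predictions that are stable under the injected noise.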
Abstract:
Many knowledge graphs, especially those that are collaboratively or automatically generated, are prone to noise and cross-domain entries, which can impede domain-specific applications. Existing methods for pruning inaccurate or out-of-domain information from knowledge graphs often rely on topological graph-pruning strategies. However, these approaches have two major drawbacks: they may discard logical structure and semantic information, and they allow multiple inheritance. To address these limitations, this study introduces KGPruning, a novel approach that can effectively clean and prune noisy knowledge graphs by guiding tasks with a given set of concepts and automatically generating a domain-specific taxonomy. Specifically, KGPruning employs a graph hierarchy inference method based on the Agony model to precisely identify and eliminate noisy entries while striving to preserve the underlying hierarchy of semantic relations as much as possible. Furthermore, to establish a tree-structured taxonomy, KGPruning integrates semantic relations and structural characteristics to effectively eliminate out-of-domain information and multiple inheritance. Through extensive experimental evaluations conducted on open benchmark datasets as well as large-scale real-world problems, the superior performance of KGPruning over state-of-the-art methods is demonstrated on the task of pruning noisy knowledge graphs.
Journal:
Journal of Science Education and Technology, 2023, 32(6): 858-871 ISSN: 1059-0145
Corresponding author:
Zhao, L
Author affiliations:
[Sun, Chengzhang; Dai, Zhicheng] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Zhao, Liang; Zhu, Xiaoliang] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;[Sun, Chengzhang; Dai, Zhicheng; Zhao, Liang; Zhu, Xiaoliang] Cent China Normal Univ, Fac Artificial Intelligence Educ, 152 Luoyu Rd, Wuhan 430079, Hubei, Peoples R China.
Corresponding affiliations:
[Zhao, L ] C;Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.;Cent China Normal Univ, Fac Artificial Intelligence Educ, 152 Luoyu Rd, Wuhan 430079, Hubei, Peoples R China.
Abstract:
Classroom interaction affects the classroom atmosphere as well as students' behavior and participation, thus affecting the quality of classroom teaching. In traditional classrooms, inherent problems (e.g., inflexible tables and chairs, rigid multimedia consoles, and traditional software) have seriously restricted the overall quality of classroom interpersonal interaction. In recent years, the problem of enhancing classroom interaction has gradually attracted the attention of scholars. The application of project-based learning (PBL) in higher education is effective, but few studies have analyzed the differences in interaction between smart classrooms and traditional classrooms in PBL courses. In this study, through the proposed teacher-student classroom interaction behavior analysis framework, 20 sessions in smart classrooms and 20 sessions in traditional classrooms were encoded to illustrate the differences between interaction in these two types of classrooms. Furthermore, 765 student questionnaires on satisfaction with and participation in smart classrooms were collected to determine whether smart classrooms affect students' satisfaction and participation in PBL courses. The questionnaires were analyzed using SPSS 27.0. The results showed that there were significant differences in four dimensions of teachers' behavior, students' behavior, technology, and other interactions between the smart classroom and the traditional classroom. After taking PBL courses in a smart classroom, students were generally satisfied and thought that the smart learning environment could help them improve their thinking and learning. Suggestions on the further construction and application of smart classrooms are proposed.
Abstract:
Head pose estimation (HPE) is an indispensable upstream task in the fields of human-machine interaction, self-driving, and attention detection. However, practical head pose applications suffer from several challenges, such as severe occlusion, low illumination, and extreme orientations. To address these challenges, we identify three cues from head images, namely, critical minority relationships, neighborhood orientation relationships, and significant facial changes. On the basis of the three cues, two key insights on head poses are revealed: 1) the intra-orientation relationship and 2) the cross-orientation relationship. To leverage these two key insights, a novel relationship-driven method is proposed based on the Transformer architecture, in which facial and orientation relationships can be learned. Specifically, we design several orientation tokens to explicitly encode basic orientation regions. In addition, a novel token guide multi-loss function is designed to guide the orientation tokens as they learn the desired regional similarities and relationships. Experimental results on three challenging benchmark HPE datasets show that our proposed TokenHPE achieves state-of-the-art performance. Moreover, qualitative visualizations are provided to verify the effectiveness of the token-learning methodology.
Author affiliations:
[Chen, Zengzhao; Miao, Bingchen; Liu, Hai] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.;[Chen, Zengzhao; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.;[Chen, Zengzhao] Cent China Normal Univ, Natl Intelligent Soc Governance Expt Base Educ, Wuhan 430079, Peoples R China.;[Zhang, Aijun] China Telecom Corp Henan Branch, Zhengzhou 450016, Peoples R China.
Corresponding affiliations:
[Zhang, AJ ] C;China Telecom Corp Henan Branch, Zhengzhou 450016, Peoples R China.
Journal:
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31: 826-834 ISSN: 2329-9290
Author affiliations:
[Dong, Ming; Tu, Xinhui; Wang, Yufan; Mei, Jie; He, Tingting] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China.;[Dong, Ming; Tu, Xinhui; Wang, Yufan; Mei, Jie; He, Tingting] Cent China Normal Univ, Natl Language Resources Monitor & Res Ctr Network, Wuhan 430079, Peoples R China.;[Dong, Ming; Tu, Xinhui; Mei, Jie; He, Tingting] Cent China Normal Univ, Sch Comp Sci & Technol, Wuhan 430079, Peoples R China.;[Wang, Yufan] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.
Keywords:
Training;Correlation;Bit error rate;Semantics;Natural languages;Logic gates;Filling;Natural language processing;spoken language understanding;intent detection;slot filling
Abstract:
Spoken language understanding (SLU) is an essential part of a task-oriented dialogue system and mainly includes intent detection and slot filling. Some existing approaches obtain enhanced semantic representations by establishing the correlation between the two tasks. However, those methods show little improvement when applied to BERT, since BERT has already learned rich semantic features. In this paper, we propose a BERT-based model with a probability-aware gate mechanism, called PAGM (Probability Aware Gated Model). PAGM aims to learn the correlation between intent and slot from the perspective of probability distribution, explicitly utilizing intent information to guide slot filling. In addition, to efficiently combine BERT with the probability-aware gate, we design a stacked fine-tuning strategy. This approach introduces a mid-stage before target model training, which gives BERT a better initialization for final training. Experiments show that PAGM achieves significant improvement on two benchmark datasets and outperforms the previous state-of-the-art results.
Author affiliations:
[Zhu, Sha; Yang, Harrison Hao] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.;[Guo, Qing] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.;[Yang, Harrison Hao] SUNY Coll Oswego, Sch Educ, Oswego, NY 13126 USA.
Corresponding affiliations:
[Harrison Hao Yang] N;National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China;School of Education, State University of New York at Oswego, Oswego, NY 13126, USA
Keywords:
assessment methodologies;digital games;21st century skills;media in education
Abstract:
Traditional methods of student assessment (SA) include self-reported surveys, standardized tests, etc. These methods are widely regarded by researchers as inducing test anxiety. They also ignore students' thinking processes and are not applicable to the assessment of higher-order skills. Digital game-based assessment (DGBA) is thought to address the shortcomings of traditional assessment methods. Given the advantages of DGBA, an increasing number of empirical studies are working to apply digital games for SA. However, there is a lack of any systematic review of DGBA studies. In particular, very little is known about the characteristics of the games, the content of the assessment, the methods of implementation, and the distribution of the results. This study examined, from different perspectives, the characteristics of DGBA studies and of the games adopted for SA in the past decade. A rigorous systematic review process was adopted. First, the Web of Science (WOS) database was used to search the literature on DGBA published over the last decade. Then, 50 studies on SA were selected for subsequent analysis according to the inclusion and exclusion criteria. The results show that DGBA has attracted the attention of researchers around the world. The participants of the DGBA studies were distributed across different educational levels, but the number of participants was small. Among all game genres, educational games were the most frequently used. Disciplinary knowledge was the most popular SA research content. Formative assessment modeling with process data and summative assessment using final scores were the most popular assessment methods. Correlation analysis was the most popular analysis method to verify the effectiveness of games on SA. However, many DGBA studies have reported unsatisfactory data analysis results. This study further discusses the reasons for, and the implications of, these findings. In conclusion, this review shows the current status and gaps of DGBA in SA applications; directional references for future research by researchers and game designers are also provided.
Abstract:
Real-time emotion recognition in conversations (ERC), which relies only on historical utterances, has recently gained increasing attention due to its significance in providing real-time empathetic services. Although utilizing multimodal information can mitigate the issues of unimodal approaches, few real-time ERC studies consider the differences in representation ability of different modalities and explore comprehensive conversational context from different perspectives based on different structures. Furthermore, the heavy annotation cost makes it difficult to collect sufficient labeled data, which also limits the performance of current supervised ERC approaches. To address these issues, we propose a novel framework SMFNM for real-time ERC, which integrates semi-supervised learning with multimodal fusion under the guidance of main-modal. Specifically, SMFNM utilizes additional unlabeled data to extract high-quality intra-modal representations, and implements cross-modal interaction to capture complementary information to enhance the audio representations. Then SMFNM employs the directed acyclic graph and the Gated Recurrent Units for exploring more accurate conversational context from both the multimodal and main-modal perspectives, respectively. Finally, these two types of contextual features are fused for emotion identification. Extensive experiments on benchmark datasets (i.e., IEMOCAP (4-way), IEMOCAP (6-way) and MELD) demonstrate the effectiveness, superiority and rationality of our SMFNM. (c) 2023 The Author(s). Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).