National Engineering Research Center for E-Learning

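The National Engineering Research Center for E-Learning (NERCEL), hosted by Central China Normal University, is a dedicated R&D institution in China for research on educational informatization technology and the translation of research results into practice. In 2004, via the Hubei Provincial Development and Re...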

Publications: 1416
Highly cited: 2
SCI-E: 263
SSCI: 130
A&HCI: 1
CPCI-S: 243
EI: 190
Medline: 11
CSCD: 84
CSSCI: 216


Output type

Journal articles: 263

Conference papers

Year (2011-2024)

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

Language

English: 262

Chinese

Journals

Infrared Physics & Technology

IEEE ACCESS

Neurocomputing

Expert Systems with Applications

IEEE Transactions on Industrial Informatics

Multimedia Tools and Applications

International Journal of Pattern Recognition and Artificial Intelligence

Knowledge-Based Systems

Sensors

Applied Optics

Journal of King Saud University - Computer and Information Sciences

Behaviour & Information Technology

Computers & Education

IEEE Transactions on Neural Networks and Learning Systems

Information Sciences

Sustainability

Wireless Personal Communications

Applied Sciences-Basel

Applied Soft Computing

Computing

Authors

Zhang, Zhaoli

Liu, Hai

Chen, Jingying

Liu, Sanya

Liu, Hai

He, Tingting

Chen, Zengzhao

Liu, Tingting

Yang, Zongkai

Du, Xu

Liu, Sannyuya

Sun, Jianwen

Liu, Tingting

Liu, Yanshen

Shen, Xiaoxuan

Shu, Jiangbo

Xi, Jiangtao

Zhao, Liang

Chen, Jingying

Chen, Dan

Keywords

Head pose estimation

Convolutional neural network

Facial expression recognition

Deep learning

Regularization

Task analysis

Training

deep learning

Blind deconvolution

Feature extraction

Knowledge graph embedding

Semantics

Attention mechanism

Bayesian inference

Infrared imaging

Infrared spectroscopy

Inverse problems

Learning behavior analysis

Link prediction

RFID

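Institutional attribution

This university as first institution: 203

This university as corresponding institution: 151

This university as both first and corresponding institution: 140

This university as other institution

Department affiliation

National Engineering Research Center for E-Learning: 263

School of Computer Science

School of Educational Information Technology

School of Information Management

College of Urban and Environmental Sciences

School of Psychology

School of Public Administration

School of Journalism and Communication

Wollongong Joint Institute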

263 records in total (page 1 of 14, sorted by date).

V2IED: Dual-view learning framework for detecting events of interictal epileptiform discharges

Authors: Ming, Zhekai; Chen, Dan; Gao, Tengfei; Tang, Yunbo; Tu, Weiping; ...

Journal: Neural Networks, 2024, 172:106136, ISSN: 0893-6080

Corresponding author: Chen, D

Author affiliations: [Chen, Dan; Gao, Tengfei; Chen, D; Ming, Zhekai; Tu, Weiping] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Hubei Key Lab Multimedia & Network Commun Engn, Wuhan 430072, Peoples R China; [Tang, Yunbo] Fuzhou Univ, Coll Comp & Data Sci, Fuzhou 350108, Peoples R China; [Chen, Jingying] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.

Corresponding institution: [Chen, D] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Hubei Key Lab Multimedia & Network Commun Engn, Wuhan 430072, Peoples R China.

Keywords: Convolutional neural network; Dual-view learning; Electroencephalography; Interictal epileptiform discharge; Long short-term memories

Abstract: Interictal epileptiform discharges (IED), as large intermittent electrophysiological events, are associated with various severe brain disorders. Automated IED detection has long been a challenging task, and mainstream methods largely focus on singling out IEDs from backgrounds from the perspective of waveform, leaving normal sharp transients/artifacts with similar waveforms almost unattended. It remains an open issue to accurately detect IED events that directly reflect abnormalities in brain electrophysiological activities while minimizing the interference from irrelevant sharp transients that merely have similar waveforms. This study proposes a dual-view learning framework (namely V2IED) to detect IED events from multi-channel EEG by aggregating features from two phases: (1) Morphological Feature Learning: directly treating the EEG as a sequence with multiple channels, a 1D-CNN (Convolutional Neural Network) is applied to explicitly learn the deep morphological features; and (2) Spatial Feature Learning: viewing the EEG as a 3D tensor embedding channel topology, a CNN captures the spatial features at each sampling point, followed by an LSTM (Long Short-Term Memories) to learn the evolution of these features. Experimental results from a public EEG dataset against the state-of-the-art counterparts indicate that: (1) compared with the existing optimal models, V2IED achieves a larger area under the receiver operating characteristic (ROC) curve in detecting IEDs from normal sharp transients with a 5.25% improvement in accuracy; (2) the introduction of spatial features improves performance by 2.4% in accuracy; and (3) V2IED also performs excellently in distinguishing IEDs from background signals, especially benign variants.

Language: English

Frontier Development and Insights of International Educational Science Research in the journals Nature and Science: a Systematic Literature Review over 40 Years

Authors: Li, Qing; Yue, Jieyu; Sun, Jianwen; Chen, Sijing*; Liu, Sannyuya; ...

Journal: Science & Education, 2024, pp. 1-29, ISSN: 0926-7220

Corresponding author: Chen, Sijing; Liu, SNYY

Author affiliations: [Liu, Sannyuya; Yuan, Xin; Yue, Jieyu; Li, Zhen; Li, Qing; Liu, SNYY; Hu, Tianhui; Chen, Sijing; Sun, Jianwen] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China; [Liu, Sannyuya; Liu, SNYY] Cent China Normal Univ, Natl Engn Res Ctr E Elearning, Wuhan 430079, Peoples R China.

Corresponding institution: [Liu, SNYY; Chen, SJ] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China; Cent China Normal Univ, Natl Engn Res Ctr E Elearning, Wuhan 430079, Peoples R China.

Keywords: Educational science; Education policy; Learning mechanism; Educational technology; Research paradigm

Abstract: The purpose of this study was to investigate the frontier, science, and public engagement of educational science research. This paper conducted a systematic literature review of 101 educational science research articles published in Nature and Science in 1982-2021 based on the Web of Science database and analyzed the current status of research in terms of basic publication characteristics, research themes, and research processes. Five research topics were recognized, namely, education policy evaluation and reform, learning mechanisms and learning interventions, science education, educational technology, and education equity. The content of each topic had a distinctive emphasis. Findings revealed that most studies were empirical, involving causal relationships between various educational phenomena, a diverse range of research subjects, rigorous scientific randomized experiments, and quantitative analysis. We encourage more research on educational science in the future in four feasible directions, namely, developing active learning approaches to promote effective learning, extending the research subjects and objectives of science education, conducting long-term, large-scale and practice-oriented research, and introducing new research methods into educational research.

Language: English

Game-Based Assessment of Students’ Digital Literacy Using Evidence-Centered Game Design

Authors: Li, Jiayuan; Bai, Jie; Zhu, Sha; Yang, Harrison Hao

Journal: Electronics, 2024, 13(2):385, ISSN: 2079-9292

Corresponding author: Zhu, S; Yang, HH

Author affiliations: [Zhu, Sha; Zhu, S; Bai, Jie; Li, Jiayuan] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China; [Yang, HH; Yang, Harrison Hao] SUNY Coll Oswego, Sch Educ, Oswego, NY 13126 USA.

Corresponding institution: [Zhu, S] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China; [Yang, HH] SUNY Coll Oswego, Sch Educ, Oswego, NY 13126 USA.

Keywords: digital literacy; digital game-based assessment; ECGD; AHP; assessment model

Abstract: This study measured secondary students' digital literacy using a digital game-based assessment system that was designed and developed based on the Evidence-Centered Game Design (ECGD) approach. A total of 188 secondary students constituted the valid cases in this study. Fine-grained behavioral data generated from students' gameplay processes were collected and recorded with the assessment system. The Delphi method was used to extract feature variables related to digital literacy from the process data, and the Analytic Hierarchy Process (AHP) method was used to construct the measurement model. The assessment results of the ECGD-based assessment had a high correlation with standardized test scores, which have been shown to be reliable and valid in prior large-scale assessment studies.

Language: English
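The AHP step named in the abstract above turns pairwise importance judgements into criterion weights. The following is a minimal sketch of that general technique, using the row geometric-mean approximation and Saaty's consistency ratio. The function names and the 3x3 example matrix are hypothetical; the paper's actual feature variables and judgement matrices are not given in this listing.

```python
import math

# Saaty's random consistency index for matrix sizes 3..9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Return Saaty's consistency ratio (CR) for a pairwise comparison matrix."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    if n not in RANDOM_INDEX:
        raise ValueError("random index table only covers sizes 3..9")
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


if __name__ == "__main__":
    # Hypothetical judgements: criterion 1 is twice as important as
    # criterion 2 and four times as important as criterion 3.
    example = [[1, 2, 4], [0.5, 1, 2], [0.25, 0.5, 1]]
    w = ahp_weights(example)
    print("weights:", w)
    print("consistency ratio:", consistency_ratio(example, w))
```

A consistency ratio below about 0.1 is the conventional threshold for accepting a judgement matrix; above that, the pairwise judgements are usually revisited.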

Meta-path reasoning of knowledge graph for commonsense question answering

Authors: Zhang, Miao; He, Tingting*; Dong, Ming

Journal: Frontiers of Computer Science, 2024, 18(1):181303, ISSN: 2095-2228

Corresponding author: He, Tingting; Dong, M

Author affiliations: [Zhang, Miao] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China; [Zhang, Miao; He, Tingting; Dong, Ming; Dong, M] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China; [Zhang, Miao; He, Tingting; Dong, Ming; Dong, M] Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Peoples R China; [He, Tingting; Dong, Ming; Dong, M] Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.

Corresponding institution: [He, TT; Dong, M] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China; Cent China Normal Univ, Natl Language Resources Monitoring & Res Ctr Netwo, Wuhan 430079, Peoples R China; Cent China Normal Univ, Sch Comp, Wuhan 430079, Peoples R China.

Keywords: question answering; knowledge graph; graph neural network; meta-path reasoning

Abstract: Commonsense question answering (CQA) requires understanding and reasoning over QA context and related commonsense knowledge, such as a structured Knowledge Graph (KG). Existing studies combine language models and graph neural networks to model inference. However, traditional knowledge graphs are mostly concept-based, ignoring direct path evidence necessary for accurate reasoning. In this paper, we propose MRGNN (Meta-path Reasoning Graph Neural Network), a novel model that comprehensively captures sequential semantic information from concepts and paths. In MRGNN, meta-paths are introduced as direct inference evidence and an original graph neural network is adopted to aggregate features from both concepts and paths simultaneously. We conduct extensive experiments on the CommonsenseQA and OpenBookQA datasets, showing the effectiveness of MRGNN. We also conduct further ablation experiments and explain the reasoning behavior through a case study.

Language: English

SDFormer: A shallow-to-deep feature interaction for knowledge graph embedding

Authors: Li, Duantengchuan*; Xia, Tao; Wang, Jing; Shi, Fobo; Zhang, Qi; ...

Journal: Knowledge-Based Systems, 2024, 284:111253, ISSN: 0950-7051

Corresponding author: Li, Duantengchuan; Li, B; Zhang, Q

Author affiliations: [Li, Duantengchuan; Li, Bing; Xia, Tao] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China; [Wang, Jing] Chongqing Univ Posts & Telecommun, Sch Automat, Chongqing 400065, Peoples R China; [Shi, Fobo] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China; [Zhang, Qi; Zhang, Q] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China; [Li, Bing] Hubei Luojia Lab, Wuhan 430079, Peoples R China.

Corresponding institution: [Li, DTC; Li, B] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China; [Zhang, Q] Cent China Normal Univ, Sch Informat Management, Wuhan 430079, Peoples R China; Hubei Luojia Lab, Wuhan 430079, Peoples R China.

Keywords: Link prediction; Knowledge graph embedding; Shallow interaction; Deep interaction; Attention mechanism; Vector tokenization

Abstract: Inferring missing information from current facts in a knowledge graph (KG) is the target of the link prediction task. Existing methods embed the entities and relations of a KG as a whole into a low-dimensional vector space. Nonetheless, they ignore the multi-level interactions (shallow interactions, deep interactions) among the finer-grained sub-features of entities and relations. To overcome these limitations, we present a shallow-to-deep feature interaction for knowledge graph embedding (SDFormer). It takes into account the interpretability of sub-feature tokens of entities and relations and learns shallow-to-deep interaction information between entities and relations at a more fine-grained level. Specifically, entity and relation vectors are decomposed into sub-features to represent multi-dimensional information. Then, a shallow-to-deep feature interaction method is designed to capture multi-level interactions between entities and relations. This process enriches the feature representation by modeling the interaction between sub-features. Finally, a 1-X scoring function is utilized to calculate the score of each knowledge triplet. The experimental results on several benchmark datasets show that SDFormer achieves competitive performance and trains more efficiently than the comparison models, owing to the shallow-to-deep feature interaction between entities and relations.

Language: English

DADL: Double Asymmetric Distribution Learning for head pose estimation in wisdom museum

Authors: Zhao, Wanli; Wang, Shutong; Wang, Xiaoguang; Li, Duantengchuan; Wang, Jing; ...

Journal: Journal of King Saud University - Computer and Information Sciences, 2024, 36(1):101869, ISSN: 1319-1578

Corresponding author: Wang, XG

Author affiliations: [Wang, Xiaoguang; Zhao, Wanli; Wang, Shutong; Wang, XG] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China; [Wang, Shutong] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China; [Li, Duantengchuan] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China; [Wang, Jing] Chongqing Univ Posts & Telecommun, Sch Automat, Chongqing 400065, Peoples R China; [Wang, Xiaoguang; Wang, XG; Wang, Jing] Wuhan Univ, Intellectual Comp Lab Cultural Heritage, Wuhan 430072, Peoples R China.

Corresponding institution: [Wang, XG] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China; Wuhan Univ, Intellectual Comp Lab Cultural Heritage, Wuhan 430072, Peoples R China.

Keywords: Head pose estimation; Label distribution learning; Gaussian distribution; Asymmetric; Feature similarity

Abstract: Head pose estimation plays a pivotal role in various applications, including augmented reality and human-computer interaction within intelligent museum environments. Head pose estimation conventionally relies on hard labels. However, acquiring the "ground truth" through subjective means introduces an element of uncertainty into the labels for head pose estimation. The introduction of soft labels offers a potential remedy for this uncertainty. However, existing head pose estimation methods based on soft labels neglect the asymmetry of head pose. After careful observation, two types of asymmetry have been identified in human head pose: within-angle and between-angle asymmetry. Taking these two characteristics into account, we have devised a Double Asymmetric Distribution Learning (DADL) network model for the precise estimation of head pose angles. This model employs distinct soft label distribution mechanisms to capture within-angle and between-angle nuances in head pose variations, thereby enhancing the interpretability, generalization capability, and classification accuracy of head pose estimation models. Extensive experiments were conducted on various widely recognized benchmarks, including the AFLW2000 and BIWI datasets. The results substantiate substantial advantages of our model over conventional approaches.

Language: English
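To make the idea of an asymmetric soft label concrete, the sketch below builds a label distribution over angle bins from a Gaussian whose spread differs on the two sides of the annotated angle. This is only an illustration of the general concept under assumed parameters: the function name, the bin layout and the two sigma values are hypothetical, and the DADL paper's actual within-angle and between-angle formulations are not described in this listing.

```python
import math


def asymmetric_soft_label(angle, bins, sigma_left, sigma_right):
    """Return a normalized soft label over `bins` centred on `angle`.

    Bins below the annotated angle use `sigma_left`; bins at or above it
    use `sigma_right`, so probability mass can spread unevenly.
    """
    scores = []
    for b in bins:
        sigma = sigma_left if b < angle else sigma_right
        scores.append(math.exp(-((b - angle) ** 2) / (2.0 * sigma ** 2)))
    total = sum(scores)
    return [s / total for s in scores]


if __name__ == "__main__":
    # Hypothetical yaw bins from -90 to 90 degrees in 3-degree steps.
    yaw_bins = list(range(-90, 91, 3))
    label = asymmetric_soft_label(30, yaw_bins, sigma_left=3.0, sigma_right=9.0)
    peak_bin = yaw_bins[label.index(max(label))]
    print("peak bin:", peak_bin)
```

With a symmetric choice (`sigma_left == sigma_right`) this reduces to the usual Gaussian soft label used in label distribution learning.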

Hyperbolic embedding of discrete evolution graphs for intelligent tutoring systems

Authors: Liu, Shengyingjie; Yang, Zongkai; Liu, Sannyuya; Liang, Ruxia; Sun, Jianwen; ...

Journal: Expert Systems with Applications, 2024, 241:122451, ISSN: 0957-4174

Corresponding author: Li, Q

Author affiliations: [Yang, Zongkai; Shen, Xiaoxuan; Liu, Shengyingjie; Liu, Sannyuya; Li, Qing; Liang, Ruxia; Li, Q; Sun, Jianwen] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China; [Yang, Zongkai; Shen, Xiaoxuan; Liu, Shengyingjie; Liu, Sannyuya; Li, Qing; Liang, Ruxia; Sun, Jianwen] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China; [Yang, Zongkai; Liu, Sannyuya] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.

Corresponding institution: [Li, Q] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.

Keywords: Intelligent tutoring systems; Graph embedding; Dynamic graph; Hyperbolic embedding

Abstract: Intelligent tutoring systems (ITS) have received much attention recently as online learning has taken off and is replacing offline instruction in many cases. An ITS analyses user behavior and customizes personalized learning strategies for users through artificial intelligence technology. An ITS encompasses a variety of entities and multiple relations, making it suitable to be represented as a graph. This aligns well with the use of graph embedding (GE) for downstream ITS tasks. Existing GE methods cannot effectively model ITS data because the user evolution in ITS is discrete in time. The patterns of variation in user states are similar to each other but not correlated at the temporal level. Because of the hierarchical structure caused by the discrete evolution, encoding ITS data in a hyperbolic space is more sensible. We define a discrete evolution graph (DEG) to characterize ITS and propose a method called DEGE to embed it. The static nodes in a DEG are projected randomly and then transformed into hyperbolic space. Next, hyperbolic evolution networks are employed to generate the embedding of dynamic nodes. The aggregated features of each node are then delivered by hyperbolic aggregation networks and are concatenated to generate the final higher-order features. To validate the method, we design a multi-objective loss function that preserves pairwise proximity and link types, and train the model on several real datasets. The experimental results demonstrate that our method outperforms other baselines on both question annotation and performance prediction in ITS.

Language: English

Analyzing audiovisual data for understanding user's emotion in human-computer interaction environment

Authors: Yang, Juan; Li, Zhenkun; Du, Xu

Journal: Data Technologies and Applications, 2023, ISSN: 2514-9288

Corresponding author: Yang, J

Author affiliations: [Yang, J; Yang, Juan] Wuhan Univ Sci & Technol, Coll Comp Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China; [Li, Zhenkun] Wuhan Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan, Peoples R China; [Du, Xu] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan, Peoples R China.

Corresponding institution: [Yang, J] Wuhan Univ Sci & Technol, Coll Comp Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China.

Keywords: Key-frame extraction; Audiovisual interaction and fusion; Attention mechanism; Emotion recognition; Intra-modality interaction; Cross-modality interaction

Abstract: Purpose - Although numerous signal modalities are available for emotion recognition, audio and visual modalities are the most common and predominant forms for human beings to express their emotional states in daily communication. Therefore, achieving automatic and accurate audiovisual emotion recognition is important for developing engaging and empathetic human-computer interaction environments. However, two major challenges exist in the field of audiovisual emotion recognition: (1) how to effectively capture representations of each single modality and eliminate redundant features and (2) how to efficiently integrate information from these two modalities to generate discriminative representations. Design/methodology/approach - A novel key-frame extraction-based attention fusion network (KE-AFN) is proposed for audiovisual emotion recognition. KE-AFN attempts to integrate key-frame extraction with multimodal interaction and fusion to enhance audiovisual representations and reduce redundant computation, filling the research gaps of existing approaches. Specifically, the local maximum-based content analysis is designed to extract key-frames from videos for the purpose of eliminating data redundancy. Two modules, the "Multi-head Attention-based Intra-modality Interaction Module" and the "Multi-head Attention-based Cross-modality Interaction Module", are proposed to mine and capture intra- and cross-modality interactions for further reducing data redundancy and producing more powerful multimodal representations. Findings - Extensive experiments on two benchmark datasets (i.e. RAVDESS and CMU-MOSEI) demonstrate the effectiveness and rationality of KE-AFN. Specifically, (1) KE-AFN is superior to state-of-the-art baselines for audiovisual emotion recognition. (2) Exploring the supplementary and complementary information of different modalities can provide more emotional clues for better emotion recognition. (3) The proposed key-frame extraction strategy can enhance the performance by more than 2.79 per cent on accuracy. (4) Both exploring intra- and cross-modality interactions and employing attention-based audiovisual fusion can lead to better prediction performance. Originality/value - The proposed KE-AFN can support the development of engaging and empathetic human-computer interaction environments.

Language: English

How Behavioral and Psychological Factors Influence STEM Performance in K-12 Schools: A Mediation Model

Authors: Lu, Chun; Yang, Wei; Wu, Longkai; Yang, Xiao

Journal: Journal of Science Education and Technology, 2023, 32(3):379-389, ISSN: 1059-0145

Corresponding author: Xiao Yang

Author affiliations: [Lu, Chun] Cent China Normal Univ, Minist Educ, Educ Informatizat Strategy Res Base, Wuhan, Hubei, Peoples R China; [Yang, Wei; Yang, Xiao] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan, Hubei, Peoples R China; [Wu, Longkai] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan, Hubei, Peoples R China; [Lu, Chun; Wu, Longkai; Yang, Wei; Yang, Xiao] Cent China Normal Univ, Sci Hall, 152 Luoyu Rd, Wuhan 430079, Hubei, Peoples R China.

Corresponding institution: [Xiao Yang] National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China; Science Hall, Central China Normal University, Wuhan, China

Keywords: ICT readiness; Internet self-efficacy; Online interaction; STEM performance; Mediating effects

Abstract: Understanding factors that influence K-12 students' Science, Technology, Engineering, and Mathematics (STEM) performance is essential to improving their problem-solving ability. Most studies have focused on the relationship between students' psychological factors and STEM performance and have paid little attention to the relationship between behavioral factors and STEM performance. This study explored the impact of behavioral factors (i.e., information and communications technology (ICT) readiness and online interaction (OI)) and psychological factors (i.e., internet self-efficacy (ISE)) on K-12 students' STEM performance. The sample included 851 fifth graders and 535 eighth graders from cities in central China. The results of structural equation modeling analysis showed that ISE and ICT readiness (IR) significantly impacted the STEM performance of eighth graders. More importantly, ISE, a psychological factor, had the greatest effect on STEM performance and played a mediating role in the relationship between IR, OI, and STEM performance. These findings have important implications for STEM teachers. To improve students' STEM performance, teachers should intervene to improve ISE according to students' grades and cognitive ability, guide students to use ICT correctly, and encourage them to actively engage in OI.

Language: English

Sparse Bayesian Learning for End-to-End EEG Decoding

Authors: Wang, Wenlong; Qi, Feifei; Wipf, David Paul; Cai, Chang; Yu, Tianyou; ...

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(12):15632-15649, ISSN: 0162-8828

Corresponding author: Yu, ZL; Wu, W

Author affiliations: [Yu, Tianyou; Yu, Zhuliang; Li, Yuanqing; Wang, Wenlong] South China Univ Technol, Sch Automat Sci & Engn, Guangzhou 510640, Guangdong, Peoples R China; [Qi, Feifei; Yu, Tianyou; Yu, Zhuliang; Li, Yuanqing; Wang, Wenlong] Pazhou Lab, Guangzhou 510330, Guangdong, Peoples R China; [Qi, Feifei] Guangdong Univ Finance, Sch Internet Finance & Informat Engn, Guangzhou 510521, Guangdong, Peoples R China; [Wipf, David Paul] Amazon Shanghai AI Lab, Shanghai 200336, Peoples R China; [Cai, Chang] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Hubei, Peoples R China.

Corresponding institution: [Yu, ZL] South China Univ Technol, Sch Automat Sci & Engn, Guangzhou 510640, Guangdong, Peoples R China; Pazhou Lab, Guangzhou 510330, Guangdong, Peoples R China; [Wu, W] Alto Neurosci Inc, Los Altos, CA 94022 USA.

Keywords: Electroencephalography; Decoding; Classification algorithms; Finite impulse response filters; Filtering algorithms; Brain modeling; Feature extraction; Electroencephalography (EEG); brain-computer interface (BCI); emotion recognition; decoding; spatio-temporal filtering; sparse Bayesian learning

Abstract: Decoding brain activity from non-invasive electroencephalography (EEG) is crucial for brain-computer interfaces (BCIs) and the study of brain disorders. Notably, end-to-end EEG decoding has gained widespread popularity in recent years owing to the remarkable advances in deep learning research. However, many EEG studies suffer from limited sample sizes, making it difficult for existing deep learning models to effectively generalize to highly noisy EEG data. To address this fundamental limitation, this paper proposes a novel end-to-end EEG decoding algorithm that utilizes a low-rank weight matrix to encode both spatio-temporal filters and the classifier, all optimized under a principled sparse Bayesian learning (SBL) framework. Importantly, this SBL framework also enables us to learn hyperparameters that optimally penalize the model in a Bayesian fashion. The proposed decoding algorithm is systematically benchmarked on five motor imagery BCI EEG datasets (N=192) and an emotion recognition EEG dataset (N=45), in comparison with several contemporary algorithms, including end-to-end deep-learning-based EEG decoding algorithms. The classification results demonstrate that our algorithm significantly outperforms the competing algorithms while yielding neurophysiologically meaningful spatio-temporal patterns. Our algorithm therefore advances the state-of-the-art by providing a novel EEG-tailored machine learning tool for decoding brain activity.

Language: English

Learning multi-scale features for speech emotion recognition with connection attention mechanism

Authors: Chen, Zengzhao; Li, Jiawen; Liu, Hai; Wang, Xuyang; Wang, Hu; ...

Journal: Expert Systems with Applications, 2023, 214:118943, ISSN: 0957-4174

Corresponding author: Chen, Zengzhao (zzchen@ccnu.edu.cn)

Author affiliations: [Wang, Hu; Chen, Zengzhao; Li, Jiawen; Liu, Hai; Zheng, Qiuyu] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China; [Li, Jiawen; Zheng, Qiuyu] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China; [Wang, Hu; Chen, Zengzhao; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China; [Wang, Xuyang] Aviat Ind Corp, Luoyang Inst Electroopt Equipment, Luoyang 471023, Henan, Peoples R China.

Corresponding institution: [Zengzhao Chen] Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China; National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China

Keywords: Connection attention mechanism; Features fusion; Frame-level features; Speech emotion recognition; Utterance-level features

Abstract: Speech emotion recognition (SER) has become a crucial topic in the field of human-computer interaction. Feature representation plays an important role in SER, but there are still many challenges in feature representation, such as the inability to predict which features are most effective for SER and the cultural differences in emotion expression. Most previous studies use a single type of feature for the recognition task or conduct early fusion of features. However, a single type of feature cannot reflect the emotions of speech signals well. Moreover, because different features contain different information, direct fusion cannot integrate their advantages. To overcome these challenges, this paper proposes a parallel network for multi-scale SER based on a connection attention mechanism (AMSNet). AMSNet fuses fine-grained frame-level manual features with coarse-grained utterance-level deep features. Meanwhile, it adopts different speech emotion feature extraction modules according to the temporal and spatial features of speech signals, which enriches features and improves feature characterization. The network consists of a frame-level representation learning module (FRLM) based on the time structure and an utterance-level representation learning module (URLM) based on the global structure. In addition, an improved attention-based long short-term memory (LSTM) is introduced into FRLM to focus on the frames that contribute more to the final emotion recognition result. In URLM, a convolutional neural network with the squeeze-and-excitation block (SCNN) is introduced to extract deep features. Furthermore, the connection attention mechanism is proposed for feature fusion, which applies different weights to different features. Extensive experiments are conducted on the IEMOCAP and EmoDB datasets, and the results demonstrate the effectiveness and performance superiority of AMSNet. Our code will be publicly available at https://elksslcd79b4207c66ae6affb42465c239896eelksslengineer.csmar666.98tsg.com:4443/capsule/8636967/tree/v1. © 2022 Elsevier Ltd

Language: English
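The abstract above describes fusing features by applying different weights to different features. As a rough sketch of that general idea only, the snippet below fuses equal-length feature vectors with softmax-normalized weights. The function names and the fixed scores are hypothetical: in AMSNet the weights would be learned by the network, and its actual connection attention mechanism is not specified in this listing.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, scores):
    """Fuse equal-length feature vectors as a softmax-weighted sum.

    `features` is a list of vectors (for example, one frame-level and one
    utterance-level representation); `scores` holds one relevance score
    per vector. Returns the fused vector and the weights used.
    """
    weights = softmax(scores)
    dim = len(features[0])
    fused = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return fused, weights


if __name__ == "__main__":
    frame_level = [0.2, 0.8, 0.1]      # hypothetical frame-level feature
    utterance_level = [0.6, 0.4, 0.9]  # hypothetical utterance-level feature
    fused, weights = attention_fuse([frame_level, utterance_level], [1.0, 0.5])
    print("weights:", weights)
    print("fused:", fused)
```

Compared with plain concatenation or averaging, a weighted sum lets the more informative representation dominate for a given input, which is the motivation the abstract gives for avoiding direct fusion.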

Dual-feature-embeddings-based semi-supervised learning for cognitive engagement classification in online course discussions

Authors: Liu, Zhi; Kong, Weizheng; Peng, Xian; Yang, Zongkai; Liu, Sannyuya; ...

Journal: Knowledge-Based Systems, 2023, 259:110053, ISSN: 0950-7051

Corresponding author: Xian Peng; Zongkai Yang

Author affiliations: [Yang, Zongkai; Liu, Sannyuya; Liu, Zhi; Kong, Weizheng; Peng, Xian; Liu, Shiqi; Wen, Chaodong] Cent China Normal Univ, Fac Artificial Intelligence Educ, Natl Engn Res Ctr Educ Big Data, Wuhan, Peoples R China; [Yang, Zongkai; Liu, Sannyuya] Cent China Normal Univ, Fac Artificial Intelligence Educ, Natl Engn Res Ctr Learning, Wuhan, Peoples R China.

Corresponding institution: [Xian Peng; Zongkai Yang] National Engineering Research Center for Educational Big Data, Faculty of Artificial Intelligence in Education, Central China Normal University, PR China; National Engineering Research Center for E-Learning, Faculty of Artificial Intelligence in Education, Central China Normal University, PR China

Keywords: Cognitive engagement classification; Semi-supervised learning; Dual feature embedding; Linguistic Inquiry and Word Count (LIWC); Course discussion

摘要： Online course discussions contain abundant cognitive information from learners. Previous models required a large amount of labeled data to classify cognitive engagement from the perspective of semantic features alone. However, these models only contain semantic features but cannot fully represent textual information and have poor performance in cases of scarce labeled data. Moreover, cognitive psychological features imply important information that cannot be captured by semantic features. Therefore, this paper proposes a dual feature embedding-based semi-supervised cognitive classification method that exploits the additional inductive biases caused by implicit cognitive features to supplement generic semantic features. Additional inductive biases facilitate the propagation of labeled and unlabeled data and improve the consistency between unlabeled and augmented data. Unsupervised data augmentation (UDA) is used to obtain augmented data by inserting advanced noise into unlabeled data in semi-supervised learning. Furthermore, bidirectional encoder representations from transformers (BERT) are used to extract generic semantics, and linguistic inquiry and word count (LIWC) are adopted to fetch implicit cognitive features from discussion texts. Therefore, we refer to the proposed method as B-LIWC-UDA, sequentially fusing the dual features in the explicit and hidden levels to obtain dual feature embeddings. The cognitive engagement classification model was trained using supervised and consistent training methods. We conducted experiments using datasets obtained from two real-world online course discussions. The experimental results demonstrate that, in terms of major evaluation metrics, the proposed B-LIWC-UDA method performs better than state-of-the-art text classification methods used for identifying cognitive engagement. (c) 2022 Elsevier B.V. All rights reserved.

语种：英文
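As a rough illustration of the training objective this abstract describes, the sketch below combines a supervised cross-entropy term on labeled posts with a consistency term that penalizes disagreement between predictions on unlabeled posts and their augmented versions. It is a minimal NumPy sketch, not the authors' B-LIWC-UDA implementation: the function names, the KL-divergence form of the consistency term, and the weight `lam` are assumptions, and in practice the logits would come from a classifier over the fused BERT and LIWC features.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def supervised_loss(logits, labels):
    """Mean cross-entropy over the labeled batch."""
    p = softmax(logits)
    return float(-np.log(p[np.arange(len(labels)), labels]).mean())


def consistency_loss(logits_orig, logits_aug):
    """Mean KL(p_orig || p_aug) between predictions on unlabeled posts
    and on their augmented counterparts (assumed form of the consistency term)."""
    p = softmax(logits_orig)
    q = softmax(logits_aug)
    return float((p * (np.log(p) - np.log(q))).sum(axis=1).mean())


def total_loss(lab_logits, labels, unl_logits, aug_logits, lam=1.0):
    """Supervised term plus lam times the consistency term (lam is hypothetical)."""
    return supervised_loss(lab_logits, labels) + lam * consistency_loss(unl_logits, aug_logits)
```

The consistency term is zero when the model predicts the same distribution for a post and its augmented version, which is the behavior consistency training tries to encourage.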

Task-driven cleaning and pruning of noisy knowledge graph

Authors: Wu, Chao; Zeng, Zeyu; Yang, Yajing; Chen, Mao*; Peng, Xicheng; ...

Journal: Information Sciences, 2023, 646:119406. ISSN: 0020-0255

Corresponding authors: Chen, Mao; Liu, SNYY

Author affiliations: [Chen, Mao; Chen, M; Yang, Yajing; Peng, Xicheng; Liu, Sannyuya; Wu, Chao; Liu, SNYY; Zeng, Zeyu] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.

Corresponding affiliation: [Chen, M; Liu, SNYY] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.

Keywords: Noisy knowledge graph; Knowledge graph pruning; Multiple inheritance; Taxonomy

Abstract: Many knowledge graphs, especially those that are collaboratively or automatically generated, are prone to noise and cross-domain entries, which can impede domain-specific applications. Existing methods for pruning inaccurate or out-of-domain information from knowledge graphs often rely on topological graph-pruning strategies. However, these approaches have two major drawbacks: they may discard logical structure and semantic information, and they allow multiple inheritance. To address these limitations, this study introduces KGPruning, which is a novel approach that can effectively clean and prune noisy knowledge graphs by guiding tasks with a given set of concepts and automatically generating a domain-specific taxonomy. Specifically, KGPruning employs a graph hierarchy inference method that is based on the Agony model to precisely identify and eliminate noisy entries while striving to preserve the underlying hierarchy of semantic relations as much as possible. Furthermore, to establish a tree-structured taxonomy, KGPruning integrates semantic relations and structural characteristics to effectively eliminate out-of-domain information and multiple inheritance. Through extensive experimental evaluations conducted on open benchmark datasets as well as large-scale real-world problems, the superior performance of KGPruning over state-of-the-art methods is demonstrated on the task of pruning noisy knowledge graphs.

Language: English
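The abstract says KGPruning's hierarchy inference is based on the Agony model. As a hedged illustration only, the sketch below uses the definition of agony that is common in the graph-hierarchy literature, where an edge u → v under a ranking r costs max(r(u) − r(v) + 1, 0). The edge direction (specific concept → more general concept), the toy graph, and the function names are assumptions; the paper's actual procedure, which must also search for a good ranking, is not reproduced here.

```python
def edge_agony(rank, u, v):
    """Agony of edge u -> v: zero only when v is ranked strictly above u."""
    return max(rank[u] - rank[v] + 1, 0)


def total_agony(edges, rank):
    """Sum of edge agonies for a fixed ranking of the nodes."""
    return sum(edge_agony(rank, u, v) for u, v in edges)


def agonizing_edges(edges, rank):
    """Edges that contradict the ranking; candidates for removal as noise."""
    return [(u, v) for u, v in edges if edge_agony(rank, u, v) > 0]


# Toy "is-a" graph with one noisy back edge (hypothetical data).
edges = [("poodle", "dog"), ("dog", "animal"), ("animal", "poodle")]
rank = {"poodle": 0, "dog": 1, "animal": 2}

print(total_agony(edges, rank))      # 3
print(agonizing_edges(edges, rank))  # [('animal', 'poodle')]
```

Under this ranking the two forward edges cost nothing, and the whole cost comes from the back edge that closes the cycle, which is why such edges are natural pruning candidates.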

The Effect of Smart Classrooms on Project-Based Learning: A Study Based on Video Interaction Analysis

Authors: Dai, Zhicheng; Sun, Chengzhang; Zhao, Liang; Zhu, Xiaoliang

Journal: Journal of Science Education and Technology, 2023, 32(6):858-871. ISSN: 1059-0145

Corresponding author: Zhao, L

Author affiliations: [Sun, Chengzhang; Dai, Zhicheng] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China; [Zhao, Liang; Zhu, Xiaoliang] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China; [Sun, Chengzhang; Dai, Zhicheng; Zhao, Liang; Zhu, Xiaoliang] Cent China Normal Univ, Fac Artificial Intelligence Educ, 152 Luoyu Rd, Wuhan 430079, Hubei, Peoples R China.

Corresponding affiliation: [Zhao, L] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China; Cent China Normal Univ, Fac Artificial Intelligence Educ, 152 Luoyu Rd, Wuhan 430079, Hubei, Peoples R China.

Keywords: Smart classroom; Project-based learning; Video observation; Interaction analysis

Abstract: Classroom interaction affects the classroom atmosphere as well as students' behavior and participation, thus affecting the quality of classroom teaching. In traditional classrooms, inherent problems (e.g., inflexible tables and chairs, rigid multimedia consoles, and traditional software) have seriously restricted the overall quality of classroom interpersonal interaction. In recent years, the problem of enhancing classroom interaction has gradually attracted the attention of scholars. The application of project-based learning (PBL) in higher education is effective, but few studies have analyzed the differences in interaction between smart classrooms and traditional classrooms in PBL courses. In this study, through the proposed teacher-student classroom interaction behavior analysis framework, 20 sessions in smart classrooms and 20 sessions in traditional classrooms were encoded to illustrate the differences between interaction in these two types of classrooms. Furthermore, 765 student questionnaires on satisfaction with and participation in smart classrooms were collected to determine whether smart classrooms affect students' satisfaction and participation in PBL courses. The questionnaires were analyzed using SPSS 27.0. The results showed that there were significant differences in four dimensions of teachers' behavior, students' behavior, technology, and other interactions between the smart classroom and the traditional classroom. After taking PBL courses in a smart classroom, students were generally satisfied and thought that the smart learning environment could help them improve their thinking and learning. Suggestions on the further construction and application of smart classrooms are proposed.

Language: English

Orientation Cues-Aware Facial Relationship Representation for Head Pose Estimation via Transformer

Authors: Liu, Hai; Zhang, Cheng; Deng, Yongjian; Liu, Tingting; Zhang, Zhaoli; ...

Journal: IEEE Transactions on Image Processing, 2023, 32:6289-6302. ISSN: 1057-7149

Corresponding author: Zhang, C

Author affiliations: [Zhang, Cheng; Zhang, C; Zhang, Zhaoli; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China; [Deng, Yongjian] Beijing Univ Technol, Coll Comp Sci, Beijing 100124, Peoples R China; [Liu, Tingting] Hubei Univ, Sch Educ, Wuhan 430062, Hubei, Peoples R China; [Liu, Tingting; Li, You-Fu] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China.

Corresponding affiliation: [Zhang, C] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.

Keywords: Head; Transformers; Visualization; Computer architecture; Pose estimation; Task analysis; Semantics; Head pose estimation; attention mechanism; relationship perception; deep learning; transformer

Abstract: Head pose estimation (HPE) is an indispensable upstream task in the fields of human-machine interaction, self-driving, and attention detection. However, practical head pose applications suffer from several challenges, such as severe occlusion, low illumination, and extreme orientations. To address these challenges, we identify three cues from head images, namely, critical minority relationships, neighborhood orientation relationships, and significant facial changes. On the basis of the three cues, two key insights on head poses are revealed: 1) intra-orientation relationship and 2) cross-orientation relationship. To leverage the two key insights above, a novel relationship-driven method is proposed based on the Transformer architecture, in which facial and orientation relationships can be learned. Specifically, we design several orientation tokens to explicitly encode basic orientation regions. Besides, a novel token guide multi-loss function is accordingly designed to guide the orientation tokens as they learn the desired regional similarities and relationships. Experimental results on three challenging benchmark HPE datasets show that our proposed TokenHPE achieves state-of-the-art performance. Moreover, qualitative visualizations are provided to verify the effectiveness of the token-learning methodology.

Language: English

A Target Re-Identification Method Based on Shot Boundary Object Detection for Single Object Tracking

Authors: Miao, Bingchen; Chen, Zengzhao; Liu, Hai; Zhang, Aijun

Journal: Applied Sciences-Basel, 2023, 13(11):6422. ISSN: 2076-3417

Corresponding author: Zhang, AJ

Author affiliations: [Chen, Zengzhao; Miao, Bingchen; Liu, Hai] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China; [Chen, Zengzhao; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China; [Chen, Zengzhao] Cent China Normal Univ, Natl Intelligent Soc Governance Expt Base Educ, Wuhan 430079, Peoples R China; [Zhang, Aijun] China Telecom Corp Henan Branch, Zhengzhou 450016, Peoples R China.

Corresponding affiliation: [Zhang, AJ] China Telecom Corp Henan Branch, Zhengzhou 450016, Peoples R China.

Keywords: target re-identification; single object tracking; object detection; YOLO; DeepSORT

Abstract: With the advantages of simple model structure and performance-speed balance, the single object tracking (SOT) model based on a Transformer has become a hot topic in the current object tracking field. However, the tracking errors caused by the target leaving the shot, namely the target out-of-view, are more likely to occur in videos than we imagine. To address this issue, we proposed a target re-identification method for SOT called TRTrack. First, we built a bipartite matching model of candidate tracklets and neighbor tracklets optimized by the Hopcroft–Karp algorithm, which is used for preliminary tracking and judging whether the target leaves the shot. It achieves 76.3% mAO on the tracking benchmark Generic Object Tracking-10k (GOT-10k). Then, we introduced the alpha-IoU loss function in YOLOv5-DeepSORT to detect the shot boundary objects and attained 38.62% mAP75:95 on Microsoft Common Objects in Context 2017 (MS COCO 2017). Eventually, we designed a backtracking identification module in TRTrack to re-identify the target. Experimental results confirmed the effectiveness of our method, which is superior to most of the state-of-the-art models. © 2023 by the authors.

Language: English
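The abstract mentions an alpha-IoU loss for detecting shot-boundary objects. As a minimal sketch, assuming the commonly cited basic form L = 1 − IoU^α and axis-aligned boxes given as (x1, y1, x2, y2), the loss can be computed as below. The function names and the default `alpha=3.0` are assumptions for illustration; how TRTrack integrates the loss into YOLOv5-DeepSORT is not described in the abstract.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic alpha-IoU loss: 1 - IoU**alpha (alpha=1 gives the plain IoU loss)."""
    return 1.0 - iou(pred, target) ** alpha


# Two 2x2 boxes overlapping by half: IoU = 2 / 6 = 1/3.
print(alpha_iou_loss((0, 0, 2, 2), (1, 0, 3, 2), alpha=1.0))
print(alpha_iou_loss((0, 0, 2, 2), (1, 0, 3, 2)))
```

With α > 1 a poorly overlapping box is penalized almost as heavily as a non-overlapping one, so the loss keeps a steeper slope near IoU = 1 and puts relatively more weight on refining boxes that already overlap well.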

Actively learning dynamical systems using Bayesian neural networks

Authors: Tang, Shengbing; Fujimoto, Kenji; Maruta, Ichiro

Journal: Applied Intelligence, 2023, 53(23):29338-29362. ISSN: 0924-669X

Corresponding author: Tang, SB

Author affiliations: [Tang, SB; Tang, Shengbing] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China; [Fujimoto, Kenji; Maruta, Ichiro] Kyoto Univ, Dept Aeronaut & Astronaut, Kyoto 6158540, Japan.

Corresponding affiliation: [Tang, SB] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.

Keywords: Active learning; Bayesian neural network; Dynamical system; Model predictive control

Abstract: Learning dynamical systems in a sample-efficient way is important for model-based control. Active learning, which sequentially selects the most informative data to sample, is capable of greatly reducing sample complexity. The active learning problem for dynamical systems is hard as we cannot arbitrarily draw samples from the system’s state space under constraints of system dynamics. The existing approaches model the dynamical systems using Bayesian linear regression or Gaussian processes, which cannot be applied to complex dynamical systems with high-dimensional state spaces. In this article, we propose a new method to actively learn dynamical systems using Bayesian neural networks, which allow for modeling high-dimensional systems with complex dynamics. By maximizing the accumulated differential entropies along the trajectory, the proposed method iteratively searches for the most informative action sequence, which will yield informative samples when applied to the real system. With random exploration and model-based reinforcement learning as baselines, we verify the superiority of the proposed method via accuracy of one-step and multi-step predictions, the control performance, and the exploration efficiency of the state space on numerical benchmarks. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Language: English
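The selection criterion in this abstract, maximizing accumulated differential entropy along a trajectory, can be sketched under a simplifying assumption: if each predicted next state is approximated as a Gaussian with covariance Σ in d dimensions, its differential entropy is ½(d·ln(2πe) + ln det Σ). The code below scores candidate action sequences by summing that quantity over their predicted steps. The Gaussian approximation, the function names, and the toy covariances are assumptions; the paper's Bayesian neural network and its search procedure are not reproduced.

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of a Gaussian with covariance `cov`."""
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    d = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    return 0.5 * (d * np.log(2.0 * np.pi * np.e) + logdet)


def accumulated_entropy(step_covs):
    """Sum of per-step predictive entropies along one predicted trajectory."""
    return sum(gaussian_entropy(c) for c in step_covs)


def most_informative(candidates):
    """Name of the candidate action sequence with the largest accumulated entropy.

    `candidates` maps a sequence name to the list of predictive covariances
    the model would report at each step (hypothetical inputs).
    """
    return max(candidates, key=lambda name: accumulated_entropy(candidates[name]))


candidates = {
    "stay_near_known_states": [0.1 * np.eye(2), 0.1 * np.eye(2)],
    "visit_uncertain_states": [1.0 * np.eye(2), 2.0 * np.eye(2)],
}
print(most_informative(candidates))  # visit_uncertain_states
```

Sequences that steer the system through states where the model is uncertain have larger predictive covariances and therefore score higher, which matches the intuition of sampling where the model has the most to learn.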

Incorporating BERT With Probability-Aware Gate for Spoken Language Understanding

Authors: Mei, Jie; Wang, Yufan; Tu, Xinhui; Dong, Ming; He, Tingting

Journal: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31:826-834. ISSN: 2329-9290

Author affiliations: [Dong, Ming; Tu, Xinhui; Wang, Yufan; Mei, Jie; He, Tingting] Cent China Normal Univ, Hubei Prov Key Lab Artificial Intelligence & Smart, Wuhan 430079, Peoples R China; [Dong, Ming; Tu, Xinhui; Wang, Yufan; Mei, Jie; He, Tingting] Cent China Normal Univ, Natl Language Resources Monitor & Res Ctr Network, Wuhan 430079, Peoples R China; [Dong, Ming; Tu, Xinhui; Mei, Jie; He, Tingting] Cent China Normal Univ, Sch Comp Sci & Technol, Wuhan 430079, Peoples R China; [Wang, Yufan] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.

Keywords: Training; Correlation; Bit error rate; Semantics; Natural languages; Logic gates; Filling; Natural language processing; spoken language understanding; intent detection; slot filling

Abstract: Spoken language understanding (SLU) is an essential part of a task-oriented dialogue system, which mainly includes intent detection and slot filling. Some existing approaches obtain enhanced semantic representation by establishing the correlation between two tasks. However, those methods show little improvement when applied to BERT, since BERT has learned rich semantic features. In this paper, we propose a BERT-based model with the probability-aware gate mechanism, called PAGM (Probability Aware Gated Model). PAGM aims to learn the correlation between intent and slot from the perspective of probability distribution, which explicitly utilizes intent information to guide slot filling. Besides, in order to efficiently incorporate BERT with the probability-aware gate, we design the stacked fine-tuning strategy. This approach introduces a mid-stage before target model training, which enables BERT to get better initialization for final training. Experiments show that PAGM achieves significant improvement on two benchmark datasets, and outperforms the previous state-of-the-art results.

Language: English

Beyond the Traditional: A Systematic Review of Digital Game-Based Assessment for Students’ Knowledge, Skills, and Affections

Authors: Zhu, Sha; Guo, Qing; Yang, Harrison Hao

Journal: Sustainability, 2023, 15(5):4693. ISSN: 2071-1050

Corresponding author: Harrison Hao Yang

Author affiliations: [Zhu, Sha; Yang, Harrison Hao] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China; [Guo, Qing] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China; [Yang, Harrison Hao] SUNY Coll Oswego, Sch Educ, Oswego, NY 13126 USA.

Corresponding affiliation: [Harrison Hao Yang] National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China; School of Education, State University of New York at Oswego, Oswego, NY 13126, USA.

Keywords: assessment methodologies; digital games; 21st century skills; media in education

Abstract: Traditional methods of student assessment (SA) include self-reported surveys, standardized tests, etc. These methods are widely regarded by researchers as inducing test anxiety. They also ignore students' thinking processes and are not applicable to the assessment of higher-order skills. Digital game-based assessment (DGBA) is thought to address the shortcomings of traditional assessment methods. Given the advantages of DGBA, an increasing number of empirical studies are working to apply digital games for SA. However, there is a lack of any systematic review of DGBA studies. In particular, very little is known about the characteristics of the games, the content of the assessment, the methods of implementation, and the distribution of the results. This study examined the characteristics of DGBA studies, and the adopted games on SA in the past decade from different perspectives. A rigorous systematic review process was adopted in this study. First, the Web of Science (WOS) database was used to search the literature on DGBA published over the last decade. Then, 50 studies on SA were selected for subsequent analysis according to the inclusion and exclusion criteria. The results of this study found that DGBA has attracted the attention of researchers around the world. The participants of the DGBA studies were distributed across different educational levels, but the number of participants was small. Among all game genres, educational games were the most frequently used. Disciplinary knowledge is the most popular SA research content. Formative assessment modeling with process data and summative assessment using final scores were the most popular assessment methods. Correlation analysis was the most popular analysis method to verify the effectiveness of games on SA. However, many DGBA studies have reported unsatisfactory data analysis results. For the above findings, this study further discussed the reasons, as well as the meanings. In conclusion, this review showed the current status and gaps of DGBA in the SA application; directional references for future research of researchers and game designers are also provided.

Language: English

SMFNM: Semi-supervised multimodal fusion network with main-modal for real-time emotion recognition in conversations

Authors: Yang, Juan; Dong, Xuanxiong; Du, Xu

Journal: Journal of King Saud University - Computer and Information Sciences, 2023, 35(9):101791. ISSN: 1319-1578

Corresponding author: Yang, J

Author affiliations: [Yang, J; Dong, Xuanxiong; Yang, Juan] Wuhan Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan 430065, Hubei, Peoples R China; [Du, Xu] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.

Corresponding affiliation: [Yang, J] Wuhan Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan 430065, Hubei, Peoples R China.

Keywords: Real-time Emotion recognition in conversations; Semi-supervised learning; Main modal; Multimodal interaction; Multimodal fusion network

Abstract: Real-time emotion recognition in conversations (ERC), which relies on only the historical utterances to achieve ERC, has recently gained increasing attention due to its significance in providing real-time empathetic services. Although utilizing multimodal information can mitigate the issues of unimodal approaches, few real-time ERC studies consider the differences in representation ability of different modalities and explore comprehensive conversational context from different perspectives based on different structures. Furthermore, the heavy annotation cost makes it difficult to collect sufficient labeled data, which also limits the performance of current supervised ERC approaches. To address these issues, we propose a novel framework SMFNM for real-time ERC, which integrates semi-supervised learning with multimodal fusion under the guidance of main-modal. Specifically, SMFNM utilizes additional unlabeled data to extract high-quality intra-modal representations, and implements cross-modal interaction to capture complementary information to enhance the audio representations. Then SMFNM employs the directed acyclic graph and the Gated Recurrent Units for exploring more accurate conversational context from both the multimodal and main-modal perspectives, respectively. Finally, these two types of contextual features are fused for emotion identification. Extensive experiments on benchmark datasets (i.e., IEMOCAP (4-way), IEMOCAP (6-way) and MELD) demonstrate the effectiveness, superiority and rationality of our SMFNM. © 2023 The Author(s). Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Language: English
