Learning multi-scale features for speech emotion recognition with connection attention mechanism

Publication type:
Journal article
Authors:
Chen, Zengzhao;Li, Jiawen;Liu, Hai;Wang, Xuyang;Wang, Hu;...
Corresponding author:
Chen, Zengzhao(zzchen@ccnu.edu.cn)
Author affiliations:
[Wang, Hu; Chen, Zengzhao; Li, Jiawen; Liu, Hai; Zheng, Qiuyu] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.
[Li, Jiawen; Zheng, Qiuyu] Cent China Normal Univ, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China.
[Wang, Hu; Chen, Zengzhao; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr E Learning, Wuhan 430079, Peoples R China.
[Wang, Xuyang] Aviat Ind Corp, Luoyang Inst Electroopt Equipment, Luoyang 471023, Henan, Peoples R China.
Corresponding author affiliations:
[Zengzhao Chen] Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China
[Zengzhao Chen] National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China
Language:
English
Keywords:
Connection attention mechanism;Features fusion;Frame-level features;Speech emotion recognition;Utterance-level features
Journal:
Expert Systems with Applications
ISSN:
0957-4174
Year:
2023
Volume:
214
Pages:
118943
Funding:
The authors thank the editor and anonymous reviewers for their valuable suggestions. This work has been supported by the National Natural Science Foundation of China (Grant No. 62077022, 61875068, 62211530433, 62177018, 62011530436, 62277041, 62005092, 62077020), the National Teacher Development Collaborative Innovation Experimental Base Construction Research Project of Central China Normal University (No. CCNUTEIII 2021-21), and the National Key R&D Program of China (2021YFC3340802). We would like to thank Dr. Tingting Liu and Prof. Zhaoli Zhang for assistance in advice.
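Institutional attribution:
This university is the first-listed institution
Department affiliation:
National Engineering Research Center for E-Learning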
Abstract:
Speech emotion recognition (SER) has become a crucial topic in the field of human–computer interactions. Feature representation plays an important role in SER, but there are still many challenges in feature representation such as the inability to predict which features are most effective for SER and the cultural differences in emotion expression. Most previous studies use a single type of feature for the recognition task or conduct early fusion of features. However, a single type of feature cannot well reflect the emotions of speech signals. A...
