MSRANet: Learning discriminative embeddings for speaker verification via channel and spatial attention mechanism in alterable scenarios

Publication type:

Journal article

Authors:

Zheng, Qiuyu;Chen, Zengzhao;Liu, Hai;Lu, Yuanyuan;Li, Jiawen;...

Corresponding author:

Zengzhao Chen

Author affiliations:

[Lu, Yuanyuan; Chen, Zengzhao; Li, Jiawen; Zheng, Qiuyu; Liu, Hai] Cent China Normal Univ, Fac Artificial Intelligence Educ, Wuhan 430079, Peoples R China.

[Lu, Yuanyuan; Li, Jiawen; Zheng, Qiuyu] Cent China Normal Univ, Natl Engn Res Ctr Educ Big Data, Wuhan 430079, Peoples R China.

[Chen, Zengzhao; Liu, Hai] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China.

[Liu, Tingting] Hubei Univ, Sch Educ, Wuhan 430062, Peoples R China.

Corresponding author affiliations:

[Zengzhao Chen] Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China; National Engineering Research Center for E-Learning, Central China Normal University, Wuhan 430079, China

Language:

English

Keywords:

Alterable scenarios;Attention mechanism;Embedding extraction;Frame-level features;Speaker verification

Journal:

Expert Systems with Applications

ISSN:

0957-4174

Year:

2023

Volume:

217

Pages:

119511

DOI:

10.1016/j.eswa.2023.119511

Funding:

The authors thank the editor and anonymous reviewers for their valuable suggestions. This work has been supported by the National Natural Science Foundation of China (Grant Nos. 62077022, 61875068, 62211530433, 62177018, 62011530436, 62277041, 62005092, 62077020), the National Teacher Development Collaborative Innovation Experimental Base Construction Research Project of Central China Normal University (No. CCNUTEIII 2021-21), and the National Key R&D Program of China (2021YFC3340802), and was supported in part by the National Natural Science Foundation of Hubei Province, China under Grant No. 2022CFB971, the China Unicom Hubei Branch Bilateral Cooperation Research Funds under Grant 2021111002002004, and the “Universities Helping Counties” Research Funds of Hubei Province, China under Grant BXLBX0192.

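Institutional attribution:

This university is the first-listed institution

Department affiliation:

National Engineering Research Center for E-Learning

Abstract: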

Speaker embeddings have become the most popular feature representation in speaker verification. Improving the robustness of speaker embedding extraction systems is a crucial problem. This paper proposes a multi-scale residual aggregation network (MSRANet), a simple but efficient network with triplet input and triplet loss. Two different aggregation strategies are utilized in frame-level feature extractors to capture long-term variations in speaker characteristics. An attention mechanism is employed to filter a large number of parameters in temporal and frequency dimensions, which ...
