FCC-MF: Detecting Violence in Audio-Visual Context with Frame-Wise Cluster Contrast and Modality-Stage Flooding

Type:
Conference paper
Authors:
Jiaqing He;Yanzhen Ren;Liming Zhai;Wuyang Liu
Author affiliations:
[Jiaqing He; Wuyang Liu] School of Cyber Science and Engineering, Wuhan University; Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education
[Liming Zhai] School of Computer Science, Central China Normal University
[Yanzhen Ren] Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education; School of Cyber Science and Engineering, Wuhan University
Language:
English
Keywords:
Violence detection;weakly-supervised learning;audio-visual fusion;contrastive learning
Year:
2024
Pages:
8346-8350
Conference name:
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Proceedings title:
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference date:
14 April 2024
Conference location:
Seoul, Republic of Korea
Publisher:
IEEE
ISBN:
979-8-3503-4486-8
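Institutional attribution:
This university is listed as an additional (non-primary) affiliation
Department:
School of Computer Science
Abstract: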
This paper explores the detection of frame-wise instances of violence in both audio and visual modalities, where only clip-level labels are available. Previous works selected a fixed number of frames for objective optimization to model frame-level features, and applied a straightforward fusion strategy to aggregate audio and visual information. However, these two issues, namely Constant Frames Selection and Vulnerable Fusion, significantly impair the network’s detection performance. To address these issues, we present a novel framework called Frame...
