This paper explores the detection of frame-wise instances of violence in both audio and visual modalities, where only clip-level labels are available. Previous works selected a fixed number of frames for objective optimization to model frame-level features, and applied a straightforward fusion strategy to aggregate audio and visual information. However, these two issues, namely Constant Frames Selection and Vulnerable Fusion, significantly impair the network's detection performance. To address these issues, we present a novel framework called Frame...