Post No.
912701

Low-quality Fake Audio Detection through Frequency Feature Masking / Prof. Il-Youp Kwak (Chung-Ang University)

Author
kustatri
Views
895
Posted
2023.10.23
Modified
2024.02.13
Held on April 14, 2023

Abstract:
The first Audio Deep Synthesis Detection Challenge (ADD 2022) dealt with audio deepfake detection, audio deep synthesis, an audio fake game, and adversarial attacks. Our team participated in track 1, classifying bona fide and fake utterances in noisy environments. Through exploratory data analysis, we found that noisy signals appear in similar frequency bands across the given voice samples. If a model is trained to rely heavily on information in the frequency bands where noise exists, its performance will be poor. In this presentation, we propose a data augmentation method, Frequency Feature Masking (FFM), which randomly masks frequency bands. FFM makes a model robust by preventing it from relying on specific frequency bands, and it reduces overfitting. We applied FFM and mixup augmentation to five spectrogram-based deep neural network architectures that perform well for spoofing detection, using mel-spectrogram and constant Q transform (CQT) features. Our best submission achieved an EER of 23.8% and ranked 3rd on track 1. To demonstrate the usefulness of the proposed FFM augmentation, we further experimented with it on the ASVspoof 2019 Logical Access (LA) dataset.
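
The core idea of FFM is to zero out randomly chosen frequency bands of the input spectrogram so the network cannot over-rely on noise-dominated bands. Below is a minimal sketch of that kind of frequency-band masking, assuming a 2-D feature array of shape (frequency bins, time frames) such as a mel-spectrogram or CQT; the function name and the hyperparameters (number of masks, maximum band width) are illustrative assumptions, not the exact parameterization presented in the talk.

    import numpy as np

    def frequency_feature_masking(spec, num_masks=2, max_band_width=20,
                                  mask_value=0.0, rng=None):
        """Randomly mask contiguous frequency bands of a spectrogram.

        spec: 2-D array of shape (n_freq_bins, n_time_frames),
              e.g. a mel-spectrogram or CQT feature.
        num_masks, max_band_width: assumed hyperparameters for illustration.
        """
        rng = np.random.default_rng() if rng is None else rng
        masked = spec.copy()
        n_freq = masked.shape[0]
        for _ in range(num_masks):
            width = int(rng.integers(0, max_band_width + 1))   # band width in bins
            start = int(rng.integers(0, max(1, n_freq - width)))  # band start bin
            masked[start:start + width, :] = mask_value          # blank the band
        return masked

    # Example: augment an 80-bin x 400-frame mel-spectrogram
    mel = np.random.rand(80, 400).astype(np.float32)
    augmented = frequency_feature_masking(mel, num_masks=2, max_band_width=20)

In practice such masking would be applied on the fly during training, alongside mixup, before the features are fed to the spectrogram-based networks mentioned in the abstract.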