| Abstract |
|---|
| Just as humans are susceptible to illusions, audio deepfake detectors can be deceived by adversarial attacks into making incorrect judgements. We trained and tested various detection tools against different types of adversarial attacks developed in recent years. Previous studies have used the equal error rate (EER) to evaluate the effectiveness of a given attack. Since it is important to determine whether an attack tricks the detector into classifying spoofs as bona fide or vice versa, we used several evaluation metrics to perform a more in-depth analysis. We investigated three adversarial attacks: FGSM, PGDL2 and Malafide. Furthermore, we created an adaptive version of Malafide whose filter size changes depending on the success of a given attack. The detection models used were LCNN, SpecRNet, RawNet3 and an SSL-based detector combining Wav2Vec2 and AASIST; this selection covers a broad range of detector architectures. All detectors were affected by FGSM and PGDL2 in the white-box setting, with LCNN and SpecRNet affected the most. Malafide degraded performance in both white-box and black-box settings, fooling LCNN and SpecRNet into classifying spoofs as bona fide. Conversely, RawNet3 and Wav2Vec2+AASIST were tricked by Malafide into classifying bona fide samples as spoofs, resulting in a high number of false alarms while spoofs were still correctly identified. Both models were affected by the filter size: the stronger the degradation of sample quality, the worse the results, with Wav2Vec2+AASIST more affected than RawNet3. Through adaptive adversarial training, the detectors partially recovered their performance while maintaining their accuracy on the original data. |