| Abstract |
|---|
| With the growing use of AI, deepfakes are becoming an increasingly prevalent threat. At the same time, the performance of most detectors drops significantly on unseen data, while generation models keep improving and leave fewer artefacts. We examined deepfakes published on Instagram using the SocialDF dataset. In addition to analysing the deepfakes in the frequency domain with audio deepfake detectors, we transcribed the speech and analysed both the text (e.g. emotion and topics) and the audio content (e.g. emotion and music genre). We found that audio deepfake detectors struggle to identify real-world deepfakes on Instagram. Furthermore, current audio deepfake detection relies on audio artefacts alone and does not exploit content. We therefore propose using both the speech recording and its content; this approach improves results on real-world data and provides an explanation for the classification. Using content information, we outperformed frequency-based detection, achieving an F1-score of 74.3%. |
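
The content-based approach summarised above could look roughly like the following minimal sketch. It assumes Whisper for transcription and an off-the-shelf Hugging Face text-classification pipeline for the emotion feature; the model names, input file, and final classifier are illustrative placeholders, not the exact setup used in the paper.

```python
# Minimal sketch: transcribe speech, extract a content feature (text emotion),
# and combine content with audio features in a simple classifier.
# Assumptions: Whisper for ASR, a public emotion model for text analysis.
import whisper
from transformers import pipeline
from sklearn.linear_model import LogisticRegression

# 1) Transcribe the speech track of the clip (file name is a placeholder).
asr_model = whisper.load_model("base")
transcript = asr_model.transcribe("instagram_clip.wav")["text"]

# 2) Extract a content feature from the transcript, e.g. emotion scores
#    (model choice is an assumption, not the paper's exact model).
text_emotion = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",
)
emotion_scores = text_emotion(transcript)

# 3) Feed content features together with frequency-domain audio features
#    into a classifier for the real/fake decision (training data assumed).
# clf = LogisticRegression().fit(X_train, y_train)
# prediction = clf.predict(x_clip)
```

This sketch only illustrates the pipeline's shape; topic, audio-emotion, and music-genre features described in the abstract would be extracted analogously and concatenated into the feature vector before classification.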