Fake News Detection with the New German Dataset "GermanFakeNC"

Author: Vogel, Inna; Jiang, Peter
Type: Conference Paper
Abstract: The spread of misleading information and “alternative facts” on the internet has gained considerable importance worldwide over the last decade. In recent years, several attempts have been made to counteract fake news through automatic classification with machine learning models. These models, however, require labeled data. The scarcity of available corpora for predictive modeling is a major stumbling block in this field, especially in languages other than English. Our contribution is twofold. First, we introduce a new publicly available German dataset, the “German Fake News Corpus” (GermanFakeNC), for the task of fake news detection. It consists of 490 manually fact-checked articles, in which every false statement is verified claim by claim against authoritative sources. Our ground truth for trustworthy news consists of 4,500 news articles from well-known mainstream news publishers. Second, we apply a Convolutional Neural Network (CNN, κ = 0.89) and the widely used Support Vector Machine (SVM, κ = 0.72) to detect fake news. We hope that our approach will stimulate progress in fake news detection and claim verification across languages.
Conference: 23rd International Conference on Theory and Practice of Digital Libraries (TPDL), 2019, Oslo
Reference: Doucet, A.: Digital Libraries for Open Knowledge. 23rd International Conference on Theory and Practice of Digital Libraries, TPDL 2019. Proceedings: Oslo, Norway, September 9-12, 2019. Cham: Springer Nature, 2019. (Lecture Notes in Computer Science 11799), pp. 288-295
Identifier: ISBN 9783030307608
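
Note on the reported κ values: the abstract gives a κ score for each classifier without defining it. Assuming κ denotes Cohen's kappa (an assumption, not stated in this record), the following minimal Python sketch shows how such a score could be computed from true and predicted labels. The function name `cohen_kappa` and the toy labels are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of items on which both sequences agree.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected chance agreement, derived from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case: both sequences contain a single identical class.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels for illustration only (not data from GermanFakeNC).
y_true = ["fake", "fake", "real", "real", "real", "real"]
y_pred = ["fake", "real", "real", "real", "real", "real"]
kappa = cohen_kappa(y_true, y_pred)
print(f"kappa = {kappa:.3f}")
```

Unlike plain accuracy, kappa discounts agreement that would be expected by chance, which matters when one class (here, trustworthy news) heavily outnumbers the other.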