Fraunhofer SIT at CheckThat! 2023: Enhancing the Detection of Multimodal and Multigenre Check-Worthiness Using Optical Character Recognition and Model Souping

Author: Frick, Raphael; Vogel, Inna; Choi, Jeong-Eun
Type: Conference Paper
Abstract: This paper describes the approach developed by the Fraunhofer SIT team in the CLEF-2023 CheckThat! lab challenge for check-worthiness detection in multimodal and unimodal content. Check-worthiness detection aims to facilitate manual fact-checking efforts by prioritizing the statements that fact-checkers should consider first. It can also be seen as the first step of a fact-checking system. Our approach was ranked first in Task 1A and second in Task 1B. The goal of Task 1A is to determine whether a claim in a tweet that contains both a snippet of text and an image is worth fact-checking. For this task, we propose a novel way to detect check-worthiness that takes advantage of two classifiers, each trained on a single modality. For image data, extracting the embedded text with an OCR analysis proved to perform best. By combining the two classifiers, the proposed solution placed first in Task 1A with an F1 score of 0.7297 on the private test set. The aim of Task 1B is to determine whether a text snippet from a political debate should be assessed for check-worthiness. Our best-performing method takes advantage of an ensemble classification scheme centered on Model Souping. When applied to the English data set, our submitted model achieved an overall F1 score of 0.878 and was ranked as the second-best model in the competition.
Conference: Conference and Labs of the Evaluation Forum 2023
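
The abstract names Model Souping as the core of the Task 1B ensemble but does not spell out the recipe. As a rough illustration of the general idea only, the sketch below averages the parameters of several fine-tuned models that share one architecture (a "uniform soup"). The function name `uniform_soup`, the plain-dictionary parameter format, and the toy values are assumptions made for this example; they are not taken from the paper, and the authors' actual procedure may differ.

```python
from typing import Dict, List

# Hypothetical stand-in for a model checkpoint: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def uniform_soup(state_dicts: List[StateDict]) -> StateDict:
    """Average parameters element-wise across models with the same architecture."""
    if not state_dicts:
        raise ValueError("need at least one model")

    reference = state_dicts[0]
    for sd in state_dicts[1:]:
        # All ingredients must expose the same parameters with the same sizes.
        if sd.keys() != reference.keys():
            raise ValueError("all models must share the same parameter names")
        for name in reference:
            if len(sd[name]) != len(reference[name]):
                raise ValueError(f"size mismatch for parameter {name!r}")

    n = len(state_dicts)
    soup: StateDict = {}
    for name in reference:
        # zip(*) groups the i-th weight of every model together.
        columns = zip(*(sd[name] for sd in state_dicts))
        soup[name] = [sum(col) / n for col in columns]
    return soup


# Toy example with two made-up "fine-tuned" checkpoints.
model_a = {"w": [1.0, 2.0], "b": [0.0]}
model_b = {"w": [3.0, 4.0], "b": [1.0]}
soup = uniform_soup([model_a, model_b])
```

Unlike a prediction-level ensemble, the result is a single set of weights, so inference costs the same as running one model.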