| Abstrakt | Authorship Verification (AV) is a key area of research in digital text forensics, which addresses the fundamental question of whether two texts were written by the same person. Numerous computational approaches have been proposed over the last two decades in an attempt to address this challenge. However, existing AV methods often suffer from high complexity, low explainability, and especially from a lack of clear scientific justification. We propose a simpler method based on modeling the grammar of an author following Cognitive Linguistics principles. These models are used to calculate λG (LambdaG): the ratio of the likelihoods of a document given the candidate’s grammar versus given a reference population’s grammar. Our empirical evaluation, conducted on 12 datasets and compared against seven baseline methods, demonstrates that LambdaG achieves superior performance, including against several neural network-based AV methods. LambdaG is also robust to small variations in the composition of the reference population and provides interpretable visualizations, enhancing its explainability. We argue that its effectiveness is due to the method’s compatibility with Cognitive Linguistics theories, predicting that a person’s grammar is a behavioral biometric. |
|---|