Linguistic evidence of plagiarism in Spanish journalism

  • Sheila Queralt, Laboratorio SQ Lingüistas Forenses (Spain)
  • Montse Marquina Zarauza, Laboratorio SQ Lingüistas Forenses (Spain)
  • Roser Giménez García, Laboratorio SQ Lingüistas Forenses (Spain)
Keywords: Journalism, information, plagiarism, forensic linguistics.

Abstract

Legal disputes over possible plagiarism require the expertise of a forensic linguist. Studies in plagiarism detection have established a maximum threshold of 50% lexical similarity between independently produced texts. This paper explores whether journalistic articles require a specific similarity threshold, since they share informative content (“what”, “who”, “when”, “where”, “how”, and “why”). To this end, four quantitative linguistic variables are applied to two corpora structured around 10 different topics: a study corpus comprising 50 articles and a case corpus including 20 texts from a real case. From the study corpus, a threshold is extracted for each variable, reflecting the percentage of coincidence that can be expected between independent texts. These thresholds are then applied to the case corpus to determine whether they allow all the plagiarism cases to be detected.

Published
2018-11-05
How to Cite
Queralt S., Marquina Zarauza M. and Giménez García R. (2018). Linguistic evidence of plagiarism in Spanish journalism. Estudios sobre el Mensaje Periodístico, 24(2), 1559-1578. https://doi.org/10.5209/ESMP.62234
Section
Articles