The publication of press releases as journalistic information. Comparative study of two Spanish newspapers

The distinction of what constitutes a “news event” can give rise to many interpretations. In this world of telematic accessibility, which is a consequence of globalization, events and occurrences of all kinds can be categorized as news simply by dressing them up as news. According to the style and communications manuals, the news has its own characteristics: relevance, social interest and proximity, among others. Press releases have become perfected as a result of increasingly sophisticated public relations agencies, and with them the thin line between information and advertising is now blurred. In this paper, we compare press releases issued by public and private companies with news briefs in the Economy sections of newspapers. As will be seen, many of them coincide and bear some similarities. The sample uses news briefs published during the first half of 2014 in El Mundo and La Vanguardia, Spanish pay-to-read newspapers that feature prominently in the analysis by the Estudio General de Medios (General Media Study). The methodology makes use of the Maple program with its DetectPlagiarism command to perform an ad hoc comparison of the texts. The default copy threshold for DetectPlagiarism is 0.35. The similarity indices between the news briefs and the press releases of La Vanguardia and El Mundo indicate values higher than this threshold.


Introduction
A group of journalism students under the guidance of reporter Wendy Bacon analysed the content of a dozen Australian newspapers for a week, and they concluded that more than half of the news published (55%) had originated in a press release. Their research, sponsored by the Australian Center for Independent Journalism and the Crikey portal, led them to discover that there were staff journalists who signed pieces that were practically literal reproductions of information sent to them by companies and institutions. Of the 2,203 articles analysed, more than 500 contained no additional points of view, sources or content beyond what was provided by the press release (Bacon, 2010).
This issue has become a major concern for media researchers, especially in precarious markets such as Spain, where editors are required to produce much while relying on scant resources and personnel.
Drawing on the high degree of "manipulation" that occurs in the informational chain (Durandin, 1982) as well as the digital chain (Scholz, 2013), other authors have made useful contributions about the use and abuse of agency information in a venue that is underscored by urgency and scarcity of means: the electronic editions of the newspapers.
It is for this reason that some authors refer to the misinformation that is generated in this way (Starr, 2004, 2009), because it is their opinion that what is really disseminated by media that operate on all platforms is advertising, propaganda and promotional content, which is ultimately not information that has been corroborated and prepared rigorously (Schwoebel, 1971; McChesney, 1999; Manning, 2001). However, no works to date have focused exclusively on this widespread practice in regard to the news brief, a genre that is frequently subject to it, as will be seen below. This line of research was not even pursued by researchers who have suggested it in their own papers (De Fontcuberta, 2000: 16; Parratt, 2008).
The news brief is not listed in the Teoría de los géneros periodísticos [Theory of the journalistic genres], by Llorenç Gomis (2008). However, it obviously exists. It is often presented as the antithesis of reporting, the "all-embracing" genre, the quintessential star (García Márquez, 2010). Nevertheless, news briefs fulfil a function as important as any other format. Authors such as Grijelmo (2006) highlight that it is a news item that barely consists of a lede, that is, the "essential core of the news". Therefore, it can be understood not as a summary, but as an outline of an informative event that is communicated with great economy of words. Its function is to provide an immediate and clear account of what is fundamental while generating interest and attracting attention so that the public can continue to read, watch and listen (Jaraba, 2009).
Generally, the news brief covers two of the 6 W's (who, what, when, where, why and how), namely who and what. But this is not a fixed norm: sometimes it needs to address other W's, especially if it is the follow-up to a news item that has been gaining traction in previous days. In any event, it can be written neatly and elegantly. For this reason, Grijelmo asks that the editors and copywriters be attentive to the news briefs that often appear in the papers so that all doubts about the news can be cleared up. In any case, the information in the brief (which is not the same as brief information) should follow the so-called informative style, which is characterized by being "sober and concise, objective, where there is no place for the journalist's ego" (De Fontcuberta, 2000).
The newspaper stylebooks studied in this research treat the news brief in the following way. La Vanguardia (2004) reduces it to a subsection of the news genre (the others are the report, the interview, the chronicle, the analysis, the review and the op-ed): The news briefs (or newsflashes) are pieces that gather news in short form. Grouped into a block of one or two columns, news briefs (minimum of two) have a similar length and must contain all the minimum elements for understanding the information. They have no line breaks. These pieces are not dated, but authorship must be indicated at the end, separated from the text by a long dash between spaces: if it is from an editor, it is signed with initials or with an initial and surname (never with the whole name), or with the word "Editorial"; if it is an agency, the corresponding agency is indicated (La Vanguardia, 2004: 34).
The news brief is not listed independently in the El Mundo (2002) style book, but it does explain the content of what the "basic" news should be: The news or basic information. This was the most common genre in the daily press, which applied more strict considerations to phrases and short paragraphs, a straight opening and developing the story in a way that can be chronological or pyramidal, depending on the greater or lesser complexity of the informative elements (El Mundo, 2002: 6).

Objectives and methodology
The objective of this research is to compare the press releases written by companies and public entities with news briefs about these organizations published in the Economy sections of the media included in this study (El Mundo and La Vanguardia). This comparison is intended to detect whether the press release is processed and corroborated before being published or if, on the contrary, the press release is distributed in the same form as it arrives in the newsroom. In this way we seek to establish numerically what percentage of the published news brief coincides with or is similar to the press release.
Although our efforts are not intended to accuse newspapers or news agencies of plagiarism, the tools that will be used have indeed been conceived for detecting plagiarism. In recent years, a large number of programs and applications have been created for this purpose, with the aim of reducing and avoiding copying in publications such as scientific journals (Sánchez-Vega, Villatoro-Tello, Montes-y-Gómez, Villaseñor-Pineda and Rosso, 2013).
One of the most widespread tools is iThenticate, whose comparative database contains billions of articles that the editors of more than five hundred media sources use to detect plagiarism and authorship problems.
Bibliotècnica (https://bibliotecnica.upc.edu), the digital library of the Universitat Politècnica de Catalunya - BarcelonaTech, has drawn up an extensive list of programs that can be found on the internet for detecting plagiarism. In addition, there are programs that are content verification tools for journalists, such as Plagtracker, which is recommended by the International Center for Journalists.
In this environment, it is advisable to be familiar with Maple, the computer algebra system owned by Maplesoft. It is also helpful to understand the concept of a command in computer programs and applications, that is, each of the functions and orders whose purpose is to perform a specific task.
More specifically, in regard to the framework of this investigation's methodology, some of the Maple commands used are described below:
• SimilarityScore: Compares the use of common words between two texts and returns a value or score between 0 and 1. A score of 0 indicates that there is no common word between the two texts; a score of 1 indicates that there is a total match of words.
• JaccardCoefficient: Compares two texts. The result is a number between 0 and 1. The CosineCoefficient and DiceCoefficient commands are similar.
• DetectPlagiarism: Establishes whether plagiarism exists between two texts.
The sample covered the first half of 2014. El Mundo and La Vanguardia were chosen because these general information pay-to-read papers in Spain have more business news briefs as a fixed part of a section (Economy). In terms of the number of daily readers for this category, they occupy the second and third positions, respectively, behind El País, which was discarded because it lacked both the volume and the desired regularity in publishing news briefs in its Economy section (AIMC, 2017: 8). In some cases, the same news brief appeared in both newspapers on the same day. However, our priority lay not in whether news briefs coincided in the newspapers being compared, but rather in analysing news briefs sourced from a company press release, that is, briefs that originated with a commercial enterprise and were sent to the paper for promotional purposes.
News briefs were chosen at random in order to increase the probability of arriving at findings more likely to be representative of a group (Csikszentmihalyi and Larson, 2014: 35-54). The selection of news briefs proceeded as follows. We began with the Wednesday, January 1, 2014 editions of the two newspapers. We then took a selection from the next weekday, Thursday, but in the following week, i.e., Thursday, January 9, 2014. Thus, in a staggered fashion, we covered half a year for the two dailies.
In most cases, the authorship of the news brief was an agency. Europa Press is the agency preferred by La Vanguardia and El Mundo in their respective Economy sections, more so than EFE. They also use the signature "Agencies" (without specifying) or "Editorial Staff" (or, in the case of El Mundo, E. M.). As explained previously, there are many applications for detecting plagiarism, and research in this field is extensive and growing (Barrón-Cedeño, Gupta and Rosso, 2013). This investigation compares the news brief with the press release or the agency dispatch, both of which can easily be located on the internet. Thus, the problem was less complex than is generally the case with scientific literature.
Based on these considerations, we chose Maple's EssayTools library for detecting plagiarism in the field of journalism. This tool was chosen because it met the needs of this study (namely, comparing two texts) and because of the flexibility of programming in the Maple environment.
In this context, the SimilarityScore command is used to compare the use of words in two or more texts, and it returns a matrix of scores for each pair of texts. A score of 0 indicates that there is no overlap between the two. Conversely, a score of 1 indicates that there is a total overlap. However, a score of 1 does not necessarily imply that both texts are identical.
If, for example, the texts "The tall day" and "The tall dog" are considered and a list is made of all the different words that appear in both texts, the list is [the, day, tall, dog]. Next, two vectors are generated that indicate how many times each of these words appears in each text. The vector for the first text is [1 1 1 0], since "the" appears once, as do "day" and "tall", while "dog" does not appear. The vector for the second text is [1 0 1 1]. These two vectors are then used in a mathematical operation, which differs depending on the method used. In the case of the cosine coefficient, given two vectors $v_1$ and $v_2$, the coefficient is calculated as the scalar product of the two vectors divided by the product of their norms, that is:

$$\cos(v_1, v_2) = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert \, \lVert v_2 \rVert}$$

where:

$$v_1 \cdot v_2 = \sum_{i} v_{1,i}\, v_{2,i}, \qquad \lVert v \rVert = \sqrt{\sum_{i} v_i^{2}}$$

For the vectors that were used as an example, we have:

$$v_1 \cdot v_2 = 1 \cdot 1 + 1 \cdot 0 + 1 \cdot 1 + 0 \cdot 1 = 2, \qquad \lVert v_1 \rVert = \lVert v_2 \rVert = \sqrt{3}$$

Thus:

$$\cos(v_1, v_2) = \frac{2}{\sqrt{3}\,\sqrt{3}} = \frac{2}{3} \approx 0.67$$
Each of these three coefficients has a binary version in which the number of times a word appears is not important, but whether or not it does appear is indeed important.
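The worked example can also be checked numerically. The following is a minimal Python sketch, not the Maple implementation: the function names are hypothetical, and the tokenization (lower-casing and splitting on whitespace) is an assumption that may differ from what EssayTools does internally. It computes the cosine coefficient on word counts and the binary version of the Jaccard coefficient, in which only the presence of a word matters.

```python
import math
from collections import Counter


def word_counts(text):
    """Count whitespace-separated words after lower-casing (assumed tokenization)."""
    return Counter(text.lower().split())


def cosine_coefficient(text_a, text_b):
    """Scalar product of the two count vectors divided by the product of their norms."""
    a, b = word_counts(text_a), word_counts(text_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def binary_jaccard(text_a, text_b):
    """Shared distinct words divided by all distinct words (counts are ignored)."""
    a, b = set(word_counts(text_a)), set(word_counts(text_b))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


print(cosine_coefficient("The tall day", "The tall dog"))
print(binary_jaccard("The tall day", "The tall dog"))
```

For the two example texts, the cosine coefficient is 2/3, in agreement with the calculation above, while the binary Jaccard coefficient is 2/4 = 0.5, since two of the four distinct words are shared.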
DetectPlagiarism compares texts through the SimilarityScore command and returns those that exceed a certain similarity threshold, that is, the texts that are so similar that they are likely to have been copied or to contain some portion that has been copied. This is a probabilistic measure and does not necessarily imply that two texts with a high score are indeed copies of each other or that they have been copied from the same source. The default threshold in the DetectPlagiarism command is 0.35 and the default similarity metric is BinaryJaccardCoefficient. Both can be modified: the threshold can take any value between 0 and 1, and a different similarity metric can be selected (Wang, Qi, Kong and Nu, 2013).
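The thresholding logic described here can be illustrated with a short Python sketch. It is an analogue written for illustration only and does not reproduce the Maple command: the function names are hypothetical, the whitespace tokenization is an assumption, and the three sample texts are invented.

```python
from itertools import combinations


def binary_jaccard(text_a, text_b):
    """Shared distinct words divided by all distinct words (assumed tokenization)."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_similar_pairs(texts, threshold=0.35):
    """Return (i, j, score) for every pair of texts whose score exceeds the threshold."""
    flagged = []
    for (i, a), (j, b) in combinations(enumerate(texts), 2):
        score = binary_jaccard(a, b)
        if score > threshold:
            flagged.append((i, j, score))
    return flagged


# Invented sample texts, for illustration only.
sample = ["the tall day", "the tall dog", "completely different words here"]
print(flag_similar_pairs(sample))
```

Only the first two texts are flagged, with a score of 0.5. As noted above, exceeding the threshold indicates likely reuse rather than proof of copying.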
When comparing two texts, the length of both is important. More precisely, the result of the comparison can be altered if a very short text is compared with another very long text. To solve this problem while keeping in mind that for the present investigation the shortest text is always the news brief, a recurrent comparison was made between the news brief and an increasingly large part of the press release or the wire service dispatch. This was carried out through an algorithm whose result is the maximum value of the similarity index and the number of words that maximize this similarity to the press release or wire dispatch.
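The recurrent comparison can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the algorithm actually programmed in Maple: the function names are hypothetical, the prefix is assumed to grow one word at a time, the binary Jaccard coefficient is assumed as the metric, and the example texts are invented.

```python
def binary_jaccard_words(words_a, words_b):
    """Binary Jaccard coefficient between two lists of words."""
    a, b = set(words_a), set(words_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_prefix_similarity(brief, source):
    """Compare the brief with ever longer prefixes of the source text.

    Returns the maximum similarity found and the number of source words
    at which that maximum is first reached.
    """
    brief_words = brief.lower().split()
    source_words = source.lower().split()
    best_score, best_length = 0.0, 0
    for n in range(1, len(source_words) + 1):
        score = binary_jaccard_words(brief_words, source_words[:n])
        if score > best_score:
            best_score, best_length = score, n
    return best_score, best_length


# Invented texts: the brief reproduces the opening of a longer release.
brief = "acme opens new plant"
release = "acme opens new plant in spain next year creating jobs"
print(best_prefix_similarity(brief, release))
```

In this invented case the maximum (1.0) is reached with the first four words of the release. Comparing the brief with the whole release would give a lower score, which is the length effect that the procedure is meant to offset.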
Using this methodology, the statistical study was carried out on a sample of 52 news briefs from the newspapers La Vanguardia and El Mundo: 30 pieces from La Vanguardia and 22 pieces from El Mundo. Henceforth, the abbreviations used are LV (La Vanguardia), EM (El Mundo), PR (press release) and NA (a dispatch or news release from a news agency). A cross indicates the existence of a press release and/or an agency dispatch or news release (Table I).
Table I. The statistical study was carried out on a sample of 52 news briefs from the newspapers La Vanguardia (LV) and El Mundo (EM)
After the data were collected (in the first half of 2014), they were systematically ordered and analysed, work which lasted into the second half of 2014. To expand the scope of the research (Wolf, 1987; Wimmer and Dominick, 2013), a series of in-depth semistructured interviews were conducted. A first round took place throughout 2015 and the first quarter of 2016, and a second round in the first four months of 2018. The interpretation of the participants in this second phase was fundamental to understanding all the nuances of the numerical results obtained (Busquet, Medina and Sort, 2006), for which we relied on the following profile of interviewees: workers in the media being studied, both editors and department or section managers; employees and managers in press offices and communications agencies; and journalism teachers and researchers with experience in this professional field. In total, twenty people were interviewed, equally divided between women and men with varying degrees of experience: at one end of the spectrum there was a corporate communications assistant with less than ten years of experience, and at the other a newspaper executive with several decades of experience.

Results
We separately compared news briefs with press releases; news briefs with agency dispatches; and press releases with either dispatches or news releases from agencies. Of the 30 news briefs analysed from La Vanguardia, 19 resemble the press releases of the multinationals. If an intermediary exists between the news brief and the company press release (i.e., a news agency), the similarity is also remarkable: in 14 of these 30 news briefs of La Vanguardia that rely on an agency dispatch, the agency dispatch coincides with the news brief in 10 cases. What is more, the dispatch coincides fully with the original press release. When the agency dispatch and the corresponding press release are available (14 cases), the news brief and the press release of the multinational are similar on 9 occasions.
As for the 22 news briefs analysed from El Mundo, 12 have aspects that coincide with the wording proposed by the multinational. In this newspaper, the intermediary (i.e., the news agency) intervenes most (this is the case in 21 of the 22 cases). Thus, the news brief maintains clear similarities to the agency piece. Of the 19 cases in which both the agency dispatch and the multinational's original wording are available, homogeneity between them occurs 13 times.

La Vanguardia
Using the similarity algorithm, we have compared the 30 news briefs of La Vanguardia with their corresponding press releases. The results appear in detail in Table II and are summarized graphically in Image 1. In Images 1-8, the solid line represents the threshold of similarity, above which it can be said that similarity does exist; the dashed line marks the arithmetic mean of the similarity indexes and the intensity of grey in each of the columns is proportional to the similarity coefficient. In the same way, Tables II-VII include the length of the text necessary for achieving maximum similarity and the final test result. When the Jaccard binary index is higher than 0.35, there is statistical evidence of great similarity. This is reflected in the last column of Table II.
Another way of looking at this is the following. Of the 30 pairs of texts that were compared, 19 obtained a similarity value higher than 0.35, which represents 63.33% of the texts. What is more, the average value of the similarity index is 0.41, which is higher than 0.35.
Image 1. Jaccard binary index when comparing news briefs with press releases. Source: Own elaboration
With the similarity algorithm, we have also compared the La Vanguardia news briefs and the corresponding agency dispatches. The results, which are summarized in Image 2, appear in detail in Table III. It can be seen that of the 14 pairs of texts compared, a similarity value greater than 0.35 is obtained in 10, which represents 71.43% of the texts. In addition, the average value of the similarity index stands at 0.46, which is higher than 0.35.
In those cases in which both the press release and the agency dispatch are available, a comparison has also been established, which is summarized graphically in Image 3. For 14 of the 30 La Vanguardia news briefs, the agency dispatch has been obtained as well as the original source, i.e., the company's statement. Image 3 shows that in all but one case the similarity index between the news brief and agency dispatch is higher than the index between the news brief and the press release. The average value (0.33) of the similarity indices between the news brief and the press release, when the news agency dispatch is also available, is below the 0.35 threshold. In contrast, the average of the similarity indices between news brief and agency dispatch is, in this case, 0.46.
For La Vanguardia, when both the press release and the agency news release are available, the byline given to the published news brief is, in most cases, a news agency (Europa Press, EFE or even "Agencies"). The similarity algorithm was used to compare the agency news releases that are the source of the news briefs with the original press releases. The results are summarized in Image 4 and are broken down in Table IV. It can be seen that, out of the 14 pairs of texts compared, a similarity value that is clearly higher than 0.35 is reached in 9 of them, representing 64.29% of the texts. The average value of the similarity index is 0.41, which is higher than 0.35.
Image 2. Jaccard binary index when comparing news briefs with agency dispatches. Source: Own elaboration
Image 3. Jaccard binary index when comparing news briefs with agency dispatches (black columns) and with press releases (red columns). Source: Own elaboration
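The summary figures reported for each comparison (number of pairs above the threshold, the corresponding percentage and the arithmetic mean) involve only simple arithmetic, which the following Python sketch makes explicit. The function name is hypothetical and the list of indices is invented for illustration; the actual indices are those given in the tables.

```python
def summarise(indices, threshold=0.35):
    """Summarise a list of similarity indices against a threshold."""
    if not indices:
        raise ValueError("at least one similarity index is required")
    above = sum(1 for value in indices if value > threshold)
    return {
        "pairs": len(indices),
        "above": above,
        "share_pct": round(100 * above / len(indices), 2),
        "mean": round(sum(indices) / len(indices), 2),
    }


# Invented indices, for illustration only (not the values in the tables).
print(summarise([0.10, 0.40, 0.50, 0.36]))

# The percentage reported for 19 of 30 pairs follows the same rule.
print(round(100 * 19 / 30, 2))
```

The last line reproduces the 63.33% reported for the comparison between the La Vanguardia news briefs and the press releases.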

El Mundo
In the same way as with La Vanguardia, we have compared the similarity between El Mundo's news briefs and the corresponding press releases. The results are summarized in Image 5 and are detailed in Table V. When Jaccard's binary index is greater than 0.35, statistical evidence of great similarity is indicated, which is reflected in the last column of Table V. In 12 of the 20 pairs of compared texts, a similarity value of more than 0.35 is reached, which represents 60% of the texts. The average value of the similarity index is 0.41, which is higher than 0.35.
Likewise, the similarity algorithm has been used to compare the similarity between El Mundo's news briefs and the corresponding dispatches or agency news releases. The results are presented as above in Image 6 and Table VI. In 20 of the 21 matched pairs, a similarity value greater than 0.35 is obtained, which represents 95.24% of the texts. The average value of the similarity index stands at 0.69, which is also higher than 0.35. The similarity established between the news brief and the agency dispatch is greater than that between the news brief and the press release.
When both the press release and the news agency dispatch were obtained, they were compared and the results are shown in Image 7. The similarity index between the news brief and agency dispatch is greater than the index between the news brief and press release in all cases but one.
Image 6. Jaccard binary index when comparing the news briefs with the agency news releases. Source: Own elaboration
In the case of the newspaper El Mundo, the bylines carried by all but one of the news briefs analysed in this research were indicated as news agencies: EFE and in particular Europa Press.
The similarity algorithm was used to study the relationship between the agency dispatches feeding the El Mundo news briefs and the relevant press releases. The results appear in Image 8 and Table VII. The similarity value is greater than 0.35 in 13 of the 19 pairs of compared texts, which represents 68.42% of the total. The similarity index average stands at 0.47, which is again higher than 0.35.
Image 7. Jaccard binary index when comparing news briefs with agency dispatches (black columns) and with press releases (red columns). Source: Own elaboration
Image 8. Jaccard binary index when comparing the press releases with the agency dispatches. Source: Own elaboration

Conclusions
Next, we will check whether there are significant differences between the two media in terms of similarity indices. Table VIII includes a summary indicating the arithmetic mean and the standard deviation for the similarity indices between the news briefs and press releases (PR) and between the news briefs and agency news releases (AD) in both newspapers. The similarity indices can be viewed separately in Images 9 and 10, which were designed for the press releases and the agency dispatches, respectively.
Image 9. Similarity indices between news brief and press release in La Vanguardia (black dots) and El Mundo (red diamonds). The dashed black and red lines (effectively superimposed over each other) represent the average values in each case
These visual representations show that the similarity indices between the news briefs and the press releases of La Vanguardia and El Mundo are close, although in El Mundo the dispersion of the data is somewhat greater. There is a noticeable difference in the similarity indices between the news briefs and the agency dispatches. The average value is 0.46 in La Vanguardia, while the value increases to 0.69 in El Mundo. A hypothesis test will be carried out to confirm or discard the insight derived from this difference.
Testing the hypothesis allows us to rigorously establish whether, according to scientific criteria, there are significant differences between La Vanguardia and El Mundo in regard to their respective similarity indices between their news briefs and the agency news releases. Our hypothesis testing relied on the so-called Welch-Satterthwaite test, which is summarized in the work of Mújica et al. (2014). The null hypothesis (H0) was that no differences existed in regard to the distribution of both sets, while the alternative hypothesis (H1) established that both sets were distributed differently. To test the hypothesis, a significance level of 10% was established and the free program R was used.
For the abovementioned data set, the p-value of the test is equal to 0.069%, which is clearly less than the 10% level of significance. The difference between the two data sets is therefore statistically significant, and it can be inferred that the similarities between news briefs and agency dispatches are greater in El Mundo than in La Vanguardia.
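The quantities behind a Welch-type test can be sketched in a few lines. The study itself used R; the following Python illustration is an assumption-laden stand-in that uses only the standard library, with a hypothetical function name and invented sample values. It computes the t statistic and the Welch-Satterthwaite degrees of freedom; obtaining the p-value additionally requires the cumulative distribution function of Student's t, which is not included here.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    # Sample variance divided by sample size, for each group.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Invented samples, for illustration only (not the indices of the study).
t_stat, dof = welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(t_stat, dof)
```

Unlike the classical two-sample t test, this statistic does not assume equal variances in the two groups, which suits data sets that differ in dispersion and in size.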
In 20 of the 52 studied news briefs, the byline given is "Editorial"; 35 of the 52 news briefs reproduce agency dispatches; 32 of these news briefs do attribute the source to the agencies (El Mundo surpasses La Vanguardia in this aspect: only one of the 22 studied news briefs carried the byline Editorial Board). Thus, most of the bylines in the news brief are agency (32). In these cases, the news agency has mostly copied the original press release. Twenty-one of these 32 news briefs exhibit similarity with the corresponding press releases, which were presumably prepared by the news agencies (Europa Press, mostly). In the 20 cases that remain out of the 52 news briefs, the bylines of the news briefs correspond to the Editorial board.
In 14 of the 19 news briefs giving the byline to the editors in the La Vanguardia Economy section, there is similarity with the original press release. What is more, 3 of these 14 news briefs are actually from the agency. As far as El Mundo is concerned, only 1 of the studied news briefs gave the byline to the editorial board, although this piece presents striking similarities to the original press release.
The interviewees coincide in their interpretation of these data. Those who work in the sector have become familiar with this modus operandi, and hardly anyone is surprised anymore that the media industry works in this way. The mechanics of this process make it fast and cheap: It requires little manpower, which in addition does not require great training and skill to function in this manner.
As has happened historically (McCombs and Shaw, 1972; Manning, 2001), companies insist that their press offices and communications departments guarantee a certain presence in the media. For this reason, these professionals have learned to present their information in a way that leads journalists to believe the material needs no rewording, as the interviewees recall. They further add that the agencies are in a hurry to be the first to distribute any news, which presumably tempts them to copy and paste the press releases. As the participants in this investigation emphasize, journalists behave similarly when these dispatches reach their newsrooms. Thus, the interviewees conclude that the press release, despite having passed through two newsrooms (the agency and the intermediary), undergoes virtually no alteration.
Image 10. Similarity indices between news brief and agency dispatch in La Vanguardia (black points) and El Mundo (red diamonds). The dashed black and red lines represent the average values in each case