Methods of sociolinguistic lexicometry: analysis of the contemporary oral corpus PRESEEA-Santander
Abstract
Lexicometry is a method for identifying thematic units through the automatic extraction of knowledge patterns from textual data (Romero, Alarcón and García, 2018). When it is applied, the lexical tendencies of a corpus emerge from quantifying how often words occur. Sociolinguistic lexical styles have been studied in a wide variety of the world's languages, including Spanish. However, the studies available to date include too few quantitative analyses of the lexicon of a contemporary oral sociolinguistic corpus. The general objective of this article is to detect vocabulary preferences in spoken Spanish within the framework of sociolinguistic lexicometry. To this end, a representative sample of a corpus stratified by three variables (sex, age, educational level) was analyzed. The sample belongs to the PRESEEA-Santander corpus, which forms part of the Project for the Sociolinguistic Study of Spanish in Spain and America (Moreno Fernández, 2021). The analysis used the LYNEAL system (Letters and Numbers in Linguistic Analysis) (Ueda, 2021) together with the open-source statistical software R. The results indicate that gender is an important variable in lexical variation: among other findings, men prefer a nominal over a verbal style and favor adverbs in -mente. With respect to age, lexical truncation tends to appear in the younger generation and among women. Finally, the use of muchísimo is concentrated among women, young speakers, and those with a primary education level.
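The kind of count that underlies a finding such as the preference for adverbs in -mente can be illustrated with a minimal Python sketch. It is not the LYNEAL or R workflow used in the study: the toy utterances, the function names and the suffix heuristic are all assumptions made for illustration, and the numbers it produces say nothing about PRESEEA-Santander.

```python
import re
from collections import Counter

# Hypothetical toy data: (speaker group, utterance) pairs.
# These are invented sentences, NOT material from PRESEEA-Santander.
SAMPLE = [
    ("M", "Normalmente voy al trabajo y generalmente vuelvo tarde"),
    ("F", "Me gusta muchísimo la peli del finde"),
    ("M", "Realmente no lo sé"),
    ("F", "Voy a la playa normalmente"),
]

# Runs of Unicode letters, so accented Spanish characters stay inside a token.
TOKEN_RE = re.compile(r"[^\W\d_]+")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


def is_mente_adverb(token):
    """Crude suffix heuristic (an assumption): a real study would use
    part-of-speech tags, since words such as 'clemente' also match."""
    return token.endswith("mente") and len(token) > len("mente")


def rate_per_thousand(sample, predicate):
    """Occurrences of matching tokens per 1,000 tokens, for each group."""
    hits, totals = Counter(), Counter()
    for group, utterance in sample:
        tokens = tokenize(utterance)
        totals[group] += len(tokens)
        hits[group] += sum(1 for tok in tokens if predicate(tok))
    return {group: 1000 * hits[group] / totals[group] for group in totals}


rates = rate_per_thousand(SAMPLE, is_mente_adverb)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.1f} per 1,000 tokens")
```

Normalizing by the number of tokens per group, rather than comparing raw counts, keeps groups that talk more from appearing to use a form more often. The same function could be reused with a different predicate, for example one that matches muchísimo.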
License
To support the global exchange of knowledge, the journal Círculo de Lingüística Aplicada a la Comunicación allows unrestricted access to its content from the moment of publication in this electronic edition, and is therefore an open-access journal. The originals published in this journal are the property of the Complutense University of Madrid, and any reproduction of them, in full or in part, must cite the source. All content is distributed under a Creative Commons Attribution 4.0 use and distribution licence (CC BY 4.0). This circumstance must be expressly stated in these terms where necessary.