Lexical Complexity of Editorial Articles in the Spanish Press: A Selection of Four Newspapers
DOI:
https://doi.org/10.5944/rhd.vol.6.2021.30861
Keywords:
Lexical Complexity, Lexical Diversity, Lexical Sophistication, NLP, Spanish Press
Agencies:
Sound and Meaning in Golden Age Literature (FWF Austrian Science Fund, P32563).
Abstract
This case study explores differences in lexical complexity (LC) in the Spanish quality press. The results show that lexical quality varies among papers and that a larger number of readers does not correlate with higher LC. The lexical sophistication (LS) indexes LS1 and CVS1 and the lexical diversity (LD) indexes HD-D, MAAS, and MTLD were calculated for 2741 editorial articles from Abc, El Mundo, El País, and El Periódico published online in 2019. The results revealed significant differences in both LD and LS between the newspapers, with El Mundo producing the most complex texts overall and El Periódico the least complex. Post hoc analyses showed further differences between publications, with El Periódico being the most disparate. Additionally, the comparison of HD-D, MAAS, and MTLD with TTR-based measures suggests that the former are better suited to samples of heterogeneous sizes.
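To make the contrast between TTR-based measures and the length-corrected indexes concrete, the following is a minimal Python sketch of three of the lexical diversity measures named in the abstract, written from their standard published definitions. It is not the pipeline used in the study: the function names, the whitespace tokenization in the usage note, and the default sample size of 42 draws for HD-D are assumptions made for illustration.

```python
import math
from collections import Counter


def ttr(tokens):
    """Type-token ratio: distinct types divided by total tokens."""
    return len(set(tokens)) / len(tokens)


def maas(tokens):
    """Maas index a^2 = (log N - log V) / (log N)^2.

    Lower values indicate higher lexical diversity.
    """
    n = len(tokens)
    v = len(set(tokens))
    return (math.log(n) - math.log(v)) / math.log(n) ** 2


def hdd(tokens, draws=42):
    """HD-D: for each type, the hypergeometric probability that it
    occurs at least once in a random sample of `draws` tokens taken
    without replacement, scaled by 1/draws and summed over all types.

    The default of 42 draws is an assumption taken from common practice.
    """
    n = len(tokens)
    if n < draws:
        raise ValueError("text is shorter than the sample size")
    total = 0.0
    for freq in Counter(tokens).values():
        # Probability that this type is absent from the sample.
        p_absent = math.comb(n - freq, draws) / math.comb(n, draws)
        total += (1.0 - p_absent) / draws
    return total
```

For example, `ttr("a b a c".split())` gives 0.75. Because TTR tends to fall as a text grows longer, it is hard to compare across editorials of different lengths, whereas HD-D always evaluates a fixed-size hypothetical sample and the Maas index rescales by the logarithm of text length.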
References
Asociación para la Investigación de Medios de Comunicación. (2019). Ranking de diarios. Estudio General de Medios 2019 3a Ola. http://reporting.aimc.es/index.html#/main/diarios
Bathke, A. C., Friedrich, S., Pauly, M., Konietschke, F., Staffen, W., Strobl, N., and Höller, Y. (2018). Testing Mean Differences among Groups: Multivariate and Repeated Measures Analysis with Minimal Assumptions. Multivariate Behavioral Research, 53(3), 348–359. https://doi.org/10.1080/00273171.2018.1446320
David, A., Myles, F., Rogers, V., and Rule, S. (2009). Lexical development in instructed L2 learners of French: Is there a relationship with morphosyntactic development? In B. J. Richards, D. D. Malvern, M. H. Daller, P. Meara, J. Milton, and J. Treffers-Daller (Eds.), Vocabulary Studies in First and Second Language Acquisition: The Interface Between Theory and Application (pp. 147–163). Palgrave.
El Mundo. (2019, August 15). Por un turismo de mayor calidad. El Mundo. https://www.elmundo.es/opinion/2019/08/16/5d5590dffdddffa4548b45cf.html
El País. (2019, August 13). Derrota en dos tiempos. El País. https://elpais.com/elpais/2019/08/12/opinion/1565629594_068797.html
Friedrich, S., Konietschke, F., and Pauly, M. (2017). A wild bootstrap approach for nonparametric repeated measurements. Computational Statistics & Data Analysis, 113, 38–52. https://doi.org/10.1016/j.csda.2016.06.016
Friedrich, S., Konietschke, F., and Pauly, M. (2019). MANOVA.RM (Version 3.4.0) [Software]. https://cran.r-project.org/web/packages/MANOVA.RM/index.html
Friedrich, S., and Pauly, M. (2017). MATS: Inference for potentially Singular and Heteroscedastic MANOVA. Journal of Multivariate Analysis, 165, 166–179. https://doi.org/10.1016/j.jmva.2017.12.008
Imbert, G., and Vidal-Beneyto, J. (1986). El País o la referencia dominante. Mitre.
Konietschke, F., Bathke, A. C., Harrar, S. W., and Pauly, M. (2015). Parametric and nonparametric bootstrap methods for general MANOVA. Journal of Multivariate Analysis, 140, 291–301. https://doi.org/10.1016/j.jmva.2015.05.001
Laufer, B., and Nation, P. (1995). Size and Use: Lexical Richness in L2 Written Production. Applied Linguistics, 16(3), 307–322. https://doi.org/10.1093/applin/16.3.307
Linnarud, M. (1986). Lexis in composition: A performance analysis of Swedish learners’ written English. CWK Gleerup.
Lu, X. (2012). The relationship of lexical richness to the quality of ESL learners’ oral narratives. The Modern Language Journal, 96(2), 190–208. https://doi.org/10.1111/j.1540-4781.2011.01232_1.x
Lu, X., Gamson, D. A., and Eckert, S. A. (2014). Lexical difficulty and diversity of American elementary school reading textbooks. International Journal of Corpus Linguistics, 19(1), 94–117. https://doi.org/10.1075/ijcl.19.1.04lu
Maas, H.-D. (1972). Zusammenhang zwischen Wortschatzumfang und Länge eines Textes. Zeitschrift für Literaturwissenschaft und Linguistik, 8, 73–79.
MacWhinney, B. (2019). Tools for Analyzing Talk, Part 2: The CLAN Program. Carnegie Mellon University. https://talkbank.org/manuals/CLAN.pdf
Mair, P., and Wilcox, R. R. (2019). Robust Statistical Methods in R Using the WRS2 Package. Behavior Research Methods. https://doi.org/10.3758/s13428-019-01246-w
Malvern, D. D., Richards, B. J., Chipere, N., and Durán, P. (2004). Lexical Diversity and Language Development: Quantification and Assessment. Springer.
Manning, C. D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S. J., and McClosky, D. (2014). The Stanford CoreNLP Natural Language Processing Toolkit. Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 55–60. https://doi.org/10.3115/v1/P14-5010
Martínez Alonso, H., and Zeman, D. (2017). UD Spanish AnCora (Version 2.0) [Software]. https://universaldependencies.org/treebanks/es_ancora/index.html
McCarthy, P. M. (2005). An Assessment of the Range and Usefulness of Lexical Diversity Measures and the Potential of the Measure of Textual, Lexical Diversity (MTLD) [Doctoral dissertation, University of Memphis]. https://search.proquest.com/openview/860b2901fa90c6e68e46cd9111bd2d1c/1?pq-origsite=gscholar&cbl=18750&diss=y
McCarthy, P. M., and Jarvis, S. (2007). vocd: A theoretical and empirical evaluation. Language Testing, 24(4), 459–488. https://doi.org/10.1177/0265532207080767
McCarthy, P. M., and Jarvis, S. (2010). MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods, 42(2), 381–392. https://doi.org/10.3758/BRM.42.2.381
Qi, P., Dozat, T., Zhang, Y., and Manning, C. D. (2018). Universal Dependency Parsing from Scratch. Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 160–170. https://doi.org/10.18653/v1/K18-2016
R Core Team. (2019). R: A Language and Environment for Statistical Computing (Version 3.6.2) [Software]. R Foundation for Statistical Computing. https://www.R-project.org/
Real Academia Española. (2018). Banco de datos (CORPES XXI). Corpus del Español del Siglo XXI (CORPES). https://www.rae.es/recursos/banco-de-datos/corpes-xxi
Shen Yan Shun, L. (2018). lexicalrichness: A small module to compute textual lexical richness (Version 0.1.3) [Software]. https://github.com/LSYS/lexicalrichness
SIGNLL. (2018). CoNLL 2018 Shared Task. SIGNLL: ACL’s Special Interest Group on Natural Language Learning. https://universaldependencies.org/conll18/
Stanford NLP Group. (2018). System Performance. StanfordNLP. https://stanfordnlp.github.io/stanfordnlp/performance.html
Templin, M. C. (1957). Certain language skills in children. University of Minnesota Press.
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer. http://ggplot2.org
Wolfe-Quintero, K., Inagaki, S., and Kim, H.-Y. (1998). Second language development in writing: Measures of fluency, accuracy, and complexity. University of Hawai’i Press.
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.