The PaGeS Corpus, a Parallel Corpus of the Contemporary German and Spanish Language
Abstract
The PaGeS corpus is a bilingual parallel corpus comprising a collection of contemporary Spanish and German texts. This paper describes the steps involved in constructing the corpus: the manual preparation of the texts to make the documents suitable for further processing, the linguistic annotation, and the manual and automatic procedures used to align the texts at sentence level. It also covers access to and visualization of the data and explains the available search options. Finally, it outlines the next steps for future work.
License
To support the global exchange of knowledge, the journal Revista de Filología Alemana provides unrestricted access to its content from the moment of publication in this electronic edition, and is therefore an open-access journal. The originals published in this journal are the property of the Complutense University of Madrid, and any reproduction thereof, in full or in part, must cite the source. All content is distributed under a Creative Commons Attribution 4.0 use and distribution licence (CC BY 4.0). This circumstance must be expressly stated in these terms where necessary. You can view the summary and the complete legal text of the licence.