Automatic closed captions and subtitles in academic video presentations: possibilities and shortcomings
Abstract
In light of the increasing number of academic events being recorded or held online since the onset of the COVID-19 pandemic, the present work brings together automation processes in audiovisual translation and academic texts, more specifically video presentations. The research questions are whether automatically generated captions are adequate to ensure accessibility in academic events, and how much post-editing effort such content would require if the subtitles are to be machine translated. The research method comprises several phases. First, captions were generated automatically with YouTube Studio for a corpus of video presentations of specialised content in English, in order to ascertain their general quality and the types of errors they contain according to the Multidimensional Quality Metrics (MQM) framework. These auto-generated captions were corrected and annotated by considering the following parameters: a) pre-editing time, b) type of error according to the MQM framework, and c) severity of the error. Second, both the auto-generated and the corrected captions were machine translated into Spanish. The errors detected in the machine-translated subtitles (English-Spanish) were then post-edited and analysed following the MQM framework. Reception by a potential audience was also studied, as evaluated by academics from the same field of expertise. The main conclusion is that most errors in machine-translated subtitles stem from incorrect caption segmentation and lack of context awareness, making it essential to correct the closed captions before translation. This conclusion is supported by the reception study, in which the level of comprehension was higher when the transcription had been pre-edited, as most of the problems arise from the closed captions rather than from the translation itself.
License
In order to support the global exchange of knowledge, the journal Complutense Journal of English Studies is allowing unrestricted access to its content as from its publication in this electronic edition, and as such it is an open-access journal. The originals published in this journal are the property of the Complutense University of Madrid and any reproduction thereof in full or in part must cite the source. All content is distributed under a Creative Commons Attribution 4.0 use and distribution licence (CC BY 4.0). This circumstance must be expressly stated in these terms where necessary. You can view the summary and the complete legal text of the licence.