The Dublin Computational Linguistic Research Seminar Series

The next DCLRS talk will be given by Stephen Doherty from the CTTS, on the effects of controlled language on the readability and comprehensibility of machine translation output (title and abstract below). The talk takes place at 4pm in L2.21, School of Computing, DCU. All are welcome to attend.

Investigating the Effects of Controlled Language on the Reading and Comprehension of Machine Translated Texts: A Mixed-Methods Approach using Eye Tracking

This study investigates whether the use of controlled language (CL) improves the readability and comprehensibility of technical support documentation produced by a statistical machine translation system. Readability is operationalised here as the extent to which a text can be easily read in terms of formal linguistic elements, while comprehensibility is defined as how easily a text’s content can be understood by the reader.

A biphasic mixed-methods triangulation approach is taken, in which a number of quantitative and qualitative evaluation methods are combined. These include eye tracking, automatic evaluation metrics (AEMs), retrospective interviews, human evaluations, memory recall testing, and readability indices. A further aim of the research is to investigate what, if any, correlations exist between the various metrics used, and to explore the cognitive framework of the evaluation process.

The research finds that the use of CL input results in significantly higher scores for items recalled by participants, and for several of the eye-tracking metrics: fixation count, fixation length, and regressions. However, the findings show slight, statistically non-significant increases for readability indices and human evaluations, and slight, statistically non-significant decreases for AEMs. Several significant correlations between the above metrics are identified, as well as predictors of readability and comprehensibility.