Two CTTS/CNGL Successes for AMTA ’12

Two CTTS/CNGL papers have been accepted to the Tenth Biennial Conference of the Association for Machine Translation in the Americas, which will be held in San Diego, California, from Sunday, October 28 through Thursday, November 1, 2012.


A User-Based Usability Assessment of Raw Machine Translated Technical Instructions

It is generally agreed that, in commercial contexts, machine translation output needs to be post-edited in order to be acceptable to, and usable by, end-users. There are relatively few studies of the usability of raw (non-post-edited) machine translated documentation by real end-users. This paper reports on a project that aims to investigate the usability of raw machine translated technical support documentation for a commercial online service. A freely available, non-domain-specific machine translation system was used to translate the documentation. Following the ISO definition, usability is understood as “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use”. In keeping with this definition, the measures of usability in this study are:

  • Goal completion
  • Satisfaction
  • Effectiveness
  • Efficiency

Comparisons are drawn for all measures between the original user documentation, written in English for a well-known online file storage service, and raw machine translated output in four target languages: Spanish, French, German and Japanese. The documentation contained instructions for the completion of six main tasks, each of which had several sub-tasks. Usability measurements for the native speakers of English (Group 1, n=15) are compared against measurements for the native speakers of the four target languages (Group 2, n=14). The null hypothesis is that there will be no differences in usability measurements between Group 1 and Group 2. For goal completion, the results showed no significant difference between groups, although Group 1 had slightly higher scores. Differences were observed at sub-task level, which account for Group 1’s higher score.

A significant difference was found between the two groups in their ratings of comprehensibility and satisfaction, and in whether they would recommend the software to a friend or colleague. However, no significant difference was found between groups when they were asked to rate their own success at task completion or, interestingly, when they were asked whether the instructions could be improved upon. Similarly, no significant difference was found between the groups when they were asked whether they would be able to use the software again in the future without instructions. For effectiveness, the results showed no significant differences between the two groups. Finally, for efficiency, the results showed a significant difference between groups, with Group 1 found to be more efficient.

We can conclude from these measurements that, while the machine translated versions of the source text were functionally acceptable to users, preference for the source text was clearly expressed in terms of user satisfaction and comprehensibility, and demonstrated in the higher rate of efficiency for these users. This study also recorded usability measurements in the form of eye tracking data on fixation counts and average fixation duration, and these will be reported in the near future. In addition, differences between languages will be examined in order to observe intra-group trends.
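
The abstract does not say which statistical test was used for the comparisons between Group 1 and Group 2. As an illustration only, the sketch below shows one way such a two-group comparison could be run: a two-sided permutation test on the difference in group means. The function name, the choice of test, and all of the numbers are assumptions made for this example; the values are invented and are not the study's data.

```python
import random
from statistics import mean


def permutation_test(group1, group2, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns an estimated p-value for the null hypothesis that both
    groups come from the same distribution.
    """
    rng = random.Random(seed)
    observed = abs(mean(group1) - mean(group2))
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    extreme = 0
    for _ in range(n_resamples):
        # Randomly reassign the pooled measurements to two groups
        # of the original sizes.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n1]) - mean(pooled[n1:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # HYPOTHETICAL task completion times in seconds, invented for this
    # sketch. Group sizes mirror the abstract (n=15 and n=14) only.
    group1_times = [310, 295, 340, 320, 305, 288, 330, 315,
                    300, 325, 298, 312, 335, 290, 318]
    group2_times = [360, 345, 390, 372, 355, 338, 380,
                    365, 350, 375, 348, 362, 385, 340]
    p_value = permutation_test(group1_times, group2_times)
    print(f"estimated p-value: {p_value:.4f}")
```

A permutation test is used here because it makes few distributional assumptions, which suits small groups; a rank-based or parametric test could equally be substituted.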


Taking Statistical Machine Translation to the Student Translator

Statistical Machine Translation (SMT) is based on an intuitively simple strategy: rather than work out how to translate from one language to another, try to learn from what human translators have already done. But despite the simplicity of the idea and the fact that SMT actually uses human translations as data, SMT quickly becomes difficult for translators to understand, given the complexity of the statistical models it uses in training, and the nature of the algorithms it uses to generate the most likely translation at runtime. This is disempowering for human translators, as they are not generally in a position to contribute to the development of such systems, or to their introduction in translation workflows. What’s more, they can find themselves confined to reactive ‘after-the-event’ roles in SMT (e.g. post-editing), and excluded from the proactive, holistic roles that many translators commonly adopt in their professional lives. Such scenarios are uncomfortable for those who educate translators: while we want our students to be well versed in the use of contemporary technologies, we quite clearly do not want them to be forced into constricted, disempowering roles.

At the same time, we are convinced that some translators stand to gain considerably from the use of SMT, while developers of SMT systems also stand to benefit from a greater uptake of the technology by translators, and their insights and experiences as end users. Like many sociologists of technology, we take the view that markets for technologies are actively constructed, and we acknowledge the role that both the vendors of technologies and educators play in such market construction. What is currently missing, however, is a syllabus that educators can use to teach translation students about SMT, in a way that empowers rather than instrumentalizes them in SMT workflows. Such a syllabus would include both theoretical components tailored to meet the needs of students who are not majoring in computer science, and practical components in which students learn: how to train an SMT system using trusted data; how to improve system performance; how to evaluate SMT output; etc.

In this paper we present first results from a combined teaching and research project which aims to produce such a syllabus. In the project, conducted in the first half of 2012, thirty-eight students taking Master’s-level translation programmes at Dublin City University (DCU) used the self-service SMT package SmartMATE, developed and hosted by Applied Language Solutions, to create and optimize their own SMT systems. SmartMATE was considered ideal for use in this experiment, as it did not require students to have the kind of programming knowledge required to install comparable ‘do-it-yourself’ systems, but it did allow them considerable freedom to build and customize their own SMT systems. Close cooperation between DCU and Applied Language Solutions also meant that students could be integrated into a feedback loop, receiving useful explanations from source if things did not run as expected, and passing on their own observations to the developers of the system.

A mixed-methods approach was taken to the research, to access rich qualitative data about the subjective experiences of the student translators, and to measure student learning using standard quantitative instruments. Data were collected using participant questionnaires (containing items accessing experience, perception of MT, and self-efficacy in the use of SMT), translator and lecturer logs, end-of-module assignments, and focus groups. Initial results show how students’ perceptions of SMT and of their own ability to use the technology changed over the course of the project, sometimes in unexpected ways.