NCLT/CNGL Seminar Series: Thursday, Aug 23rd at 3pm: Mikel L. Forcada

All are welcome to attend. The talk, entitled "Edit hints: using machine translation to suggest the target-side words to change in computer-aided translation proposals", will take place on Thursday, August 23rd, at 3pm, in L2.21, School of Computing, Dublin City University.


I will show how machine translation (MT) may be used to help users of computer-aided translation systems based on translation memory to identify which target words in the translation proposals need to be changed and which should be kept unedited. The machine translation system is used as a black box to obtain a set of features for each target word in the translation proposals; these features are then used by a binary classifier to decide which target words to change and which to keep unedited (no MT output is presented to the translator). Experiments on the translation of Spanish texts into English, conducted with different corpora and a machine translation system still in development, show an accuracy above 96% for fuzzy-match scores above 70%. Results show that the parameters of the binary classifier are basically domain-independent. A comparison of this technique with a previously reported technique based on statistical word alignment shows that the accuracy of both approaches is quite similar when translating in-domain texts, whereas for out-of-domain texts the new MT-based approach achieves higher accuracy. The generalization of these techniques to obtain word-position alignments, which is currently being explored, is briefly described.
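
For attendees less familiar with translation memories, the fuzzy-match score mentioned above can be illustrated with a minimal sketch. The abstract does not say which definition was used in the experiments; the code below assumes one common choice, namely one minus the word-level edit distance between the segment to translate and the source side of the translation-memory proposal, divided by the length of the longer of the two. The function names and the example sentences are illustrative only and are not taken from the talk.

```python
def word_edit_distance(a, b):
    """Levenshtein distance between two lists of tokens."""
    # prev[j] holds the distance between the tokens of `a` seen so far
    # and the first j tokens of `b`.
    prev = list(range(len(b) + 1))
    for i, token_a in enumerate(a, start=1):
        curr = [i]
        for j, token_b in enumerate(b, start=1):
            cost = 0 if token_a == token_b else 1
            curr.append(min(prev[j] + 1,          # delete token_a
                            curr[j - 1] + 1,      # insert token_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def fuzzy_match_score(segment, tm_source):
    """Assumed definition: 1 - edit distance / length of the longer segment."""
    s, t = segment.split(), tm_source.split()
    if not s and not t:
        return 1.0  # two empty segments are treated as identical
    return 1.0 - word_edit_distance(s, t) / max(len(s), len(t))


if __name__ == "__main__":
    score = fuzzy_match_score("the cat sat on the mat",
                              "the cat sat on the rug")
    print(f"{score:.0%}")
```

Under this assumed definition, a proposal whose source side differs from the new segment in one word out of six scores about 83%, so it would fall within the "above 70%" range for which the accuracy figure is reported.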