Size | Format
---|---
2.05 MB | Adobe PDF
Authors
Advisor(s)
Abstract(s)
The increasing role of post-editing (PE) among translators, both as a way of improving Machine Translation (MT) output and as a faster alternative to translating from scratch, has lately attracted researchers’ attention. A number of recent studies have proposed ways to facilitate this task, especially for the output of Statistical Machine Translation (SMT). However, little attention has been given to Rule-based Machine Translation (RBMT).
In this dissertation, an effort was made to support the PE task through Error Detection (ED). A deep linguistic error analysis was carried out on a sample of English sentences from two text domains, translated from Portuguese by two RBMT systems. The hypothesis is that automatically identifying and highlighting errors in translations can make the PE task faster, more efficient and less tedious.
As RBMT systems tend to make repetitive, systematic mistakes, translators are forced to post-edit the same mistakes repeatedly, which makes their task monotonous and frustrating. To address this problem, a set of 40 contrastive rules tackling various linguistic phenomena was designed on the basis of the translation errors identified in the error analysis. With this linguistic approach, the project aimed to demonstrate that a rule-based system built on these rules can detect and highlight translation errors in RBMT output. The rules were verified by an experimental error analysis on a new data set, which showed a coverage of 98.21%. The implementation results demonstrated successful performance of the system. In addition, a psycholinguistic experiment with human translators confirmed that highlighting errors is useful: it can help translators perform the post-editing task up to 12 seconds per error faster and improve their efficiency by minimizing the number of missed errors.
Description
Keywords
Error classification; Error detection; Error analysis; Rule-based machine translation; Post-editing