Chao_translation_studies
I started the two-week countdown for my academic activities in applied linguistics, particularly translation studies. A few days ago, I uploaded my last two first-author papers to arXiv. Both deal with information-theoretic approaches to translationese. One introduces the surprisal-index corpus EPIC-EuroParl-UdS.
The other paper reports empirical results showing that information-theoretic indicators of source difficulty and cross-lingual transfer difficulty can explain part of the variation in translationese. The explanatory power of the model reaches R² = 0.21.
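For readers less familiar with the two quantities mentioned here, the following is a minimal sketch, not the code used in the papers. It assumes surprisal is measured as -log2 of a token's model probability and averaged over a segment, and that R² is the usual coefficient of determination. The function names `mean_surprisal_bits` and `r_squared` are hypothetical and chosen only for illustration.

```python
import math


def mean_surprisal_bits(token_probs):
    """Average surprisal of a segment in bits: mean of -log2(p) over its tokens.

    token_probs is assumed to hold the model probability of each token.
    """
    return sum(-math.log2(p) for p in token_probs) / len(token_probs)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Read this way, R² = 0.21 means that the predictors account for roughly a fifth of the variance in the translationese measure, leaving the rest unexplained.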
Three other results stand out:
- Accuracy–fluency trade-off. The hypothesis (operationalised as a negative correlation between MT surprisal and target GPT-2 surprisal) holds up to about 11 bits of MT surprisal per word in a segment. Beyond that point, the correlation turns positive.
- Transfer vs. source difficulty. Transfer difficulty is generally more predictive of translationese than source difficulty. The exception is German → English, where understanding the source (especially in simultaneous interpreting) appears to be about as important as transfer difficulty.
- Spoken vs. written asymmetry. There is a striking difference between modes: in spoken translation, the more difficult the task, the less translationese appears in the output.
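The sign change in the accuracy–fluency trade-off can be checked by computing the correlation separately on either side of a threshold. The sketch below is one possible way to do that, under stated assumptions: it uses Pearson's r (the post does not say which correlation measure the paper uses), the helper names `pearson` and `split_correlation` are hypothetical, and the numbers in the usage example are invented toy values, not data from the papers. Only the 11-bit threshold comes from the result above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def split_correlation(mt_surprisal, target_surprisal, threshold=11.0):
    """Correlate the two measures separately at or below and above the threshold.

    Returns (r_below, r_above). The threshold is in bits of MT surprisal
    per word; each side needs at least two points with non-zero variance.
    """
    pairs = list(zip(mt_surprisal, target_surprisal))
    below = [(x, y) for x, y in pairs if x <= threshold]
    above = [(x, y) for x, y in pairs if x > threshold]
    r_below = pearson([x for x, _ in below], [y for _, y in below])
    r_above = pearson([x for x, _ in above], [y for _, y in above])
    return r_below, r_above


# Invented toy values that mimic the reported pattern (NOT real data).
toy_mt = [2, 4, 6, 8, 10, 12, 14, 16, 18]
toy_target = [9, 8, 7, 6, 5, 5, 6, 7, 8]
r_below, r_above = split_correlation(toy_mt, toy_target)
```

On these toy values `r_below` is negative and `r_above` is positive, which is the shape of the pattern described in the first bullet.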