Maria Kunilovskaya
linguistics, contrastive and computational

I am currently a postdoc at the University of Saarland (Germany), working on modelling mediated language to explore the memory-surprisal trade-off hypothesis from information theory. My PhD (completed March 2023, supervisor: Prof. Mitkov, UK) was on human translation quality estimation. Much of my effort went into building learner parallel and comparable corpora.
Before that I held an Associate Professor position at a Translation Studies department, lecturing in Translation Studies, Theoretical Linguistics and Corpus Linguistics. I have a PhD (Candidate of Science) in Contrastive Linguistics (completed 2004, adviser: Prof. Brodovich, Saint Petersburg University).
My research interests have shifted from corpus- and feature-based approaches to machine learning, language modelling and representation learning. In the past few years, I have been involved in several computational humanities projects, focused especially on propaganda and social media analysis.
Keywords:
- language modelling, information theory
- Python, machine learning, distributional semantics
- computational humanities, data collection and analysis
- translation quality estimation, data annotation
- language varieties, register studies, text complexity
Download curriculum vitae, publications (2017-2025)
recent news
Jun 11, 2025 | – I am happy to have established a promising new collaboration on spoken data/interpreting analysis, which has now yielded a paper accepted to Disfluency in Spontaneous Speech (DISS-2025, Lisbon, 4-5 September 2025), a satellite event of Interspeech 2025. |
---|---|
May 13, 2025 | – This semester, in addition to my own seminar on translation quality, I also co-teach P4: Abschlusskolloquium for BA Language Science with Annemarie Verkerk. Today I talked about structuring Related, reference managers and note-taking tools. This is a truly rewarding experience, I must say. Not only do you get to understand things in more depth and realise that you actually have a lot to share, but it also feels like the students were excited about these instruments. I have been offered the chance to teach this course alone next semester. |
May 9, 2025 | – I had a throwback to conferences that accept abstracts and issue certificates of attendance. But I am proud to have implemented a new approach to modelling for that talk: I used a GLM with a Negative Binomial family for counts of disfluencies in our data, comparing the explanatory and predictive power of corpus measures of complexity vs surprisal from off-the-shelf and domain-adapted GPT2 and MarianMT models. It feels like an achievement. |
Apr 7, 2025 | Back to regular teaching! This semester (SoSe-2025), I volunteered to offer a research seminar, Quality in Human and Machine Translation (QH&MT), at the Language Science and Technology Department, University of Saarland. The seminar looks into the properties of MT, especially with regard to how it compares to human translation. It is designed to bring together linguistic expertise on quality and the technological aspects of measuring it. We will (i) look into the theoretical prerequisites of translation quality, (ii) compare approaches applied to humans and machines, and (iii) review best practices in manual as well as automatic quality annotation. The proposed research topics include linguistic studies based on comparative-contrastive analysis, developing TQ test sets, investigating existing metrics and designing new methods, and tweaking MT and MT quality models to capture specific errors or address specified aspects of production. I invite computationally-minded linguists and NLP students who are curious whether today’s technology is a real competitor to human translators, and what nuances there are to this comparison. We start next Monday, 14 April 2025, at 16.15 (Gebäude C7 2 - Seminarraum -1.05). |
Feb 15, 2025 | – I have three (sic!) posters as the 1st author at an SFB1102-organised RAILS conference. Overachiever, ahem. Slavic intercomprehension, translation task difficulty, cognitive load factors in interpreting |
Feb 4, 2025 | (1) Had a throwback to the best part of my past life, when I gave a 90 min lecture as part of BA Vorlesung Perspektiven der Linguistik. Oh my, I miss that! (handout) (2) On the same day, 15 min after the lecture, I had to take the spoken part of the exam at German B2 level. That went surprisingly well. |
Jan 24, 2025 | – I am proud to be named an outstanding reviewer by COLING-2025 organizers: see a picture |
Dec 9, 2024 | – I am going north-east: (1) A paper produced in collaboration with C4 has been accepted for NoDaLiDa 2025, to be presented in early March in Tallinn (Estonia). Title: Predictability of Microsyntactic Units across Slavic Languages: A Translation-based Study. (2) Next week (December 17, 2024), I am giving a talk at the LTG research seminar (The Faculty of Mathematics and Natural Sciences, University of Oslo). It will summarise B7’s progress in applying information theory to the study of translated language. |
Jun 24, 2024 | – Koel Dutta Chowdhury, my co-author, presented our work on GPT-4 prompting for the translationese reduction task at EAMT in Sheffield. See (paper, slides). |
Jun 7, 2024 | – hosted the Multilingual Modelling Workshop (MM-WS), an all-SFB event that attracted researchers interested in modelling multilingual/cross-lingual data (programme). A brief summary is here. |
May 16, 2024 | – delivered a teaching session+lab for MA Translation Science and Technology students within the Hauptseminar “Empirical Linguistics and Translatology”. The lecture introduced the students to Corpus-based Translation Studies and had a focus on “Human Translation Quality Estimation (HTQE)” (slides). The lab was a walk-through on parallel corpus building, including practical views on manual and automatic annotation as well as the link between corpus structure and the research objectives. |
Apr 12, 2024 | – gave an invited talk “Application of Information Theory in Translation(ese) Studies” (slides) for participants of the Information Theory Course. |
Feb 23, 2024 | – talked about linguistic neighbours of Luxembourgish in the looking-glass world of NMT at the 1st Roundtable on NLP for Luxembourg(ish), organised by the Institute of Luxembourgish Language and Literature and the Culture & Computation Lab at the University of Luxembourg (slides). |
Feb 6, 2024 | – together with Marie Escribe delivered a 2-day training on conference setup and management via START for over 20 people |
Dec 15, 2023 | – submitted a short paper to NAACL-2024: “Prompting Large Language Models to Mitigate Translationese” |
Dec 1, 2023 | – delivered an invited talk “Can Translations Be Less Translated? Leveraging GPT Prompts to Mitigate Translationese” as part of the Conversations Series of the Culture & Computation Lab at the University of Luxembourg |
Sep 4, 2023 | – presented two papers at RANLP and discussed a piece of research that did not feel like a paper. I was also heavily involved in the RANLP OC. |
Jun 30, 2023 | – A paper by the SFB B7 team, “Simultaneous Interpreting as a Noisy Channel: How Much Information Gets Through”, has been accepted as a long paper to RANLP 2023. The paper is among the top-scoring 22% of submissions based on the scores from three double-blind peer reviews. |
May 17, 2023 | – joined the Journal of Natural Language Engineering as an Editorial Board Member. |
May 5, 2023 | – released WarMM-2023 and presented the results of the Russian media-at-wartime monitoring project at an EACL workshop |
Mar 13, 2023 | – passed the viva voce examination and, in the subsequent month, submitted the final version of the thesis. It is available here. |
Dec 5, 2022 | – started a postdoc position at Saarland University |
Jul 1, 2022 | – started collecting data for a computational sociology/politology project that aims to compare publications in Russian mass media and social networks to capture the interplay between propaganda and vox populi |
May 14, 2022 | – from 22 to 27 May I am attending ACL remotely due to a sad misunderstanding about the Irish visa. – delivered a 3-day workshop on practical skills supporting research to EMTTI students in Malaga and attended a very special, entertaining and well-organised International Workshop on Interpreting Technologies. |
Apr 16, 2022 | – two of my Master's students have been accepted to New Trends in Translation and Technology (NeTTT). Looking forward to this grand rehearsal of vivas. |
Apr 5, 2022 | – summarised my research in human translation quality estimation in my annual 3-hour session for EMTTI and computational linguistics students. See slides. |
Mar 25, 2022 | – finished teaching a short training course on LaTeX, referencing and Git/GitHub for EMTTI students (see Digital Skills for Research). |
Dec 15, 2021 | – talked at the AIST research conference: see a FB post about it. |
Jul 19, 2021 | – completed PGCert “Academic Practice in Higher Education”, including modules on inclusivity, future of higher education and educational theories. |
Jan 1, 2020 | – started working on a Digital Humanities project that attempts to pick up global and national cultural trends based on the analysis of cultural event announcements. |
selected publications
- Interspeech-2025. Euh...where do interpreters hesitate? An information-theoretic perspective on sentence-initial filler particles in simultaneous interpreting. In Disfluency in Spontaneous Speech (DISS 2025), Sep 2025
- UM Press. Confuse and Normalise: Authoritarian Propaganda in a High-Choice Media Environment during Russia’s Invasion of Ukraine. In Russian Propaganda Today: Challenges, Effectiveness and Resistance, 2025
- LREC. Lexicogrammatic Translationese across Two Targets and Competence Levels. In Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), 2020