- Author
- Daniel Quernheim
- Title
- Bimorphism Machine Translation
- Citable URL
- https://nbn-resolving.org/urn:nbn:de:bsz:15-qucosa-223667
- Date of submission
- 04.10.2016
- Date of defense
- 10.04.2017
- Abstract (EN)
- The field of machine translation has made tremendous progress due to the rise of statistical methods, which make it possible to obtain a translation system automatically from a bilingual collection of text. Some approaches do not need any linguistic annotation at all and can infer translation rules from raw, unannotated data. However, most state-of-the-art systems do linguistic structure little justice, and many of the approaches that have been put forward use ad hoc formalisms and algorithms. This inevitably leads to duplication of effort and to a separation between theoretical researchers and practitioners. To remedy this lack of motivation and rigor, this dissertation makes three contributions: (1) After laying out the historical background and context, as well as the mathematical and linguistic foundations, it puts forward a rigorous algebraic model of machine translation. The model uses regular tree grammars and bimorphisms as its backbone and introduces a modular architecture that allows different input and output formalisms. (2) It then describes the challenges of implementing this bimorphism-based model in a machine translation toolkit, explaining in detail the algorithms used for the core components. (3) Finally, it describes experiments in which the toolkit is applied to real-world data and used for diagnostic purposes. Exact decoding is used to reason about search errors and model errors in a popular machine translation toolkit, and output formalisms of different generative capacity are compared.
- Free keywords (DE)
- Computerlinguistik, Maschinelle Übersetzung, Bimorphismen, synchrone Grammatiken, Baumautomaten, Formale Sprachen
- Free keywords (EN)
- computational linguistics, machine translation, bimorphisms, synchronous grammars, tree automata, formal language theory
- Classification (DDC)
- 004
- 410
- Controlled subject headings (GND)
- Computerlinguistik, Maschinelle Übersetzung, Automatentheorie
- Reviewers
- Professor Dr. Andreas Maletti
- Professor Dr. Alexander Koller
- Supervisor
- Professor Dr. Andreas Maletti
- Degree-granting / examining institution
- Universität Leipzig, Leipzig
- URN Qucosa
- urn:nbn:de:bsz:15-qucosa-223667
- Qucosa publication date
- 27.04.2017
- Document type
- Dissertation
- Document language
- English