
The calendar is also available in CSV format and in ics/iCal format.




 

Computer Science Advanced Seminar (Informatik-Oberseminar): Generative Training and Smoothing of Hierarchical Phrase-Based Translation Models

17.03.2017, 10:00 (Informatik-Zentrum, Room 9222, E3, Ahornstr. 55)

Speaker: Dipl.-Inform. Stephan Peitz

Abstract:

Hierarchical phrase-based translation is a common machine translation approach for translating between languages with significantly different word order. The first part of this thesis focuses on smoothing and training of the translation models used in hierarchical translation. Additionally, we present an improved implementation of the search algorithm and show that it is competitive with other state-of-the-art hierarchical phrase-based translation engines. In the second part of this work, we apply hierarchical phrase-based translation in the context of spoken language translation.

In the state-of-the-art hierarchical translation model extraction process, translation rules and their corresponding translation probabilities are obtained from word-aligned training data by applying simple heuristics. A common issue is that even if a large set of training data is provided, the resulting translation model may suffer from data sparseness. Smoothing is an approach to remedy this problem and is well known from other natural language processing tasks (e.g. language modeling). The goal of smoothing in machine translation is to better model rarely seen translation rules. In this thesis, we investigate and compare different smoothing techniques for hierarchical phrase-based translation.

Furthermore, extraction and translation are two separate steps. The extraction therefore does not take into account whether the obtained translation rules are actually needed in the translation process. To learn whether a translation rule is relevant for the translation process, we pursue the approach of force-decoding the training data. Given a sentence pair from the training data, the translation of the source sentence is constrained to produce the corresponding target sentence. The applied translation rules are then determined and the corresponding translation probabilities re-estimated. To translate a large set of training data, an efficient and fast framework is needed. In this work, we introduce such a framework for re-estimating hierarchical translation models. This approach enables us to obtain smaller translation models while simultaneously improving the translation quality. We further compare our proposed scheme with another state-of-the-art translation model training approach, namely discriminative training, on a large-scale Chinese-to-English translation task.

Spoken language translation is the task of translating automatically transcribed speech. Since most automatic speech recognition systems provide transcriptions without punctuation marks and case information, this information has to be re-introduced before the actual translation takes place. In this work, we show that performing punctuation prediction and re-casing by applying a machine translation system helps to improve the translation quality. In particular, we propose to apply hierarchical translation rather than phrase-based translation for this task. Finally, experiments were conducted on a large-scale English-to-French spoken language translation task.

All methods described in this thesis have been made freely available to the research community as they were integrated into the open-source translation toolkit JANE.

Hosts: The lecturers of Computer Science

15.02.2017, sts

Parallel Programming in Computational Engineering and Science 2017 - Foundation Workshops

20.03.2017 - 24.03.2017 (IT Center of RWTH Aachen, Kopernikusstr. 6, Aachen)

Target audience: Researchers: Simulation Science with High Performance Computing

Website: https://doc.itc.rwth-aachen.de/display/VE/PPCES+2017

17.02.2017, sts

25th VI-HPS Tuning Workshop (RWTH Aachen) - Advanced Workshop

27.03.2017 - 31.03.2017 (IT Center of RWTH Aachen, Kopernikusstr. 6, Aachen)

Target audience: Researchers: Simulation Science with High Performance Computing

Website: http://www.vi-hps.org/training/tws/tw25.html

17.02.2017, sts

Cybercriminals and Their Tricks

24.05.2017, 17:00 (Generali-Saal, Super C, RWTH)

Talk by Prof. Felix Freiling

The event is organized by the Fachgruppe Informatik, REGINA e. V., and RIA/Gesellschaft für Informatik and is free of charge; registration is not required.

Further information:

Helen M. Bolke-Hermanns, Phone: 0241 80 21004, Email: helen.bolke-hermanns@informatik.rwth-aachen.de

20.02.2017, sts

#Jodel – made in Aachen

30.05.2017, 16:00 (H03, CARL, RWTH Aachen)

A talk by Alessio Borgmeyer.

Alessio Avellan Borgmeyer is the inventor of the app, which was launched in Aachen two years ago. The 26-year-old studied industrial engineering (Wirtschaftsingenieurwesen) at RWTH Aachen from 2010 to 2013 and now lives in Berlin. Niklas Henckell and Alexander Linewitsch, both also from RWTH Aachen, joined shortly after the founding. Since then, the Jodel app has been on a path to success. The event will cover difficulties, personal experiences and stories, and the new form of communication via Jodel.

A live Jodel feed will be set up.

#entrepreneurship
#HoehenundTiefenderletzenJahre
#AachenJodelhauptstadt
#lebedeinenTraum
#malinsCARL
#lifehack

The event is organized by the Fachgruppe Informatik, REGINA e. V., and RIA/Gesellschaft für Informatik and is free of charge; registration is not required.

Further information:

Helen M. Bolke-Hermanns, Phone: 0241 80 21004, Email: helen.bolke-hermanns@informatik.rwth-aachen.de

20.02.2017, sts