Tuesday, 21 July 2020, 4:00 p.m.
Automatic Sign Language Recognition: From Video Corpora to Gloss Sentences
- Location: https://us02web.zoom.us/j/83598281729?pwd=c2lhdmpIU0JJeFFDcG01M0FnS0cyZz09
- Speaker: Diplom-Informatiker Jens Forster
In this work, we investigate large-vocabulary automatic sign language recognition (ASLR) from single-view video using hidden Markov models (HMMs) with Gaussian mixture models as state emission functions and statistical n-gram language models. We go beyond the state of the art by investigating continuous sign language instead of isolated signs, and we extract features and object locations from video via object tracking, forgoing invasive data acquisition methods.
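For orientation, the kind of model named above is commonly written as a Bayes decision rule. The following is a sketch in standard notation, not necessarily the exact formulation used in this work: the symbols $x_1^T$ (feature sequence), $w_1^N$ (gloss sequence), $s_1^T$ (HMM state sequence), the language model order $m$, and the mixture parameters $c_{sk}, \mu_{sk}, \Sigma_{sk}$ are assumptions introduced here for illustration.

```latex
\begin{align}
  \hat{w}_1^N &= \operatorname*{arg\,max}_{w_1^N}
      \Big\{ p(w_1^N) \cdot p(x_1^T \mid w_1^N) \Big\} \\
  p(w_1^N) &\approx \prod_{n=1}^{N} p\big(w_n \mid w_{n-m+1}^{n-1}\big)
      && \text{($m$-gram language model)} \\
  p(x_1^T \mid w_1^N) &= \sum_{s_1^T} \prod_{t=1}^{T}
      p(s_t \mid s_{t-1}, w_1^N)\, p(x_t \mid s_t, w_1^N)
      && \text{(HMM)} \\
  p(x \mid s) &= \sum_{k=1}^{K_s} c_{sk}\,
      \mathcal{N}\big(x;\, \mu_{sk}, \Sigma_{sk}\big)
      && \text{(Gaussian mixture emission)}
\end{align}
```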
Overall, we present contributions in three areas. First, we introduce the large-vocabulary, single-view, continuous sign language corpus RWTH-PHOENIX-Weather, which was created in the context of this work. RWTH-PHOENIX-Weather is annotated in gloss notation and features several subsets usable for object tracking as well as for single-signer and multi-signer recognition. Second, we extend an existing model-free dynamic programming tracking framework with spatial pruning and multi-pass tracking techniques. These approaches are quantitatively evaluated on hand and face location annotations of more than 140k video frames created as part of this work. Third, we investigate the impact of error propagation from object tracking and of hidden Markov model (HMM) state alignment quality, among other factors, on ASLR. Methods to improve alignment quality, such as non-gesture modeling, are shown to be effective in improving results for single-signer recognition. Addressing the multimodal nature of sign languages, we investigate modality combination techniques applied during decoding and find that synchronous and asynchronous combination without re-training improve recognition results for both single-signer and multi-signer recognition. All proposed modeling and recognition techniques are evaluated on publicly available continuous German Sign Language corpora or on the novel RWTH-PHOENIX-Weather corpus. In all cases, we achieve results that are either competitive with or clearly better than those found in the literature at the time of writing.
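To give a rough idea of what dynamic programming tracking with spatial pruning can look like, here is a minimal sketch. It is not the framework from this work: the function name `track`, the per-frame score dictionaries, the `max_jump` pruning window, and the linear `jump_penalty` are all hypothetical choices made for illustration, and multi-pass tracking is not shown.

```python
import math


def track(score_maps, max_jump=1, jump_penalty=0.5):
    """Return the best-scoring position path through all frames.

    score_maps: one dict per frame mapping (x, y) -> local score
                (hypothetical input format, higher is better).
    max_jump:   spatial pruning window; predecessors farther away than
                this (in x or y) are not considered.
    """
    if not score_maps or not score_maps[0]:
        raise ValueError("need at least one frame with candidates")

    # best[t][pos]: best accumulated score of any path ending at pos in frame t
    best = [dict(score_maps[0])]
    # back[t][pos]: predecessor position of pos on that best path
    back = [{}]

    for t in range(1, len(score_maps)):
        best_t, back_t = {}, {}
        for pos, local in score_maps[t].items():
            best_prev, best_val = None, -math.inf
            for prev, acc in best[t - 1].items():
                dx = abs(pos[0] - prev[0])
                dy = abs(pos[1] - prev[1])
                if max(dx, dy) > max_jump:
                    continue  # spatial pruning: skip distant predecessors
                val = acc - jump_penalty * (dx + dy)
                if val > best_val:
                    best_prev, best_val = prev, val
            if best_prev is not None:
                best_t[pos] = best_val + local
                back_t[pos] = best_prev
        if not best_t:
            raise ValueError(f"all candidates pruned in frame {t}")
        best.append(best_t)
        back.append(back_t)

    # Trace back from the best final position to recover the full path.
    pos = max(best[-1], key=best[-1].get)
    path = [pos]
    for t in range(len(score_maps) - 1, 0, -1):
        pos = back[t][pos]
        path.append(pos)
    path.reverse()
    return path


# Toy example: the high-scoring distractor at (5, 0) in the middle frame
# is unreachable within the pruning window and is therefore ignored.
maps = [
    {(0, 0): 1.0},
    {(1, 0): 1.0, (5, 0): 3.0},
    {(2, 0): 1.0},
]
path = track(maps, max_jump=1)
```

Because the decision is made only after the last frame via traceback, a locally attractive but spatially implausible candidate does not derail the path; the pruning window additionally reduces the number of predecessor comparisons per frame.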
Invitation extended by: the lecturers of Computer Science