Computer Science Colloquium – Dr. Michael Picheny: “Speech Recognition: What's Left?”
Wednesday, 8 May 2019, 5:15 p.m.
Informatikzentrum (Computer Science Center), Ahornstr. 55, Building E3, Room 9222
The event takes place as part of the Computer Science Colloquium (Informatik-Kolloquium).
IEEE ComSoc Distinguished Lecturer
Dr. Michael Picheny
Senior Manager of Watson Multimodal in the Watson Group at the IBM TJ Watson Research Center
Recent results on the SWITCHBOARD corpus suggest that, thanks to advances in Deep Learning, we now achieve Word Error Rates comparable to those of human listeners. Does this mean the speech recognition problem is solved and the community can move on to a different set of problems? In this talk, we examine speech recognition issues that still plague the community and compare and contrast them to what is known about human perception. We specifically highlight issues in accented speech, noisy/reverberant speech, speaking style, rapid adaptation to new domains, and multilingual speech recognition. We try to demonstrate that compared to human perception, there is still much room for improvement, so significant work in speech recognition research is still required from the community.
Michael Picheny is the Senior Manager of Watson Multimodal in the Watson Group at the IBM TJ Watson Research Center. Michael has worked in the Speech Recognition area since 1981, joining IBM after finishing his doctorate at MIT. He has been heavily involved in the development of almost all of IBM's recognition systems, ranging from the world's first real-time large vocabulary discrete system through IBM's product lines for telephony and embedded systems. He has published numerous papers in both journals and conferences on almost all aspects of speech recognition. He has received several awards from IBM for his work, including a corporate award, three outstanding Technical Achievement Awards and two Research Division Awards. He is the co-holder of over 30 patents and was named a Master Inventor by IBM in 1995 and again in 2000. Michael served as an Associate Editor of the IEEE Transactions on Acoustics, Speech, and Signal Processing from 1986 to 1989, was the chairman of the Speech Technical Committee of the IEEE Signal Processing Society from 2002 to 2004, and is a Fellow of the IEEE. He served as an Adjunct Professor in the Electrical Engineering Department of Columbia University in 2009 and 2012 and co-taught a course in speech recognition. He was a member of the board of ISCA (International Speech Communication Association) from 2005 to 2013 and was named an ISCA Fellow in 2014. He was the co-general chair of the IEEE ASRU 2011 Workshop in Hawaii.
Activities in Michael's group currently cover a multitude of interests in the area of speech and language processing. Work in large vocabulary speech recognition includes transcription and keyword search from broadcast news and conversations across multiple languages. The group also works on developing new speech algorithms and engines for mobile and speech analytics applications. Conversational Systems work covers speech recognition and speech synthesis.
Dr. Michael Picheny will give one additional talk:
Tuesday, 7 May 2019, 2:30 p.m.
Additional talk for speech recognition specialists, titled “Recent Speech, Video, and Multimodal Research at IBM”
Each of Dr. Picheny's talks lasts about one hour. Afterwards, there will be ample time for questions.