Lecture: Automatically Subtitling the C3
How speech processing helps the CCC subtitle project, and vice-versa.
Transcribing a talk comes relatively easily to fast typists, whereas turning a transcript into time-aligned subtitles for a video requires much more human effort. Conversely, speech recognition performance (especially for open-source solutions) is still poor on open-domain topics, but speech technology can align a given text to the corresponding speech with high accuracy. Let's join forces to generate superior subtitles with little effort and, at the same time, to improve future open-source speech recognizers!
We present the ongoing work of a student project in informatics at Universität Hamburg, in which we combine the strengths of human transcription and automatic alignment of these transcripts to produce high-quality video subtitles.
We believe that our work can help the C3 community in generating video subtitles with less manual effort, and we hope to provide subtitles for all 31C3 talks (as long as you provide the transcriptions).
However, we're not just a service provider to the C3. There is a shortage of training material for free and open-source speech recognizers and the acoustic models they employ. We therefore plan to prepare an aligned audio corpus of C3 talks that will help advance open-source speech recognition.
Be a part of this by helping us with your transcriptions -- we'll repay you with subtitles and better open-source speech recognition in the future!
Start time: 22:00
Room: Saal G