Improving Speech Representations and Personalized Models Using Self-Supervision

Google AI Research · June 18, 2020, 8:44 p.m.
Posted by Joel Shor, Software Engineer, and Oran Lang, Software Engineer, Google Research, Israel

There are many tasks within speech processing that are easier to solve with large amounts of data. For example, automatic speech recognition (ASR) translates spoken audio into text. In contrast, "non-semantic" tasks focus on the aspects of human speech other than its meaning, encompassing "paralinguistic" tasks, like speech emotion recognition, as well as other kinds of tasks, such as speaker ide...