B.S. (Bachelor of Science)
Department of Computer Science
This paper outlines an effective process for transcribing conversations from an audio file. The process combines speech recognition and speaker recognition to prepare the audio signals for transcription without relying on a database of preexisting vocal models. It is intended for multi-speaker conversations where vocal models are unavailable or cannot be created from the amount of data provided. We conclude that the performance of speech recognition on multi-speaker conversations can be improved by leveraging the classifying properties of speaker recognition to reduce variance in the dataset, producing a result that is just as effective as if we were to perform mono-speaker speech recognition.
Youngberg, Eric R., "Improving Speech and Speaker Recognition For Multi-Speaker Conversations" (2017). Honors Capstones. 658.
Northern Illinois University
Rights Statement 2
NIU theses are protected by copyright. They may be viewed from Huskie Commons for any purpose, but reproduction or distribution in any format is prohibited without the written permission of the authors.