Publication Date
5-5-2017
Document Type
Dissertation/Thesis
First Advisor
Alhoori, Hamed
Degree Name
B.S. (Bachelor of Science)
Legacy Department
Department of Computer Science
Abstract
This paper outlines an effective process for transcribing conversations from an audio file. The process combines speech recognition and speaker recognition to prepare the audio signals for transcription without relying on a database of preexisting vocal models. It is intended for multi-speaker conversations where vocal models are not available or cannot be created from the amount of data provided. We conclude that the performance of speech recognition on multi-speaker conversations can be improved by leveraging the classifying properties of speaker recognition to reduce variance in the dataset, producing a result that is as effective as performing single-speaker speech recognition.
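The idea of grouping audio by speaker without preexisting vocal models can be illustrated with a minimal sketch. This is not the method from the paper, which the abstract does not specify. It assumes each audio segment has already been reduced to a numeric feature vector, and it uses a simple greedy clustering by cosine similarity as a stand-in for speaker recognition. The names `cluster_segments`, `group_by_speaker`, and the `threshold` value are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_segments(features, threshold=0.9):
    """Assign a speaker label to each segment without enrolled voice models.

    Assumption: `features` holds one precomputed vector per audio segment.
    A segment joins the most similar existing cluster if the similarity
    reaches `threshold`; otherwise it starts a new cluster (a new speaker).
    """
    centroids = []  # running mean vector per cluster
    counts = []     # number of segments per cluster
    labels = []
    for vec in features:
        best, best_sim = None, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            centroids.append(list(vec))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            n = counts[best]
            # Update the running mean of the matched cluster.
            centroids[best] = [(c * n + x) / (n + 1)
                               for c, x in zip(centroids[best], vec)]
            counts[best] = n + 1
            labels.append(best)
    return labels


def group_by_speaker(segments, labels):
    """Collect segments per speaker label so each group can be transcribed
    as if it were single-speaker audio."""
    groups = {}
    for segment, label in zip(segments, labels):
        groups.setdefault(label, []).append(segment)
    return groups


# Toy example: four segments with two-dimensional placeholder features.
features = [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.9]]
labels = cluster_segments(features)
groups = group_by_speaker(["seg0", "seg1", "seg2", "seg3"], labels)
```

Each resulting group could then be passed to a speech recognizer separately, which is one way to reduce the speaker-related variance the abstract refers to.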
Recommended Citation
Youngberg, Eric R., "Improving Speech and Speaker Recognition For Multi-Speaker Conversations" (2017). Honors Capstones. 658.
https://huskiecommons.lib.niu.edu/studentengagement-honorscapstones/658
Extent
5 pages
Language
eng
Publisher
Northern Illinois University
Rights Statement
In Copyright
Rights Statement 2
NIU theses are protected by copyright. They may be viewed from Huskie Commons for any purpose, but reproduction or distribution in any format is prohibited without the written permission of the authors.
Media Type
Text
Comments
An application built to implement the ideas in this paper is hosted at https://freescribe.org