Real-time prosodic aspects of text generated speech

The purpose of this study was to investigate the feasibility and effectiveness of a text-to-speech (TTS) device that allows the user to manipulate pitch and volume as speech is being generated. The device was intended to support individuals with complex communication needs (CCN) as a result of acquired neurological conditions such as dysarthria. An Android touchscreen tablet with a built-in speech engine served as the hardware for the TTS device, and a post-audio signal processing approach was used to program it. Data were collected in two separate phases: auditory and use-based. During the auditory phase, participants listened to audio samples from the thesis TTS device, a typical TTS device, and human speech, and then rated them on perceived affect (positive vs. negative) or intent (question vs. statement). During the use-based phase, participants provided feedback about the thesis TTS device after using it to communicate with the study investigator. Although auditory phase results indicated that the thesis device was not yet as effective as human speech at communicating emotion and intent, use-based findings were more promising: participants considered the new features of the thesis TTS device (the ability to manipulate pitch and volume) beneficial.

Speech therapy, Computer science