Development and testing of an FPT.AI-based voicebot

Duc Chung Tran, Duc Long Nguyen, Mohd. Fadzil Hassan

Abstract


In recent years, voicebots have become a popular communication tool between humans and machines. In this paper, we introduce a voicebot that integrates the text-to-speech (TTS) and speech-to-text (STT) modules provided by FPT.AI. This voicebot can be considered a critical improvement over a typical chatbot because it can respond to user queries in both text and speech. The FPT Open Speech and LibriSpeech datasets, along with music files, were used to test the accuracy and performance of the STT module. The TTS module was tested using text from news pages in both Vietnamese and English. To test the voicebot, questions on the Homestay Service topic and off-topic messages were input to the system. The TTS module achieved 100% accuracy in the Vietnamese text test and 72.66% accuracy in the English text test. In the STT module test, accuracy was 90.51% on the FPT Open Speech dataset (Vietnamese) and 0% on the LibriSpeech dataset (English), while accuracy in the music files test was 0% for both. The voicebot achieved 100% accuracy in its test. Since the FPT.AI STT and TTS modules were developed to support only Vietnamese, with the aim of dominating the Vietnamese market, the 0% accuracy on the LibriSpeech dataset is to be expected.


Keywords


Development; FPT.AI; Speech-to-text (STT); Text-to-speech (TTS); Vietnamese; Voicebot



DOI: https://doi.org/10.11591/eei.v9i6.2620




