Speech plays an indispensable role in communication. We use our vocal apparatus to produce speech through various articulatory processes, varying the manner and place of articulation as well as intonation and frequency. Beyond the inventory of sounds, speakers also make extensive use of tonal variation to convey additional meaning. In this project we apply these characteristics, together with machine learning methods, to understand the patterns in tonal languages. Rather than analysing absolute pitch or frequency, we analyse how one tone transitions to another in speech. Four features (namely, zero crossing count, short-time energy, minimum formant frequency, and maximum formant frequency) are extracted from the tonal transitions over segments of the audio signals. We developed a multi-classifier system using four classifiers, namely the maximum likelihood estimate (MLE), minimum distance classifier (MDC), k-nearest neighbor (kNN) classifier, and fuzzy kNN classifier, to identify tonal languages automatically from audio signals. Each individual classifier is first trained on known data represented by the extracted features; the trained classifiers are then used for language identification, and their results are combined to generate the final output. Experiments are conducted on three tonal languages, namely Chinese, Thai, and Vietnamese. The output reveals that the developed multi-classifier model is able to produce promising results.
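As an illustration of the frame-level feature extraction described above, the sketch below computes two of the four features, zero crossing count and short-time energy, over sliding windows of an audio signal. The frame length, hop size, and function names here are assumptions made for illustration; they are not taken from the authors' implementation, and the formant-frequency features would require an additional LPC or spectral-peak analysis not shown.

```python
import numpy as np

def zero_crossing_count(frame):
    # Count sign changes between consecutive samples in the frame.
    return int(np.sum(np.abs(np.diff(np.sign(frame))) > 0))

def short_time_energy(frame):
    # Sum of squared amplitudes within the frame.
    return float(np.sum(frame.astype(np.float64) ** 2))

def frame_features(signal, frame_len=400, hop=200):
    # Slide a window over the signal (e.g. 50 ms frames at 8 kHz)
    # and compute both features per frame.
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        feats.append((zero_crossing_count(frame),
                      short_time_energy(frame)))
    return feats
```

Feature vectors of this kind, computed over tonal-transition segments, would then be fed to each of the four classifiers, whose individual decisions are combined (for example by voting) into the final language label.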
Third Award of $1,000