Measuring fear from voice

S.L.A. Chaeron

Human Computer Interaction

Graduated: 2009

Project

Thesis title: Ranking the Level of Fear from Voice using Nominal Classification Methods

Abstract

Investigating the emotion conveyed in human speech requires methods that can detect it. One approach is to extract prosodic features that are relevant in emotion research from speech and feed them to a machine learning algorithm. This thesis discusses the encoding, decoding and inference processes of emotional sentences. The sentences were simulated by 3 actresses and 4 actors. Each person simulated 10 sentences, performing each sentence at three emotional levels (neutral, fearful and very fearful). In total, the dataset consisted of 208 speech samples. Three decoders, two psychotherapists and one speech-language pathologist, rated each sample on the three emotional levels. The ratings of two of the three decoders were highly correlated when the samples were presented in random order, and their ratings were used to code the samples. Six features were extracted from each sample: the formants F1, F2, F3 and F4, the fundamental frequency (F0) and the intensity. These were given as input to a machine learning algorithm called Support Vector Machine (SVM), which served as the classifier for the samples. Because the emotional levels are ordinal, SVM was also used as the base learner of a meta classifier called OrdinalClassClassifier, which exploits the ordering of the class attribute. The results showed that the combination of SVM and OrdinalClassClassifier did not perform significantly better than SVM alone. A selection process over the six features found intensity and F0 to be the most relevant.
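
The abstract does not spell out how a meta classifier can use the ordering of the three levels. The following is a minimal sketch of one common scheme, assumed here for illustration and not taken from the thesis: the k ordered levels are split into k-1 binary problems of the form "is the sample above level t?", and the binary estimates are then recombined into one probability per level. The names `binary_targets`, `combine` and `predict` are hypothetical, and the base learner (an SVM in the thesis) is left out; the sketch only takes its probability estimates as input.

```python
# Ordered class levels, as listed in the abstract.
LEVELS = ["neutral", "fearful", "very fearful"]


def binary_targets(labels, levels=LEVELS):
    """Build the k-1 binary target vectors.

    Target vector t holds 1 where the label lies above levels[t], else 0.
    Each vector would be used to train one binary base learner.
    """
    rank = {level: i for i, level in enumerate(levels)}
    return [
        [1 if rank[label] > t else 0 for label in labels]
        for t in range(len(levels) - 1)
    ]


def combine(p_greater):
    """Turn estimates of P(y > levels[t]), t = 0..k-2, into k level probabilities.

    P(lowest)    = 1 - P(y > lowest)
    P(middle i)  = P(y > level i-1) - P(y > level i)
    P(highest)   = P(y > second highest)
    """
    k = len(p_greater) + 1
    probs = []
    for i in range(k):
        upper = 1.0 if i == 0 else p_greater[i - 1]
        lower = 0.0 if i == k - 1 else p_greater[i]
        # Independent binary learners can disagree, so clip negatives to zero.
        probs.append(max(upper - lower, 0.0))
    return probs


def predict(p_greater, levels=LEVELS):
    """Return the level with the highest combined probability."""
    probs = combine(p_greater)
    return levels[probs.index(max(probs))]
```

For example, if the two base learners estimate P(above neutral) = 0.9 and P(above fearful) = 0.2 for a sample, `predict([0.9, 0.2])` returns "fearful", since that level receives roughly 0.7 of the probability mass.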