Lip-Reading by Surveillance Cameras

Publication Type:

Conference Proceedings


3rd IEEE Smart Cities Symposium, printed by CTU Publishing House, Prague (2017)


Active appearance model, hidden Markov model, lip-reading, multimodal surveillance cameras.


<p>To increase the safety of citizens, a network of surveillance cameras has been installed throughout the city. These cameras make it possible to analyze the behavior of people and objects. Aggressive behavior is multimodal by nature. Microphones attached to these cameras cannot analyze speech in noisy environments or when the speaker is too far away. Lip movements of a talking mouth can, however, be recorded and understood under limited conditions. Given recent progress in artificial intelligence, large-scale lip-reading can be expected to become possible in the near future. In this paper we report the state of the art of lip-reading for the Dutch language. We present a prototype developed at Delft University of Technology. The model is based on the Active Appearance Model and hidden Markov models. The results of lip-reading experiments are also presented. The system has been successfully applied in trains to detect aggressive acts and violence against people and property.</p>