A Systematic Study and Empirical Analysis of Lip Reading Models using Traditional and Deep Learning Algorithms

R Sangeetha; D. Malathi

doi:10.46947/joaasr412022231

Keywords:

Audio-Visual Automatic Speech Recognition, Automatic Lip Reading, Hidden Markov Model, Active Shape Model.

Abstract

Although lip movement recognition already has many applications in analyzing and recreating audio,
researchers remain interested in developing automatic lip-reading systems with better performance.
Modelling of the framework has played a major role in improving the output of sequential frameworks.
In recent years, Deep Neural Networks (DNNs) have attracted considerable interest and produced
breakthrough results in domains including image classification, speech recognition, and natural
language processing. DNNs are used to represent complex functions, and they also play a vital role
in Automatic Lip Reading (ALR) systems. This paper focuses mainly on traditional pixel-based,
shape-based, and mixed feature extraction methods and their improved variants for lip reading. It
highlights the most important techniques and the progression of end-to-end deep learning architectures
that evolved during the past decade. The investigation surveys the audio-visual databases used to
analyze and train these systems, describing each by its most common words, number of speakers, size,
language, and time duration. The ALR systems developed are also compared with their traditional
counterparts. A statistical analysis is performed on the recognition of characters or numerals and of
words or sentences in English, and the resulting performances are compared.