AN AUDIO-VISUAL SPEECH RECOGNITION SYSTEM FOR TESTING NEW AUDIO-VISUAL DATABASES

Tsang-Long Pao, Wen-Yuan Liao

2006

Abstract

For the past several decades, visual speech signal processing has been an attractive research topic for overcoming certain problems of audio-only recognition. In recent years, many automatic speech-reading systems have been proposed that combine audio and visual speech features. The objective of all such audio-visual speech recognizers is to improve recognition accuracy, particularly under difficult conditions. In this paper, we focus on visual feature extraction for audio-visual recognition. We create a new audio-visual database recorded in two languages, English and Mandarin. Audio-visual recognition consists of two main steps: feature extraction and recognition. We extract visual motion features of the lips using front-end processing. A hidden Markov model (HMM) is used for audio-visual speech recognition. We describe our audio-visual database and use it in our proposed system in some preliminary experiments.
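
As a rough illustration of the recognition step named in the abstract, the sketch below scores an observation sequence against per-word HMMs with the forward algorithm and picks the best-scoring word. It is not the authors' implementation: it assumes, purely for illustration, that fused audio-visual feature vectors have already been quantized into discrete symbols, and the function names (`forward_log_likelihood`, `recognize`) and the two toy word models are hypothetical.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (forward algorithm).

    obs   : list of symbol indices (assumed: quantized fused features)
    start : start[s]    = P(first state is s)
    trans : trans[p][s] = P(next state is s | current state is p)
    emit  : emit[s][o]  = P(symbol o | state s)
    """
    n = len(start)
    alpha = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + _log(trans[p][s]) for p in range(n)])
            + _log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def recognize(obs, models):
    """Pick the word whose HMM assigns obs the highest log-likelihood."""
    return max(models, key=lambda w: forward_log_likelihood(obs, *models[w]))


# Two hypothetical left-to-right word models over a 2-symbol codebook.
LEFT_TO_RIGHT = [[0.5, 0.5], [0.0, 1.0]]
TOY_MODELS = {
    "word_a": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.1, 0.9]]),
    "word_b": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.9, 0.1]]),
}
```

Calling `recognize([0, 0, 1, 1], TOY_MODELS)` selects `word_a`, whose states emit symbol 0 first and symbol 1 later. A system working on continuous lip-motion and acoustic features would more likely use Gaussian emission densities in place of the discrete table assumed here.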


Paper Citation


in Harvard Style

Pao T. and Liao W. (2006). AN AUDIO-VISUAL SPEECH RECOGNITION SYSTEM FOR TESTING NEW AUDIO-VISUAL DATABASES. In Proceedings of the First International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, ISBN 972-8865-40-6, pages 192-196. DOI: 10.5220/0001369101920196

in Bibtex Style

@conference{visapp06,
author={Tsang-Long Pao and Wen-Yuan Liao},
title={AN AUDIO-VISUAL SPEECH RECOGNITION SYSTEM FOR TESTING NEW AUDIO-VISUAL DATABASES},
booktitle={Proceedings of the First International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP},
year={2006},
pages={192-196},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001369101920196},
isbn={972-8865-40-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the First International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP
TI - AN AUDIO-VISUAL SPEECH RECOGNITION SYSTEM FOR TESTING NEW AUDIO-VISUAL DATABASES
SN - 972-8865-40-6
AU - Pao T.
AU - Liao W.
PY - 2006
SP - 192
EP - 196
DO - 10.5220/0001369101920196