Conference paper
Temporal visual cues aid speech recognition
BACKGROUND: It is well known that, under noisy conditions, viewing a speaker's articulatory movements aids the recognition of spoken words. Conventionally, it is thought that the visual input disambiguates otherwise confusing auditory input.
HYPOTHESIS: In contrast, we hypothesize that it is the temporal synchronicity of the visual input that aids parsing of the auditory stream. More specifically, we expected that purely temporal information, which does not convey information such as place of articulation, may facilitate word recognition.
METHODS: To test this prediction, we used temporal features of the audio to generate an artificial talking-face video and measured word recognition performance on simple monosyllabic words.
RESULTS: When words are presented together with the artificial video, word recognition improves relative to purely auditory presentation. The effect is significant (p
Language: English
Year: 2006
Proceedings: 7th Annual Meeting of the International Multisensory Research Forum
Types: Conference paper