Feature and Decision Level Audio-visual Data Fusion in Emotion Recognition Problem

Maxim Sidorov; Evgenii Sopov; Ilia Ivanov; Wolfgang Minker

doi:10.5220/0005527002460251

2015

Abstract

Speech-based emotion recognition has already been investigated by many authors, and reasonable results have been achieved. This article focuses on applying an audio-visual data fusion approach to emotion recognition. Two state-of-the-art classification algorithms were applied to one audio and three visual feature datasets. Feature-level data fusion was used to build a multimodal emotion classification system, which increased emotion classification accuracy by 4% compared with the best accuracy achieved by the unimodal systems. The class precisions obtained by applying the algorithms to the unimodal and multimodal datasets revealed that different data-classifier combinations are good at recognizing particular emotions. These data-classifier combinations were then fused at the decision level using several approaches, which increased the accuracy by a further 3% compared with the best accuracy achieved by feature-level fusion.
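
The two fusion levels named in the abstract can be illustrated with a minimal sketch. The abstract does not say which classifiers, features, or decision-level schemes were used, so everything below is an assumption made for illustration: the function names, the emotion labels, the precision values, and the choice of majority voting and precision-weighted voting as example decision-level schemes are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def fuse_features(audio_vec, visual_vec):
    """Feature-level fusion (illustrative): concatenate the audio and
    visual feature vectors of one sample into a single vector."""
    return list(audio_vec) + list(visual_vec)


def majority_vote(predictions):
    """Decision-level fusion (illustrative): return the label predicted
    by most classifiers. Ties go to the earliest classifier in the list
    whose label is among the most frequent ones."""
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def precision_weighted_vote(predictions, class_precision):
    """Decision-level fusion (illustrative): each classifier's vote is
    weighted by its own precision for the class it predicted.

    predictions[i] is the label from classifier i, and
    class_precision[i][label] is a hypothetical per-class precision of
    classifier i, e.g. measured on held-out data."""
    scores = {}
    for clf_idx, label in enumerate(predictions):
        weight = class_precision[clf_idx][label]
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


# Hypothetical example: three data-classifier combinations, two emotions.
predictions = ["anger", "joy", "joy"]
class_precision = [
    {"anger": 0.9, "joy": 0.4},  # combination 0 is precise on "anger"
    {"anger": 0.5, "joy": 0.3},
    {"anger": 0.5, "joy": 0.4},
]

fused_vector = fuse_features([0.1, 0.2], [0.3])
plain_result = majority_vote(predictions)
weighted_result = precision_weighted_vote(predictions, class_precision)
```

With these made-up numbers the plain majority vote returns "joy", while the precision-weighted vote returns "anger", because the single combination that is precise on "anger" outweighs two less precise votes. This is one way the per-class strengths mentioned in the abstract could be exploited; the paper's actual schemes may differ.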

Paper Citation

in Harvard Style

Sidorov M., Sopov E., Ivanov I. and Minker W. (2015). Feature and Decision Level Audio-visual Data Fusion in Emotion Recognition Problem. In Proceedings of the 12th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO, ISBN 978-989-758-123-6, pages 246-251. DOI: 10.5220/0005527002460251

in Bibtex Style

@conference{icinco15,
author={Maxim Sidorov and Evgenii Sopov and Ilia Ivanov and Wolfgang Minker},
title={Feature and Decision Level Audio-visual Data Fusion in Emotion Recognition Problem},
booktitle={Proceedings of the 12th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO},
year={2015},
pages={246-251},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005527002460251},
isbn={978-989-758-123-6},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO
TI - Feature and Decision Level Audio-visual Data Fusion in Emotion Recognition Problem
SN - 978-989-758-123-6
AU - Sidorov M.
AU - Sopov E.
AU - Ivanov I.
AU - Minker W.
PY - 2015
SP - 246
EP - 251
DO - 10.5220/0005527002460251