Summarization of Spontaneous Speech using Automatic Speech Recognition and a Speech Prosody based Tokenizer

György Szaszák, Máté Ákos Tündik, András Beke

2016

Abstract

This paper addresses the summarization of highly spontaneous speech. The audio signal is transcribed by an automatic speech recognizer (ASR), which operates at relatively high word error rates because of the complexity of the recognition task and the high spontaneity of the speech. We analyse how speech recognition errors propagate into syntactic parsing. We also propose an automatic, prosody-based audio tokenization approach and compare it to human performance. The resulting sentence-like tokens are analysed by the syntactic parser to support ranking based on thematic terms and sentence position. The thematic term score is computed in two ways: with TF-IDF and with Latent Semantic Indexing. Sentence scores are calculated as a linear combination of the thematic term score and a positional score, and the summary is generated from the top 10 candidates. Results show that prosody-based tokenization reaches average human performance and that speech recognition errors propagate moderately into syntactic parsing (POS tagging and dependency parsing). Nouns prove to be quite error resistant. Audio summarization reaches 0.62 recall and 0.79 precision, with an F-measure of 0.68, compared to the human reference. A subjective test is also carried out on a Likert scale. All results apply to spontaneous Hungarian.
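The sentence scoring described in the abstract (a linear combination of a thematic term score and a positional score, followed by selection of the top candidates) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The weight `alpha`, the positional formula (earlier sentences score higher), the max-normalization of the TF-IDF scores, and the function names `tfidf_scores` and `rank_sentences` are all assumptions made for illustration; the abstract does not specify them. The Latent Semantic Indexing variant is not shown.

```python
import math
from collections import Counter


def tfidf_scores(sentences):
    """Sum of TF-IDF weights per tokenized sentence.

    Assumption: each sentence-like token is treated as one 'document'
    when computing the inverse document frequency.
    """
    n = len(sentences)
    # Document frequency: in how many sentences each term occurs.
    df = Counter(term for sent in sentences for term in set(sent))
    scores = []
    for sent in sentences:
        if not sent:
            scores.append(0.0)
            continue
        tf = Counter(sent)
        scores.append(sum((count / len(sent)) * math.log(n / df[term])
                          for term, count in tf.items()))
    return scores


def rank_sentences(sentences, alpha=0.7, top_k=10):
    """Return indices of the top_k sentences, in original order.

    alpha (hypothetical) weights the thematic score against the
    positional score in the linear combination.
    """
    n = len(sentences)
    if n == 0:
        return []
    thematic = tfidf_scores(sentences)
    peak = max(thematic) or 1.0  # avoid division by zero
    combined = []
    for i, score in enumerate(thematic):
        positional = 1.0 - i / n  # assumed: earlier sentences rank higher
        combined.append(alpha * score / peak + (1 - alpha) * positional)
    best = sorted(range(n), key=lambda i: combined[i], reverse=True)[:top_k]
    return sorted(best)


if __name__ == "__main__":
    tokens = [
        ["speech", "summarization", "prosody"],
        ["well", "well", "well"],
        ["speech", "recognition", "errors"],
        ["well", "you", "know"],
    ]
    print(rank_sentences(tokens, top_k=2))
```

In a pipeline like the one described, the input would be the prosody-based sentence-like tokens of the ASR transcript, possibly filtered by part of speech (for example, keeping nouns, which the abstract reports as error resistant); that filtering step is likewise an assumption here.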

Paper Citation


in Harvard Style

Szaszák G., Tündik M. and Beke A. (2016). Summarization of Spontaneous Speech using Automatic Speech Recognition and a Speech Prosody based Tokenizer. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016) ISBN 978-989-758-203-5, pages 221-227. DOI: 10.5220/0006044802210227

in Bibtex Style

@conference{kdir16,
author={György Szaszák and Máté Ákos Tündik and András Beke},
title={Summarization of Spontaneous Speech using Automatic Speech Recognition and a Speech Prosody based Tokenizer},
booktitle={Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)},
year={2016},
pages={221-227},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006044802210227},
isbn={978-989-758-203-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)
TI - Summarization of Spontaneous Speech using Automatic Speech Recognition and a Speech Prosody based Tokenizer
SN - 978-989-758-203-5
AU - Szaszák G.
AU - Tündik M.
AU - Beke A.
PY - 2016
SP - 221
EP - 227
DO - 10.5220/0006044802210227