Russian Sub-Word Based Speech Recognition Using Pocketsphinx Engine

Sergey Zablotskiy, Maxim Sidorov

2014

Abstract

Russian is a synthetic language with a high morpheme-per-word ratio and a highly inflective nature. These two properties make the lexicon needed for Russian automatic speech recognition (ASR) tens of times larger than an English lexicon that yields the same out-of-vocabulary (OOV) rate. Using sub-word units is a widespread state-of-the-art approach to shrinking the oversized lexicon and lowering the perplexity (PP) of the language model. The choice of sub-word units affects the accuracy of the entire speech recognition system, its performance, and the complexity of synthesizing the spoken phrase. Here, different recognition units are investigated using the Pocketsphinx engine on a vocabulary of several million word forms. A text normalization approach designed for this task is also briefly presented. This rule-based algorithm keeps diverse Russian abbreviations and numerals in the language model (LM) without distorting its statistics. The approach is also directly applicable to Russian text-to-speech conversion.
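
The effect of sub-word units on lexicon size and OOV rate can be illustrated with a toy sketch. It is not the segmentation used in the paper: the suffix list, the `+` stem marker, the function names, and the tiny word lists are all hypothetical and chosen only for illustration. A real system would rely on morphological analysis or a data-driven segmentation.

```python
from typing import Iterable, List, Set

# Hypothetical toy list of endings (an assumption, not taken from the paper).
SUFFIXES = ("ами", "ого", "ой", "ая", "ую", "а", "у", "ы", "и", "е")


def split_word(word: str, min_stem: int = 3) -> List[str]:
    """Split a word into a stem (marked with '+') and an ending.

    The longest matching suffix wins; words with no matching suffix,
    or with a stem shorter than min_stem, are returned unchanged.
    """
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return [word[: -len(suffix)] + "+", suffix]
    return [word]


def to_subwords(words: Iterable[str]) -> List[str]:
    """Flatten a word sequence into a sequence of sub-word units."""
    return [unit for word in words for unit in split_word(word)]


def oov_rate(lexicon: Set[str], tokens: Iterable[str]) -> float:
    """Fraction of tokens that are missing from the lexicon."""
    tokens = list(tokens)
    return sum(token not in lexicon for token in tokens) / len(tokens)


# Toy data: inflected forms of three nouns (made up for this sketch).
train_words = ["книга", "книгу", "книги", "книге",
               "школа", "школы", "школу", "лампа", "лампу"]
test_words = ["школе", "лампы", "лампе"]

word_lexicon = set(train_words)
subword_lexicon = set(to_subwords(train_words))

word_oov = oov_rate(word_lexicon, test_words)
subword_oov = oov_rate(subword_lexicon, to_subwords(test_words))

print(len(word_lexicon), len(subword_lexicon), word_oov, subword_oov)
```

In this toy setting the unseen word forms are all OOV for the word lexicon, while their stems and endings are covered by the smaller sub-word lexicon. The price is that recognized units must be rejoined into words afterwards, which is why the stem carries a marker.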


Paper Citation


in Harvard Style

Zablotskiy S. and Sidorov M. (2014). Russian Sub-Word Based Speech Recognition Using Pocketsphinx Engine. In Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ASAAHMI, (ICINCO 2014) ISBN 978-989-758-040-6, pages 840-844. DOI: 10.5220/0005148008400844

in Bibtex Style

@conference{asaahmi14,
author={Sergey Zablotskiy and Maxim Sidorov},
title={Russian Sub-Word Based Speech Recognition Using Pocketsphinx Engine},
booktitle={Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ASAAHMI, (ICINCO 2014)},
year={2014},
pages={840-844},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005148008400844},
isbn={978-989-758-040-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ASAAHMI, (ICINCO 2014)
TI - Russian Sub-Word Based Speech Recognition Using Pocketsphinx Engine
SN - 978-989-758-040-6
AU - Zablotskiy S.
AU - Sidorov M.
PY - 2014
SP - 840
EP - 844
DO - 10.5220/0005148008400844