Comparison between LSA-LDA-Lexical Chains

Costin Chiru, Traian Rebedea, Silvia Ciotec

2014

Abstract

This paper analyzes three techniques used for similar, mostly semantics-related tasks in Natural Language Processing (NLP): Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA) and lexical chains. The techniques were evaluated and compared on two corpora in order to highlight their similarities and differences from a semantic analysis viewpoint. The first corpus consisted of four Wikipedia articles on different topics, while the second consisted of 35 online chat conversations, each with 4 to 12 participants debating four imposed topics (forum, chat, blog and wikis). The study compares the outcomes of the three methods by computing quantitative factors such as correlations and the degree of coverage of the resulting topics. Using corpora from different types of discourse and task-independent quantitative factors allows us to show that, although LSA and LDA provide similar results, the results of lexical chaining are not strongly correlated with those of either LSA or LDA. Lexical chains might therefore be used as a complement to LSA or LDA when performing semantic analysis for various NLP applications.
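To make the kind of quantitative comparison mentioned in the abstract concrete, here is a minimal sketch of two such factors. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say which correlation coefficient was used, so Pearson's is assumed, and `topic_coverage` is a hypothetical definition (the share of a method's topic words that occur in the document). The scores and word lists are invented for illustration and are not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def topic_coverage(topic_words, document_words):
    """Assumed definition: fraction of topic words that occur in the document."""
    topic = set(topic_words)
    if not topic:
        return 0.0
    return len(topic & set(document_words)) / len(topic)


# Invented similarity scores that two methods might assign to the same
# word pairs (illustrative values only, not data from the paper).
method_a_scores = [0.9, 0.7, 0.2, 0.4]
method_b_scores = [0.8, 0.6, 0.3, 0.5]
agreement = pearson(method_a_scores, method_b_scores)

# Invented topic words and document vocabulary.
coverage = topic_coverage(["chat", "forum", "wiki"], ["chat", "blog", "wiki"])
```

A coefficient close to 1 would indicate that the two methods rank the word pairs similarly, while a value near 0 would suggest they capture different information, which is the situation in which combining methods is most useful.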


Paper Citation


in Harvard Style

Chiru C., Rebedea T. and Ciotec S. (2014). Comparison between LSA-LDA-Lexical Chains. In Proceedings of the 10th International Conference on Web Information Systems and Technologies - Volume 2: WEBIST, ISBN 978-989-758-024-6, pages 255-262. DOI: 10.5220/0004798102550262

in Bibtex Style

@conference{webist14,
author={Costin Chiru and Traian Rebedea and Silvia Ciotec},
title={Comparison between LSA-LDA-Lexical Chains},
booktitle={Proceedings of the 10th International Conference on Web Information Systems and Technologies - Volume 2: WEBIST},
year={2014},
pages={255-262},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004798102550262},
isbn={978-989-758-024-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Web Information Systems and Technologies - Volume 2: WEBIST
TI - Comparison between LSA-LDA-Lexical Chains
SN - 978-989-758-024-6
AU - Chiru C.
AU - Rebedea T.
AU - Ciotec S.
PY - 2014
SP - 255
EP - 262
DO - 10.5220/0004798102550262