Topic Modeling with Word Embeddings

Date

2021

Publisher

université Ghardaia

Abstract

With the rapid growth of digitized content, extracting topics from unlabeled text is not an easy task. It calls for topic modeling, a family of techniques based on unsupervised algorithms. In this thesis, we clarify the concept of topic modeling and its main approaches, such as Latent Dirichlet Allocation (LDA), the Embedded Topic Model (ETM), Gaussian LDA (G-LDA), and LDA with Word2Vec (LDA2Vec). In the experimental work, we empirically compare LDA and ETM on the 20 Newsgroups dataset in terms of runtime and topic coherence. The results favor ETM.
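
To illustrate what a topic coherence score measures, here is a minimal sketch. The abstract does not say which coherence measure the thesis uses, so the choice of NPMI (normalized pointwise mutual information) over document-level co-occurrence is an assumption, as are the function name `npmi_coherence`, the toy corpus, and the conventions for the two edge cases noted in the comments.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Average NPMI over all word pairs in a topic.

    Illustrative only: the measure and the edge-case conventions
    are assumptions, not taken from the thesis.
    """
    if len(topic_words) < 2:
        raise ValueError("need at least two topic words")

    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every given word.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0.0:
            # Assumed convention: words that never co-occur score -1.
            scores.append(-1.0)
        elif p12 == 1.0:
            # Assumed convention: words present in every document score 0
            # (NPMI is undefined here because -log(1) is 0).
            scores.append(0.0)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Toy corpus (hypothetical), already tokenized.
docs = [
    ["space", "nasa", "orbit"],
    ["space", "orbit", "launch"],
    ["hockey", "team", "game"],
    ["team", "game", "season"],
]

coherent_topic = npmi_coherence(["space", "orbit"], docs)
mixed_topic = npmi_coherence(["space", "hockey"], docs)
print(coherent_topic, mixed_topic)
```

In this toy corpus, a topic whose top words always appear together scores higher than one whose words never share a document. A comparison such as the one described above would average a score of this kind over the top words of every topic produced by each model.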

Keywords

topic modeling, topic coherence, Latent Dirichlet Allocation (LDA), Embedded Topic Model (ETM), Gaussian LDA (G-LDA), LDA2Vec.
