Topic Modeling with Word Embeddings
Date
2021
Authors
Publisher
Université Ghardaia
Abstract
With the rapid growth of digitization, extracting topics from information that comes in the
form of unlabeled text is not an easy task. We therefore need topic modeling techniques,
which are based on unsupervised algorithms.
In this thesis, we clarify the concept of topic modeling and its main approaches, such as
Latent Dirichlet Allocation (LDA), the Embedded Topic Model (ETM), Gaussian LDA (G-LDA),
and LDA with Word2Vec (LDA2Vec).
In the experimental work, we empirically compare the LDA and ETM methods on the
20 Newsgroups dataset in terms of runtime and topic coherence. The results favor the
ETM method.
Keywords
topic modeling, topic coherence, Latent Dirichlet Allocation (LDA), Embedded Topic Model (ETM), Gaussian LDA (G-LDA), LDA2Vec.
