M.Sc. Thesis View

Student: Sefa YAY
Supervisor: Asst. Prof. Dr. Tolga BERBER
Department: Statistics and Computer Sciences (Statistics)
Institution: Graduate School of Natural and Applied Sciences
University: Karadeniz Technical University, Turkey
Title of the Thesis: TOPIC MODELING WITH LDA AND NMF IN ENGLISH NEWS TEXTS: THE CASE OF TURKEY AND GREECE
Level: M.Sc.
Acceptance Date: 4/3/2022
Number of Pages: 60
Registration Number: i3988
Summary:

      Any collection that contains text, such as news articles, e-mails, web pages, blogs, movie summaries, and song lyrics, is an application field for text mining, with uses that include news analysis, spam filtering, and topic extraction. Text mining makes it possible to extract information from large text stores in many areas. Topic modeling is a natural language processing technique used to discover the hidden semantic structures of text in a document collection. In this thesis, automatic topic modeling was performed to separate news texts about Turkey and Greece according to their topics. To this end, English news texts obtained from the NewsAPI news data service were analyzed automatically using Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF), and the two methods were compared. The topics obtained for Turkey show that the political agenda is dominated by foreign relations. In the analyses for Greece, the only political agenda identified concerns Greece and Turkey. In the results of both algorithms, different aspects of the pandemic were found to make up the majority of the topics. Thus, previously unknown and potentially useful information was extracted from large text data sources through text mining.

Keywords: Topic Modeling, Text Mining, LDA, NMF, Data Mining
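
As an illustration of one of the two methods named in the summary, the following is a minimal pure-Python sketch of NMF topic extraction using the standard multiplicative update rules. It is not the implementation used in the thesis, which this record does not describe. The four toy documents, the function names (`matmul`, `transpose`, `frobenius_error`, `nmf`), and all parameter values are assumptions made for illustration only. LDA is not shown here.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into non-negative W (docs x topics), H (topics x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps)
              for j in range(n_terms)] for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps)
              for k in range(n_topics)] for i in range(n_docs)]
    return w, h


# Hypothetical toy corpus; the thesis used English news texts from NewsAPI.
docs = [
    "election government minister vote",
    "minister government election parliament",
    "vaccine pandemic hospital virus",
    "virus pandemic vaccine hospital",
]
vocab = sorted({term for doc in docs for term in doc.split()})
# Document-term count matrix: one row per document, one column per term.
v = [[doc.split().count(term) for term in vocab] for doc in docs]

w, h = nmf(v, n_topics=2)
for k, row in enumerate(h):
    # The highest-weighted terms in each row of H describe that topic.
    top_terms = [term for _, term in sorted(zip(row, vocab), reverse=True)[:3]]
    print(f"topic {k}:", top_terms)
```

Each row of `h` weights the vocabulary terms for one topic, and each row of `w` gives a document's weight on each topic. In practice a library implementation with TF-IDF weighting would likely be preferred over this hand-written loop.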