bear2107 / document-clustering-master
An application that begins by gathering synopses of the top 100 films of all time and ends by analyzing the latent topics within each document. In between, it preprocesses the synopses (tokenization, stemming, stopword removal), transforms them into a vector space model (tf-idf), and clusters them into groups (k-means and hierarchical).
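The pipeline described above might look roughly like the following standard-library sketch. It is not the repository's code: the four toy synopses, the short stopword list, the crude suffix-stripping "stemmer", the smoothed idf variant, and all function names are assumptions made for illustration. It covers only the k-means half, not the hierarchical clustering, and a real project would more likely use a proper stemmer and a library implementation.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "with"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords, strip a few suffixes."""
    stems = []
    for tok in re.findall(r"[a-z]+", text.lower()):
        if tok in STOPWORDS:
            continue
        # Crude stand-in for a real stemmer: strip one common suffix.
        for suffix in ("ing", "ed", "s"):
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        stems.append(tok)
    return stems


def tfidf(docs):
    """Return one L2-normalised sparse vector ({term: weight}) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        # Raw term count times a smoothed idf: ln((1 + n) / (1 + df)) + 1.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1)
               for t, c in Counter(toks).items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))


def mean_vector(members):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for vec in members:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(members) for term, weight in total.items()}


def kmeans(vectors, k, iterations=10):
    """Plain k-means, seeded with the first k vectors to stay deterministic."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels


# Invented toy synopses, not taken from the project's data.
docs = [
    "A mafia boss fights rival crime families in the city.",
    "A soldier fights in the war and the army is defeated.",
    "The crime family boss orders a killing in the city.",
    "Soldiers in the army battle across the war front.",
]
labels = kmeans(tfidf(docs), k=2)
print(labels)  # [0, 1, 0, 1]
```

Seeding with the first k documents keeps the example reproducible; on real data a randomized or k-means++ style initialization is the more usual choice, and the number of clusters has to be chosen by the user.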