In this project, "Community Detection from Research Articles", the task is to detect research papers that belong to a common field of research. The project contains the following folders and files:
aan : This folder consists of the AAN (ACL Anthology Network) dataset. We worked on the 2013 release of the dataset.
aan_small : This folder contains a subset of the AAN dataset with ~2000 nodes.
algorithms: This folder contains code for the different algorithms used:
  cosine-kmeans
  jaccard-kmeans
  louvain
  newman-girvan
  newman-girvan-v2 (using a library)
metrics: This folder contains code for community detection using different metrics.
  authorCitation: This metric uses the author citation network to detect communities. We used the louvain and newman-girvan algorithms for this. How to run: refer to Community-Detection/metrics/authorCitation/README.md
  paperCitation: This metric uses the paper citation network to detect communities. We used the louvain and newman-girvan algorithms for this. How to run: refer to Community-Detection/metrics/paperCitation/README.md
  title: This folder contains the code for running the K-means algorithm (using Jaccard and Cosine) on the title metric. How to run: refer to title/README.md
  year: This folder contains the code for running the K-means algorithm (using Jaccard and Cosine) on the year metric. How to run: refer to year/README.md
outputs : This folder contains the outputs from various algorithms along with their respective graphs.
cleanup.sh: Removes any *.pyc files from the project.
CommunityDetectionReport.pdf: This is the project report.
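
To illustrate the two similarity measures named for the title metric (Jaccard and Cosine), here is a minimal sketch in Python. It is not the code from the algorithms folder: the function names, the whitespace tokenizer, and the example titles are all assumptions made for illustration, and the project's own preprocessing may differ.

```python
import math
from collections import Counter


def tokenize(title):
    """Lowercase a title and split it on whitespace (a deliberately naive tokenizer)."""
    return title.lower().split()


def jaccard_similarity(a, b):
    """Size of the shared token set divided by the size of the combined token set."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(a, b):
    """Cosine of the angle between the term-count vectors of two titles."""
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical titles, not taken from the AAN dataset.
t1 = "statistical machine translation with phrases"
t2 = "phrase based statistical machine translation"
t3 = "parsing with neural networks"

print(jaccard_similarity(t1, t2))  # 3 shared tokens out of 7 distinct ones
print(cosine_similarity(t1, t2))   # 3 / (sqrt(5) * sqrt(5))
print(jaccard_similarity(t1, t3))  # 1 shared token out of 8 distinct ones
```

A clustering step such as K-means would then group titles whose pairwise similarity is high; how the project turns these similarities into cluster assignments is described in the READMEs referenced above.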