benedekrozemberczki / labelpropagation

A NetworkX implementation of Label Propagation from a "Near Linear Time Algorithm to Detect Community Structures in Large-Scale Networks" (Physical Review E 2008).

Home Page: https://karateclub.readthedocs.io/

License: GNU General Public License v3.0

Python 100.00%
fast-greedy graph community homophily label-propagation walktrap louvain community-detection graph-partitioning modularity

labelpropagation's Introduction

Label Propagation

A NetworkX implementation of the "Near Linear Time Algorithm to Detect Community Structures in Large-Scale Networks" (Physical Review E 2008).

Abstract

Community detection and analysis is an important methodology for understanding the organization of various real-world networks and has applications in problems as diverse as consensus formation in social communities or the identification of functional modules in biochemical networks. Currently used algorithms that identify the community structures in large-scale real-world networks require a priori information such as the number and sizes of communities or are computationally expensive. In this paper we investigate a simple label propagation algorithm that uses the network structure alone as its guide and requires neither optimization of a pre-defined objective function nor prior information about the communities. In our algorithm every node is initialized with a unique label and at every step each node adopts the label that most of its neighbors currently have. In this iterative process densely connected groups of nodes form a consensus on a unique label to form communities. We validate the algorithm by applying it to networks whose community structures are known. We also demonstrate that the algorithm takes an almost linear time and hence it is computationally less expensive than what was possible so far.
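The update rule described in the abstract can be sketched in a few lines. This is a minimal, unweighted illustration and not the code in this repository: it uses a plain adjacency dictionary instead of a NetworkX graph, and the function name, the `rounds` and `seed` parameters, and the tie-breaking rule (keep the current label when it is among the most frequent, otherwise pick at random) are assumptions made for the example.

```python
import random
from collections import Counter


def label_propagation(adjacency, rounds=30, seed=42):
    """Asynchronous label propagation on an undirected graph.

    adjacency: dict mapping each node to a list of its neighbors.
    Returns a dict mapping each node to its final community label.
    """
    rng = random.Random(seed)
    # Every node starts with its own unique label.
    labels = {node: node for node in adjacency}
    nodes = list(adjacency)
    for _ in range(rounds):
        rng.shuffle(nodes)  # visit the nodes in a new random order each round
        changed = False
        for node in nodes:
            counts = Counter(labels[neighbor] for neighbor in adjacency[node])
            if not counts:
                continue  # an isolated node keeps its own label
            best = max(counts.values())
            candidates = sorted(label for label, count in counts.items() if count == best)
            if labels[node] in candidates:
                continue  # current label is already among the most frequent
            labels[node] = rng.choice(candidates)  # break ties at random
            changed = True
        if not changed:
            break  # every node agrees with the majority of its neighbors
    return labels


# Two disconnected triangles: each one settles on a single shared label.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
labels = label_propagation(two_triangles)
print(len(set(labels.values())))  # 2
```

Stopping as soon as a full round changes nothing mirrors the consensus idea in the abstract; the round limit only guards against graphs where labels keep oscillating.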

The model is now also available in the package Karate Club.

This repository provides an implementation for Label Propagation as described in the paper:

Near Linear Time Algorithm to Detect Community Structures in Large-Scale Networks. Usha Nandini Raghavan, Reka Albert, Soundar Kumara. Physical Review E, 2008. [Paper]

Requirements

The codebase is implemented in Python 3.5.2 (Anaconda 4.2.0, 64-bit). The package versions used for development are listed below.

networkx          2.4
tqdm              4.28.1
numpy             1.15.4
pandas            0.23.4
jsonschema        2.6.0
python-louvain    0.11
texttable         0.15.0

Datasets

The code takes an input graph in a csv file. Every row indicates an edge between two nodes separated by a comma. The first row is a header. Nodes should be indexed starting with 0. A sample graph for the `Facebook Politicians` dataset is included in the `data/` directory.
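To make the expected file layout concrete, here is a small sketch that parses such an edge list with the standard library only. The helper name `read_edge_list` and the header names `node_1,node_2` are made up for illustration; the repository's own loader may differ.

```python
import csv
import io


def read_edge_list(handle):
    """Read (source, target) integer pairs from a CSV that has a header row."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    return [(int(row[0]), int(row[1])) for row in reader if row]


# A hypothetical three-edge file; nodes are indexed from 0.
sample = io.StringIO("node_1,node_2\n0,1\n0,2\n1,2\n")
edges = read_edge_list(sample)
print(edges)  # [(0, 1), (0, 2), (1, 2)]
```

For a file on disk, the same function can be called as `read_edge_list(open(path, newline=""))`.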

Options

Creating a clustering is handled by the src/label_propagation.py script which provides the following command line arguments.

Model options

  --input               STR    Input graph path.                          Default is `data/politician_edges.csv`.
  --assignment-output   STR    Node-cluster assignment dictionary path.   Default is `output/politician.json`.
  --weighting           STR    Weighting strategy.                        Default is `overlap`.
  --rounds              INT    Number of iterations.                      Default is 30.
  --seed                INT    Initial seed.                              Default is 42.
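For readers who want to see how these options map onto `argparse`, the following is a sketch of an equivalent parser. It is reconstructed from the options listed here, not copied from the repository; `build_parser` is a hypothetical name, and the flag is spelled `--weighting`, matching the example commands.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented options and defaults."""
    parser = argparse.ArgumentParser(description="Run label propagation.")
    parser.add_argument("--input", type=str, default="data/politician_edges.csv",
                        help="Input graph path.")
    parser.add_argument("--assignment-output", type=str, default="output/politician.json",
                        help="Node-cluster assignment dictionary path.")
    parser.add_argument("--weighting", type=str, default="overlap",
                        help="Weighting strategy.")
    parser.add_argument("--rounds", type=int, default=30,
                        help="Number of iterations.")
    parser.add_argument("--seed", type=int, default=42,
                        help="Initial seed.")
    return parser


# Parse an explicit argument list instead of sys.argv for the demonstration.
args = build_parser().parse_args(["--rounds", "100"])
print(args.rounds, args.seed, args.assignment_output)
```

Note that `argparse` converts the dash in `--assignment-output` to an underscore, so the value is read as `args.assignment_output`.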

Examples

The following commands create cluster assignments and write them to disk.

Creating communities for the default dataset with the default hyperparameter settings.

$ python src/label_propagation.py

Using unit weighted label propagation.

$ python src/label_propagation.py --weighting unit
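The difference between the two weighting strategies can be illustrated with a short sketch. It assumes that `overlap` weights an edge by the number of neighbors its two endpoints share and that `unit` gives every edge the same weight; these definitions, the plain adjacency dictionary, and the helper `edge_weights` are assumptions for the example, so check the repository's helper module for the actual formulas.

```python
def unit(adjacency, node_1, node_2):
    """Every edge counts equally."""
    return 1.0


def overlap(adjacency, node_1, node_2):
    """Weight an edge by the number of neighbors its endpoints share (assumed definition)."""
    return float(len(set(adjacency[node_1]) & set(adjacency[node_2])))


def edge_weights(metric, adjacency):
    """Compute a weight for every edge, stored in both directions."""
    return {(u, v): metric(adjacency, u, v) for u in adjacency for v in adjacency[u]}


# A triangle (0, 1, 2) with a tail node 3 attached to node 2.
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(edge_weights(overlap, adjacency)[(0, 1)])  # 1.0, nodes 0 and 1 share neighbor 2
print(edge_weights(overlap, adjacency)[(2, 3)])  # 0.0, the tail edge has no shared neighbor
print(edge_weights(unit, adjacency)[(2, 3)])     # 1.0
```

Under this assumed definition, edges inside dense groups carry more influence than bridge-like edges, while unit weighting reduces to plain majority voting.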

Changing the random seed.

$ python src/label_propagation.py --seed 32

Using label propagation with 100 iteration rounds.

$ python src/label_propagation.py --rounds 100
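Once an assignment file has been written, it can be turned into a list of members per community. The sketch below assumes the output is a JSON object that maps node ids (stored as strings, as JSON requires for keys) to cluster labels; the sample data and the helper name `group_by_cluster` are invented for illustration.

```python
import json
from collections import defaultdict


def group_by_cluster(assignment):
    """Invert a node -> cluster mapping into cluster -> sorted list of nodes."""
    clusters = defaultdict(list)
    for node, label in assignment.items():
        clusters[label].append(int(node))
    return {label: sorted(members) for label, members in clusters.items()}


# Made-up assignment; in practice this string would come from the output file.
raw = '{"0": 2, "1": 2, "2": 5}'
clusters = group_by_cluster(json.loads(raw))
print(clusters)  # {2: [0, 1], 5: [2]}
```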

labelpropagation's Issues

Traceback (most recent call last):
  File "label_propagation.py", line 17, in <module>
    create_and_run_model(args)
  File "label_propagation.py", line 12, in create_and_run_model
    model = LabelPropagator(graph,args)
  File "/home/rishika/Downloads/LabelPropagation-master/src/model.py", line 27, in __init__
    self.weight_setup(args.weighting)
  File "/home/rishika/Downloads/LabelPropagation-master/src/model.py", line 35, in weight_setup
    self.weights = overlap_generator(overlap, self.graph)
  File "/home/rishika/Downloads/LabelPropagation-master/src/calculation_helper.py", line 53, in overlap_generator
    edges = edges +[(edge[1], edge[0]) for edge in edges]
TypeError: unsupported operand type(s) for +: 'EdgeView' and 'list'

please help with the above issue

Ground truth

Hi, thank you for this amazing source code. I am using it for my college project.
But I need the ground truth for this dataset. Could you send it to me, please? 😢😥

exec error

When I run: python3 src/label_propagation.py
Result: ImportError: cannot import name 'modularity' from 'community'

The packages modularity and community were both installed beforehand.
How can I fix this?

TypeError: unsupported operand type(s) for +: 'EdgeView' and 'list'

Hi,
I had this error with a fresh install, using your politician_edges.csv as a testing example.

(env) mwon@mwon:/disk2/MP2Vec/LabelPropagation/src$ python label_propagation.py --input ../data/politician_edges.csv --seed 10 
+-------------------+------------------------------+
|     Parameter     |            Value             |
+===================+==============================+
| Assignment output | ./output/politician.json     |
+-------------------+------------------------------+
| Input             | ../data/politician_edges.csv |
+-------------------+------------------------------+
| Rounds            | 30                           |
+-------------------+------------------------------+
| Seed              | 10                           |
+-------------------+------------------------------+
| Weighting         | overlap                      |
+-------------------+------------------------------+
Traceback (most recent call last):
  File "label_propagation.py", line 17, in <module>
    create_and_run_model(args)
  File "label_propagation.py", line 11, in create_and_run_model
    model = LabelPropagator(graph, args)
  File "/disk2/MP2Vec/LabelPropagation/src/model.py", line 27, in __init__
    self.weight_setup(args.weighting)
  File "/disk2/MP2Vec/LabelPropagation/src/model.py", line 35, in weight_setup
    self.weights  = overlap_generator(overlap, self.graph)
  File "/disk2/MP2Vec/LabelPropagation/src/calculation_helper.py", line 53, in overlap_generator
    edges = edges + [(edge[1], edge[0]) for edge in edges]
TypeError: unsupported operand type(s) for +: 'EdgeView' and 'list'
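The traceback shows the `+` operator being applied to a NetworkX `EdgeView` and a list. In NetworkX 2.x, `Graph.edges()` returns a view object rather than a list, and views do not support list concatenation. A likely fix, sketched here as an assumption and not as the maintainer's patch, is to convert the edges to a list before adding the reversed pairs. The helper name `with_reverse_edges` is hypothetical, and a dictionary key view stands in for the `EdgeView` so the example runs without NetworkX.

```python
def with_reverse_edges(edges):
    """Return every (u, v) edge followed by its reverse (v, u)."""
    edges = list(edges)  # materialise views or generators so that `+` works
    return edges + [(v, u) for u, v in edges]


# Stand-in for an EdgeView: like it, a dict key view cannot be added to a list.
edge_view = {(0, 1): None, (1, 2): None}.keys()
print(with_reverse_edges(edge_view))  # [(0, 1), (1, 2), (1, 0), (2, 1)]
```

With a real graph the call would be `with_reverse_edges(graph.edges())`.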

Readme has error

Hi, I have been following you for a long time, and your code is perfect.
However, the README contains this command:
python src/embedding_clustering.py --rounds 100
Maybe you copied this command from GEMSEC?
