Git Product home page Git Product logo

random_walk_controversy's Introduction

Random Walk Controversy

Tests

This repo contains a parallel implementation of Kiran Garimella's algorithm to compute the random walk controversy score of a graph.

Let G(V,E) be a graph with two partitions X and Y; in the paper "Quantifying Controversy on Social Media" Garimella et al. define the Random Walk Controversy (RWC) measure as follows: "Consider two random walks, one ending in partition X and one ending in partition Y , RWC is the difference of the probabilities of two events: (i) both random walks started from the partition they ended in and (ii) both random walks started in a partition other than the one they ended in.โ€.
The measure is quantified as $RWC = P_{XX}P_{YY} - P_{YX}P_{XY}$ where $P_{AB}$ is the conditional probability $P_{AB} = Pr[\mbox{start in partition }A \mid \mbox{end in partition }B]$.

Since the probabilities are computed by making simulations consisting in random walks and the simulations are independent of each other, they can easily be done in parallel.
The following table shows the performance and results comparison made between the (sequential) implementation provided by one of the original authors and this implementation. The datasets can be found in the same repo as the original authors' implementation.

Dataset Seq. time (s) Seq. RWC score Par. time (s) Par. RWC score Speedup
baltimore 618 0.869 71 0.872 8.70
beefban 128 0.873 19 0.882 6.73
gunsense 2232 0.851 221 0.853 10.10
indiana 229 0.720 37 0.727 6.19
indiasdaughter 490 0.825 62 0.832 7.90

Beside the evident speedup obtained by exploiting a multicore architecture, we can see that the sequential and parallel versions almost converge to the same results.

Installation

To install the latest version of the library just download (or clone) the current project, open a terminal and run the following commands:

pip install -r requirements.txt
pip install .

Alternatively use pip

pip install random-walk-controversy

Usage

Command line interface

python3 -m random_walk_controversy [-h] [-v] [-l] edgelist community1_nodelist community2_nodelist percent n

More info about the parameters can be fetched by using the -h option.
The option -v can be used to increase output verbosity and print, alongside the rwc score, the statistics about random walks. If not specified, only the RWC score is printed out.
Finally, the -l option displays a log on the terminal everytime a simulation is completed; I have found this option pretty usefully since it allows to estimate the time for the algorithm to complete and understand if the algorithm got stuck.

Python library

After the installation, it is possible to compute the rwc score directly in the python interpreter by using the function get_rwc inside the random_walk_controversy package.

Example

>>> from random_walk_controversy import get_rwc
>>> graph = read_edgelist()
>>> side1_nodes = read_nodelist()  # list of nodes belonging to partition 1
>>> side2_nodes = read_nodelist()  # list of nodes belonging to partition 2
>>> node_percentage = 0.3
>>> number_simulations = 1000
>>> get_rwc(graph, side1_nodes, side2_nodes, node_percentage, number_simulations)
76.233

random_walk_controversy's People

Contributors

bendico765 avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar  avatar

Forkers

giuliorossetti

random_walk_controversy's Issues

MultiDiGraph doesn't work

Hi,
I'm trying to use your algorithm with a NetworkX MultiDiGraph but I obtain the following error:

MultiDiGraph with 97 nodes and 506 edges
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
line 256, in process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
line 63, in perform_simulation
side = perform_random_walk(g, start_node, other_nodes, user_nodes_side2) # get the side we endend after the random walk
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
line 22, in perform_random_walk
current_node = random.choice([
for _ in g.neighbors(current_node)])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
line 370, in choice
raise IndexError('Cannot choose from an empty sequence')
IndexError: Cannot choose from an empty sequence
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
line 48, in
rwc_score = get_rwc(G, republican_nodes, democratic_nodes, node_percentage, number_simulations)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
line 124, in get_rwc
simulation_frequencies = future.result()
^^^^^^^^^^^^^^^
line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
line 401, in __get_result
raise self._exception
IndexError: Cannot choose from an empty sequence

Process finished with exit code 1

Do you know how can I solve?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.