Git Product home page Git Product logo

risgraph's Introduction

RisGraph

Description

This is the open-source implementation for RisGraph:

RisGraph: A Real-Time Streaming System for Evolving Graphs to Support Sub-millisecond Per-update Analysis at Millions Ops/s.
Guanyu Feng, Zixuan Ma, Daixuan Li, Shengqi Chen, Xiaowei Zhu, Wentao Han, Wenguang Chen.
SIGMOD/PODS '21: Proceedings of the 2021 International Conference on Management of Data. https://doi.org/10.1145/3448016.3457263

System Dependency

Compilation

git clone https://github.com/thu-pacman/RisGraph.git --recursive
cd RisGraph
mkdir -p build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j

Preprocessing

The graph format used by RisGraph is binary edge lists with 64-bit vertex IDs (in host byte ordering). RisGraph provides some tools that converts edge lists in text format (such as SNAP datasets) to binary format.

RisGraph simulates sliding windows by inserting and deleting edges based on the sequence of edges from the input dataset.

Sorting edges based on timestamps

# when edges are timestamped
# each line of text_graph_path is an edge with three integers
# source_vertex_id destination_vertex_id edge_timestamp

./convert_to_binary_timestamp < text_graph_path > binary_graph_path

Randomly shuffling edges

# when edges have no specific order (for most of public datasets)
# each line of text_graph_path is an edge with two integers
# source_vertex_id destination_vertex_id

./convert_to_binary_random < text_graph_path > binary_graph_path

Keeping the input ordering

# when edges are already in chronological or custom order
# each line of text_graph_path is an edge with two integers
# source_vertex_id destination_vertex_id

./convert_to_binary < text_graph_path > binary_graph_path

Entire Graph Processing

These applications will process the entire graph.

Breadth-First Search

./bfs binary_graph_path root

Single Source Shortest Path

./sssp binary_graph_path root

Single Source Widest Path

./sswp binary_graph_path root

Weakly Connected Components

# edges are treated as undirected edges
./wcc binary_graph_path

Incremental Processing

These applications will load the first initial_edges_percent edges from the graph, insert and delete an edge as an update (simulating sliding windows), and incrementally process the algorithm.

Breadth-First Search

./bfs_inc binary_graph_path root initial_edges_percent

Single Source Shortest Path

./sssp_inc binary_graph_path root initial_edges_percent

Single Source Widest Path

./sswp_inc binary_graph_path root initial_edges_percent

Weakly Connected Components

./wcc_inc binary_graph_path initial_edges_percent

Incremental Processing with Safe/Unsafe Classification and Latency-aware Scheduler

These applications will load the first initial_edges_percent edges from the graph and simulate num_of_clients clients requesting updates with inserting/deleting an edge (simulating sliding windows).

RisGraph enables safe/unsafe classification for incrementally processing and tries to make tail_latency_percent updates are within target_tail_latency milliseconds through the latency-aware scheduler. It is recommended to set num_of_clients starting with the number of physical cores and try doubling num_of_clients until RisGraph cannot fulfill the expected tail latency.

Breadth-First Search

./bfs_inc_rt binary_graph_path root initial_edges_percent target_tail_latency tail_latency_percent num_of_clients

Single Source Shortest Path

./sssp_inc_rt binary_graph_path root initial_edges_percent target_tail_latency tail_latency_percent num_of_clients

Single Source Widest Path

./sswp_inc_rt binary_graph_path root initial_edges_percent target_tail_latency tail_latency_percent num_of_clients

Weakly Connected Components

./wcc_inc_rt binary_graph_path initial_edges_percent target_tail_latency tail_latency_percent num_of_clients

Incremental Processing with Batched Updates

These applications will load the first initial_edges_percent edges from the graph, insert and delete batch_size edges as a batched update (simulating sliding windows), and incrementally process the algorithm.

Breadth-First Search

./bfs_inc_batch binary_graph_path root initial_edges_percent batch_size

Single Source Shortest Path

./sssp_inc_batch binary_graph_path root initial_edges_percent batch_size

Single Source Widest Path

./sswp_inc_batch binary_graph_path root initial_edges_percent batch_size

Weakly Connected Components

./wcc_inc_batch binary_graph_path initial_edges_percent batch_size

Performance Tuning

  • Binding threads to CPU cores: export OMP_PROC_BIND=true
  • Trying different NUMA policies: In our experiments, numactl --preferred=0 achieves relatively good performance in a wide range of cases.

risgraph's People

Contributors

jiguanglizipao avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

risgraph's Issues

installation instructions

Dear author, I have been studying your code for a month, but I still can't run your code. Could you give me a more detailed installation instructions ?Thank you very much.

Abseil-cpp error in compilation

I build RisGraph with gcc-11.4.0, and got error:

/home/ldeng/related_works/RisGraph/deps/abseil-cpp/absl/synchronization/internal/graphcycles.cc: I
n member function ‘void absl::lts_2020_02_25::synchronization_internal::GraphCycles::RemoveNode(vo
id*)’:
/home/ldeng/related_works/RisGraph/deps/abseil-cpp/absl/synchronization/internal/graphcycles.cc:45
1:26: error: ‘numeric_limits’ is not a member of ‘std’
  451 |   if (x->version == std::numeric_limits<uint32_t>::max()) {
      |                          ^~~~~~~~~~~~~~
/home/ldeng/related_works/RisGraph/deps/abseil-cpp/absl/synchronization/internal/graphcycles.cc:45
1:49: error: expected primary-expression before ‘>’ token
  451 |   if (x->version == std::numeric_limits<uint32_t>::max()) {
      |                                                 ^
/home/ldeng/related_works/RisGraph/deps/abseil-cpp/absl/synchronization/internal/graphcycles.cc:45
1:52: error: ‘::max’ has not been declared; did you mean ‘std::max’?
  451 |   if (x->version == std::numeric_limits<uint32_t>::max()) {
      |                                                    ^~~
      |                                                    std::max
In file included from /usr/include/c++/11/algorithm:62,
                 from /home/ldeng/related_works/RisGraph/deps/abseil-cpp/absl/synchronization/inte
rnal/graphcycles.cc:38:
/usr/include/c++/11/bits/stl_algo.h:3467:5: note: ‘std::max’ declared here
 3467 |     max(initializer_list<_Tp> __l, _Compare __comp)
      |     ^~~
make[2]: *** [deps/abseil-cpp/absl/synchronization/CMakeFiles/absl_graphcycles_internal.dir/build.
make:76: deps/abseil-cpp/absl/synchronization/CMakeFiles/absl_graphcycles_internal.dir/internal/gr
aphcycles.cc.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:3633: deps/abseil-cpp/absl/synchronization/CMakeFiles/absl_grap
hcycles_internal.dir/all] Error 2

It seems to be a bug in old version abseil. I manully clone new LTS version of abseil-cpp and compile successfully. Maybe the abseil version that the repository depends on should be updated.

In addition, RisGraph also depends on libaio and boost::fiber (which in also depends on atomic, context, and filesystem in boost), maybe it should also be mentioned in the Readme.

About README

Dear auther, I now have the dependencies installed, but I don't know how to run your code, can you fill in the details in the README, thank you very much!

How does RisGraph implement versioning (snapshot) ?

Dear authors,

Thanks for releasing the source code of the RisGraph paper. I am trying to figure out how the versioning (snapshot) is implemented in RisGraph but cannot find out where this logic is in the source code. I'd be appreciated if you could point out where this logic locates in the source code. Thanks!

conver_to_binary_timestamp throws bad_alloc execption

I encountered the following memory allocation issues when trying to use the tool conver_to_binary_timestamp with the following command:

./convert_to_binary_timestamp < text_graph_file_path > binary_graph_file_path

However, after it runs for a while, it throws an error of

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

I'm using the following dataset from SNAP, and I believe the size is not that big with only few megabytes.

https://snap.stanford.edu/data/soc-Epinions1.html

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.