root-11 / graph-theory

A simple graph library

License: MIT License

Python 100.00%

assignment-problem flow-problem graph graph-algorithms graph-library graph-theory graphs minimum-spanning-trees shortest-path topological-sort tsp-solver

graph-theory's Issues

Missing path in all_paths

Given the following graph:

g = graph.Graph()
g.add_node('a')
g.add_node('b')
g.add_node('c')
g.add_node('d')
g.add_node('e')

g.add_edge('a', 'b', bidirectional=True)
g.add_edge('b', 'c', bidirectional=True)
g.add_edge('b', 'd', bidirectional=True)
g.add_edge('c', 'd', bidirectional=True)
g.add_edge('c', 'e', bidirectional=True)
g.add_edge('d', 'e', bidirectional=True)

A call to graph.all_paths(g, 'e', 'a') returns:
[['e', 'c', 'b', 'a'], ['e', 'd', 'b', 'a'], ['e', 'c', 'd', 'b', 'a']]

This is missing ['e', 'd', 'c', 'b', 'a']
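For reference, the expected result can be checked independently of the library. The following is a minimal sketch, not graph-theory's own `all_paths`: it enumerates simple paths with a depth-first search over a plain adjacency dict (the helper name `all_simple_paths` is made up for this illustration).

```python
def all_simple_paths(adjacency, start, end):
    """Return every simple path from start to end via depth-first search."""
    paths = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            paths.append(path)
            continue
        for neighbour in adjacency.get(node, ()):
            if neighbour not in path:  # never revisit a node on the current path
                stack.append((neighbour, path + [neighbour]))
    return paths


# Same bidirectional edges as in the report above.
edges = [('a', 'b'), ('b', 'c'), ('b', 'd'), ('c', 'd'), ('c', 'e'), ('d', 'e')]
adjacency = {}
for n1, n2 in edges:
    adjacency.setdefault(n1, []).append(n2)
    adjacency.setdefault(n2, []).append(n1)

paths = all_simple_paths(adjacency, 'e', 'a')
for path in sorted(paths):
    print(path)
```

On this graph the sketch finds four paths, including the one reported as missing.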

Scheduling problems to be solved.

Scheduling
- employee scheduling problem (https://en.wikipedia.org/wiki/Nurse_scheduling_problem)
- job shop scheduling problem
- load planning problem

is_connected errors if n1 has no edges

It raises a KeyError at

graph-theory/graph/__init__.py

Line 307 in c46ebd3

for c in self._edges[n]:

visuals.plot_2d ignores boolean flags

The function just needs to nest those parts of the code inside if statements as is done in plot_3d.

Detection of convex hull on Graph3d

Detect convex hull on XY and XYZ graph. Example of xy graph is available here: https://root-11.github.io/content/calculating-the-convex-hull/index.html

Implementation of cluster analysis

Description: Cluster analysis (https://en.wikipedia.org/wiki/Cluster_analysis)

Add reference implementation of Route inspection problem

https://en.wikipedia.org/wiki/Route_inspection_problem

Proposal: Throw matplotlib out from requirements.

The package is larger than all of GT.
How about we throw it out? E.g. as optional install?

Simplex Network Algorithm for min cost flow problem

Implementation of the network simplex algorithm for the min cost flow problem.

Deleting a node doesn't fully eliminate it from the graph

It appears that when you delete a node, certain traces of it are left behind in the graph's internal bookkeeping structures.

Consider this simple graph:

g = graph.Graph()
g.add_edge(1, 3)
g.add_edge(2, 3)

If we now try to do a topological sort on it by iteratively finding the nodes with in_degree=0:

the_sort = []

zero_in = g.nodes(in_degree=0)
while zero_in:
    for n in zero_in:
        the_sort.append(n)
        g.del_node(n)
    zero_in = g.nodes(in_degree=0)
    print(f"remaining nodes: {len(g.nodes())},"
          f" in_degree=0 nodes: {len(g.nodes(in_degree=0))}")

...this loop will never terminate. The remaining nodes will be reported as 0 (len(g.nodes())), however there will
remain nodes in the graph with in_degree=0 (len(g.nodes(in_degree=0))). When looking at the code, it appears to
be a problem with the accounting done in del_edge(); self._in_degree[node2] is decremented, but the dict entry
probably should be deleted if self._out_degree[node2] is zero as well. However, since del_edge() is a public method,
it might be the case that having a floating node with no inbound or outbound edges is totally acceptable, in which case
you might need something else to decide when to actually get rid of the entry for a deleted node in these two dicts.
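As a workaround that does not depend on how the library tracks degrees, a topological sort can keep its own in-degree counters and leave the graph untouched. This is a minimal sketch of Kahn's algorithm on a plain edge list; it is an illustration only and makes no claim about how graph-theory implements its own sort.

```python
from collections import deque


def topological_sort(edges):
    """Kahn's algorithm on a list of (source, target) pairs."""
    successors = {}
    in_degree = {}
    for source, target in edges:
        successors.setdefault(source, []).append(target)
        successors.setdefault(target, [])
        in_degree.setdefault(source, 0)
        in_degree[target] = in_degree.get(target, 0) + 1

    # Start with every node that has no inbound edges.
    ready = deque(n for n, d in in_degree.items() if d == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            in_degree[nxt] -= 1          # "remove" the edge node -> nxt
            if in_degree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(in_degree):
        raise ValueError("graph has a cycle")
    return order


order = topological_sort([(1, 3), (2, 3)])
print(order)
```

Because the counters live in a local dict, the loop terminates once every node has been emitted, regardless of what remains in the graph's internal bookkeeping.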

`eq` (equality) can return false for two identical graphs if an edge was added and removed in one of them

If one graph had an edge added and then removed again, comparing it to an identical graph can return false. This occurs if the added and removed edge is outgoing from a node that had no outgoing edges before.

To reproduce

Python 3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)] on win32
>>> from graph import Graph
>>> g1 = Graph(from_list=[(1,2),(2,3)])
>>> g2 = g1.copy()
>>> g1 == g2
True
>>> g1.add_edge(3,1)
>>> g1.del_edge(3,1)
>>> g1 == g2
False
>>> g1.edges() == g2.edges()
True
>>> g1.nodes() == g2.nodes()
True

Cause

As far as I can tell, this bug occurs because __eq__ compares the two graphs' private edge variables _edges instead of calling the public edge getter edges().
When an edge from a node that did not have any outgoing edges before is added and removed, that node remains as a key in the graph's internal edge variable _edges. Two dictionaries are unequal if one has a key the other has not, even if all shared values are equal, thus the two graphs are unequal.
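The effect can be shown with plain dictionaries. This is a sketch under the assumption that `_edges` is a nested dict of the form `{node1: {node2: weight}}`; the helper `normalized` is hypothetical and only illustrates one possible comparison that ignores empty entries.

```python
def normalized(edges):
    """Drop source nodes whose outgoing-edge dict is empty."""
    return {n1: dict(targets) for n1, targets in edges.items() if targets}


edges_g1 = {1: {2: 1}, 2: {3: 1}, 3: {}}   # edge 3 -> 1 was added, then removed
edges_g2 = {1: {2: 1}, 2: {3: 1}}

raw_equal = edges_g1 == edges_g2
normalized_equal = normalized(edges_g1) == normalized(edges_g2)
print(raw_equal, normalized_equal)
```

Nodes without outgoing edges drop out of the normalized view, so a comparison along these lines would still need to compare the node sets separately.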

Other examples

Because copy() uses the public edge getter edges(), this bug can lead to some very confusing behavior:

Python 3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)] on win32
>>> from graph import Graph
>>> g1 = Graph(from_list=[(1,2),(2,3)])
>>> g1.add_edge(3,1)
>>> g1.del_edge(3,1)
>>> g2 = g1.copy()
>>> g1 == g2
False

Additional information

I'm using version 2023.7.5 of graph-theory with Python 3.10 on Windows.

Move tests to pytest for py-3.10 and use github actions instead of travis.

`eq` (equality) can in some cases return false for two identical graphs

In some cases the private _edges of a graph can gain an empty dict as an entry. This leads to incorrect behavior when comparing graphs.
This bug is very similar to #40.

To reproduce

Python 3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)] on win32
>>> from graph import Graph
>>> g1 = Graph(from_list=[(1,2),(2,3)])
>>> g2 = Graph(from_list=[(1,2),(2,3)])
>>> g1.edge(3,1)
>>> g1 == g2
False
>>> g1._edges
defaultdict(<class 'dict'>, {1: {2: 1}, 2: {3: 1}, 3: {}})
>>> g2._edges
defaultdict(<class 'dict'>, {1: {2: 1}, 2: {3: 1}})

Cause

The equality function of graphs compares the private variable _edges of both graphs and returns false if they are not equal.
The variable _edges can in some cases be accessed with keys that don't exist. Since it is a defaultdict(dict), an empty dict is created for that key in those cases. If that happens with only one of two identical graphs, they are no longer equal.
Accessing _edges with nonexistent keys, and thus creating a new entry, can occur when edge(node1, node2) is called with a node1 without outgoing edges, or sometimes when is_connected(n1, n2) is called, depending on the graph. Those are the two examples I found; there could be more.
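The key-creating read is standard `defaultdict` behaviour and can be reproduced without the library. A minimal sketch, assuming the same `defaultdict(dict)` layout as shown above: `.get()` does not trigger the default factory, while indexing with `[]` does.

```python
from collections import defaultdict

edges = defaultdict(dict)
edges[1][2] = 1
edges[2][3] = 1

keys_before = sorted(edges)
weight = edges.get(3, {}).get(1)   # read-only lookup, no key is created
keys_after_get = sorted(edges)
_ = edges[3]                       # plain indexing inserts an empty dict for 3
keys_after_index = sorted(edges)

print(keys_before, keys_after_get, keys_after_index)
```

Read paths that use `.get()` or an `in` check would leave `_edges` unchanged for missing nodes.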

Additional information

I'm using version 2023.7.6 of graph-theory with Python 3.10 on Windows.

Example of page rank algorithm

Implement example of the page rank algorithm with graph.

Add example that generates a radix tree

https://en.wikipedia.org/wiki/Radix_tree

Automatic conversion of a column of text to a radix tree would be a neat helper for many cases, in particular because a radix tree is quick to update in contrast to a b-tree and can therefore be useful as an index for a table that changes quickly.

Vehicle routing problems

Implementation of the vehicle routing problem (VRP) is requested.

is_connected can probably be merged with breadth_first_search

In the performance-agnostic case, is_connected(graph, start, end) is just bool(breadth_first_search(graph, start, end)), though BFS could optimize a few things further by e.g. taking an argument to control whether the found path gets reconstructed or not.
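The idea can be sketched on a plain adjacency dict. The `reconstruct` parameter below is a hypothetical name for the argument suggested above, and the functions are stand-ins rather than the library's own signatures.

```python
from collections import deque


def breadth_first_search(adjacency, start, end, reconstruct=True):
    """Return a fewest-edges path from start to end, or [] if unreachable.

    With reconstruct=False only True/False is returned and the walk back
    through the parent links is skipped.
    """
    parent = {start: None}               # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            if not reconstruct:
                return True
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in adjacency.get(node, ()):   # .get: nodes without edges are fine
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return [] if reconstruct else False


def is_connected(adjacency, start, end):
    return bool(breadth_first_search(adjacency, start, end, reconstruct=False))


adjacency = {1: [2], 2: [3], 4: [1]}
print(breadth_first_search(adjacency, 1, 3), is_connected(adjacency, 3, 1))
```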

branch and bound algorithm for TSP isn't solid.

To recreate:

import time

from graph.random import random_xy_graph  # assumed import path for this helper


def test_random_graph_3_bnb():
    for i in range(8, 15):
        d = None
        for j in range(3):
            g = random_xy_graph(i, x_max=800, y_max=400)  # a fully connected graph.
            start = time.process_time()
            d1, t1 = g.solve_tsp('bnb')  # tsp_branch_and_bound(g)
            d2, t2 = g.solve_tsp('greedy')  # tsp_greedy(g)
            # branch and bound should never be worse than the greedy heuristic
            assert d1 <= d2, (d1, d2, g.edges())
            if d is None:
                d = d1
            else:
                assert d == d1, (d, d1)

            end = time.process_time()
            print(i, j, end-start)

Traceback (most recent call last):
  File "C:/Users/madsenbj/AppData/Roaming/JetBrains/PyCharm2020.2/scratches/scratch_7.py", line 54, in <module>
    test_random_graph_3_bnb()
  File "C:/Users/madsenbj/AppData/Roaming/JetBrains/PyCharm2020.2/scratches/scratch_7.py", line 42, in test_random_graph_3_bnb
    assert d1 <= d2, (d1, d2, g.edges())
AssertionError: (2293.897719652855, 2004.6354644817718, [((655, 58), (559, 45), 96.87620966986684), ((655, 58), (229, 72), 426.2299848673249), ((655, 58), (26, 380), 706.6293229126569), ((655, 58), (693, 380), 324.2344830520036), ((655, 58), (605, 217), 166.67633305301626), ((655, 58), (755, 53), 100.12492197250393), ((655, 58), (282, 126), 379.14772846477666), ((559, 45), (26, 380), 629.5347488423495), ((559, 45), (605, 217), 178.04493814764857), ((559, 45), (693, 380), 360.8060420780118), ((559, 45), (655, 58), 96.87620966986684), ((559, 45), (229, 72), 331.1027030998086), ((559, 45), (282, 126), 288.60006930006097), ((559, 45), (755, 53), 196.16319736382766), ((26, 380), (693, 380), 667.0), ((26, 380), (559, 45), 629.5347488423495), ((26, 380), (229, 72), 368.8807395351511), ((26, 380), (655, 58), 706.6293229126569), ((26, 380), (755, 53), 798.9806005154318), ((26, 380), (282, 126), 360.6272313622475), ((26, 380), (605, 217), 601.5064421932653), ((693, 380), (755, 53), 332.8257802514703), ((693, 380), (605, 217), 185.23768515072737), ((693, 380), (229, 72), 556.9201019895044), ((693, 380), (559, 45), 360.8060420780118), ((693, 380), (282, 126), 483.15318481823135), ((693, 380), (655, 58), 324.2344830520036), ((693, 380), (26, 380), 667.0), ((229, 72), (605, 217), 402.9900743194552), ((229, 72), (755, 53), 526.3430440311718), ((229, 72), (26, 380), 368.8807395351511), ((229, 72), (693, 380), 556.9201019895044), ((229, 72), (655, 58), 426.2299848673249), ((229, 72), (559, 45), 331.1027030998086), ((229, 72), (282, 126), 75.66372975210778), ((605, 217), (655, 58), 166.67633305301626), ((605, 217), (229, 72), 402.9900743194552), ((605, 217), (26, 380), 601.5064421932653), ((605, 217), (755, 53), 222.25210910135362), ((605, 217), (559, 45), 178.04493814764857), ((605, 217), (282, 126), 335.574134879314), ((605, 217), (693, 380), 185.23768515072737), ((755, 53), (229, 72), 526.3430440311718), ((755, 53), (693, 380), 332.8257802514703), ((755, 53), (559, 45), 
196.16319736382766), ((755, 53), (605, 217), 222.25210910135362), ((755, 53), (655, 58), 100.12492197250393), ((755, 53), (282, 126), 478.60004178854814), ((755, 53), (26, 380), 798.9806005154318), ((282, 126), (26, 380), 360.6272313622475), ((282, 126), (229, 72), 75.66372975210778), ((282, 126), (755, 53), 478.60004178854814), ((282, 126), (693, 380), 483.15318481823135), ((282, 126), (655, 58), 379.14772846477666), ((282, 126), (559, 45), 288.60006930006097), ((282, 126), (605, 217), 335.574134879314)])

Feature: Option: Add regions to graphs

Minimal illustration of the problem

(classic graph vs multi-graph)

Assume G is a binary tree with a root and 2 levels of bifurcation resulting in $2^{2}$ leaves with randomized weights on the edges.

Assume that all search starts at the root and ends by identifying the route to a leaf using BFS to determine the shortest path.

Problem: Due to the symmetric nature of the graph, shortest path BFS will practically visit every node every time a search is performed.

Proposition 1: If (!) G is redesigned such that the graph holds information about what can be found below each bifurcation point, only 10 nodes need to be visited. This is ideal from a search perspective, but the memory overhead is problematic as it requires the graph to store all leaves at all bifurcation levels: ~10x more memory. A second problem with this approach is that it only works for DAGs.

Proposition 2: If a partition of G can be declared as another graph G', and BFS and shortest-path search can query G' as to whether or not it contains or has a route to the target node, then the search can be accelerated:

- If the target node is in G' and BFS sees G' as a single node in G, then the destination node has been found.
- If the target node is NOT in G', BFS can eliminate the search through G' altogether.

For the binary tree this means that with G defined as $G_{1}' + G_{2}' = G_{1.1}' + G_{1.2}' + G_{2.1}' + G_{2.2}' + \dots$, a BFS or shortest-path search will require only $2*10$ recursive queries akin to "is target in G'".

The reason for 2*10 is because at each recursive step the binary partition will have at least one failure.
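The query count can be illustrated with a small sketch. It assumes ten levels of bifurcation (as the figures 10 and $2*10$ suggest) and models each partition G' as a contiguous range of leaf numbers; both are assumptions made for this illustration, and `locate` is a hypothetical helper, not part of the library.

```python
def locate(lo, hi, target):
    """Descend a binary partition of the leaves [lo, hi) towards target.

    Returns (leaf, queries), where queries counts "is target in G'?" checks.
    """
    queries = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        queries += 1
        if target in range(lo, mid):          # query the first partition
            hi = mid
        else:
            queries += 1                      # first query failed, ask the second
            if target not in range(mid, hi):
                raise ValueError("target is not in this graph")
            lo = mid
    return lo, queries


levels = 10
leaves = 2 ** levels
counts = [locate(0, leaves, t)[1] for t in range(leaves)]
print(min(counts), max(counts))
```

In this model the worst case is two queries per level (the first partition asked always fails), which gives the $2*10$ bound; the best case is one query per level.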

Edge cases:

For non-trees, such as road networks, which may be partitioned using the "AA", "A", "B", ... road network classification, each branch leads to a $G_{n}'$. Knowing the probability of reaching the target from there (for example from the (lat, lon) distance) helps to accelerate the search. If such information isn't available, for example in information networks, the better method is to partition by proximity, e.g. into clusters of $G/2$ nodes. The search must then treat each G' as a node that either has been visited or not.

Make examples with explanations in .ipynb

I'm adding this issue, as I've received a request to add an examples section for the github repo, so that the more sophisticated examples are explained using jupyter notebooks, with images and visualisations. The first idea is to link the table of functions from the readme.md directly to the examples:

(cut for brevity)

An example could be:

which explains how the method works.

Suggestions are welcome

The lru cache on is_connected causes invalid results if new edges are added

Adding new edges has to clear the cache, otherwise you get:

>>> from graph import Graph
>>> a = Graph()
>>> a.is_connected(1, 2)
False
>>> a.add_edge(1, 2)
>>> a.is_connected(1, 2)
False
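One way to avoid the stale result is to keep the cache per instance and clear it whenever the graph changes. The class below is a toy stand-in written for this illustration (`CachedGraph` is a made-up name, not the library's `Graph`); it only shows the `cache_clear()` pattern from `functools.lru_cache`.

```python
from functools import lru_cache


class CachedGraph:
    """Toy directed graph; not the graph-theory Graph class."""

    def __init__(self):
        self._edges = {}
        # One cache per instance, so it can be cleared when the graph changes.
        self._reachable = lru_cache(maxsize=None)(self._reachable_uncached)

    def add_edge(self, n1, n2):
        self._edges.setdefault(n1, set()).add(n2)
        self._reachable.cache_clear()    # cached answers may now be stale

    def _reachable_uncached(self, n1, n2):
        seen, stack = {n1}, [n1]
        while stack:
            node = stack.pop()
            if node == n2:
                return True
            for nxt in self._edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    def is_connected(self, n1, n2):
        return self._reachable(n1, n2)


g = CachedGraph()
before = g.is_connected(1, 2)
g.add_edge(1, 2)
after = g.is_connected(1, 2)
print(before, after)
```

Any other mutating method (deleting edges or nodes) would need the same `cache_clear()` call.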

Consider (optionally?) lazily delaying graphics imports

I know that pep8 says imports should always be at the top, but the plot_3d import here is quite slow and delays startup noticeably for short scripts.

graph-theory/graph/__init__.py

Line 7 in d64a40d

from graph.visuals import plot_3d

Maybe consider moving it down to the plot function where plot_3d is actually called?
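A deferred import could look roughly like the sketch below. The helper `load_optional` and this `plot_3d` body are hypothetical, written only to show the pattern; the real function in `graph.visuals` may look different.

```python
import importlib


def load_optional(name):
    """Import a module on first use instead of at package import time."""
    try:
        return importlib.import_module(name)
    except ImportError as err:
        raise ImportError(f"this feature requires the optional package {name!r}") from err


def plot_3d(points):
    # matplotlib is only imported when a plot is actually requested.
    plt = load_optional("matplotlib.pyplot")
    ax = plt.figure().add_subplot(projection="3d")
    ax.scatter(*zip(*points))            # points: iterable of (x, y, z) tuples
    return ax
```

Scripts that never call `plot_3d` then pay no import cost, and a missing plotting package only raises an error at the point where a plot is requested.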

Feature request: minimum cut function

Hello,
I know that NetworkX has a direct function, minimum_cut(), that implements the max-flow min-cut algorithm. It works well with small graphs but fails for large graphs.

I checked graph-theory and found that it has test_flow_problem.py but I couldn't find a direct minimum_cut() function like NetworkX.

This is a feature request to add this functionality in graph-theory.
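For illustration, a minimum cut can be derived from a maximum flow: once no augmenting path remains, the nodes still reachable from the source in the residual graph form one side of the cut. The sketch below uses Edmonds-Karp on a plain edge list; the function name and return shape are assumptions chosen for this example and do not describe an existing graph-theory API.

```python
from collections import defaultdict, deque


def minimum_cut(edges, source, sink):
    """Edmonds-Karp max flow, then a min cut read off the residual graph.

    edges: iterable of (n1, n2, capacity). Returns (cut_value, (S, T)).
    """
    residual = defaultdict(lambda: defaultdict(int))
    nodes = {source, sink}
    for n1, n2, capacity in edges:
        residual[n1][n2] += capacity
        residual[n2][n1] += 0            # make sure the reverse arc exists
        nodes.update((n1, n2))

    def bfs_parents():
        """Breadth-first search along arcs with remaining capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt, cap in residual[node].items():
                if cap > 0 and nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        return parent

    flow = 0
    while True:
        parent = bfs_parents()
        if sink not in parent:
            break                        # no augmenting path left
        bottleneck = float("inf")
        node = sink
        while parent[node] is not None:  # find the smallest residual capacity
            bottleneck = min(bottleneck, residual[parent[node]][node])
            node = parent[node]
        node = sink
        while parent[node] is not None:  # push the flow along the path
            residual[parent[node]][node] -= bottleneck
            residual[node][parent[node]] += bottleneck
            node = parent[node]
        flow += bottleneck

    reachable = set(parent)              # source side of the cut
    return flow, (reachable, nodes - reachable)


edges = [('s', 'a', 10), ('s', 'b', 10), ('a', 'b', 2), ('a', 't', 4), ('b', 't', 5)]
cut_value, (S, T) = minimum_cut(edges, 's', 't')
print(cut_value, sorted(S), sorted(T))
```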

graph.Graph.shortest_path gives incorrect output with weighted graphs when using memoize=True

Code:
import graph
g = graph.Graph()
g.add_edge(1, 2, 3, False)
g.add_edge(2, 3, 4, False)
g.add_edge(1, 3, 10, False)
print(g.shortest_path(1, 3, memoize=True))

Expected output:
(7, [1, 2, 3])

Actual output:
(10, [1, 3])
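The expected output can be confirmed independently with a plain Dijkstra search. This is a reference sketch on directed `(n1, n2, weight)` triples, not the library's `shortest_path` and without any memoization.

```python
import heapq


def shortest_path(edges, start, end):
    """Dijkstra on directed (n1, n2, weight) triples; returns (distance, path)."""
    adjacency = {}
    for n1, n2, weight in edges:
        adjacency.setdefault(n1, []).append((n2, weight))

    best = {start: 0}
    parent = {start: None}
    heap = [(0, start)]
    while heap:
        distance, node = heapq.heappop(heap)
        if distance > best[node]:
            continue                      # stale heap entry
        if node == end:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return distance, path[::-1]
        for neighbour, weight in adjacency.get(node, ()):
            candidate = distance + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                parent[neighbour] = node
                heapq.heappush(heap, (candidate, neighbour))
    return float("inf"), []


result = shortest_path([(1, 2, 3), (2, 3, 4), (1, 3, 10)], 1, 3)
print(result)
```

The direct edge 1 -> 3 costs 10, while the route via node 2 costs 3 + 4 = 7, so the sketch returns the two-hop path.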

`has_cycles` always returns false if the graph is disconnected

The function has_cycles always returns false if the graph is disconnected, including when a cycle exists.

If this is the intended behavior, the documentation should be modified to clearly convey that.

To reproduce

Python 3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)] on win32
>>> from graph import Graph
>>> g = Graph(from_list=[(1,2),(2,3),(3,1)])
>>> g.has_cycles()
True
>>> g.add_node(4)
>>> g.has_cycles()
False
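A cycle check that handles disconnected graphs has to restart its search in every component. The following is a minimal sketch using an iterative three-colour depth-first search on a directed graph; it is a standalone illustration, not the library's `has_cycles`.

```python
def has_cycles(nodes, edges):
    """Three-colour depth-first search over every node of a directed graph."""
    successors = {n: [] for n in nodes}
    for n1, n2 in edges:
        successors.setdefault(n1, []).append(n2)
        successors.setdefault(n2, [])

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the stack, finished
    colour = {n: WHITE for n in successors}
    for root in successors:               # restart in every unvisited component
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(successors[root]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if colour[child] == GREY:
                    return True           # back edge: a cycle exists
                if colour[child] == WHITE:
                    colour[child] = GREY
                    stack.append((child, iter(successors[child])))
                    break
            else:
                colour[node] = BLACK      # all children explored
                stack.pop()
    return False


with_isolated_node = has_cycles([1, 2, 3, 4], [(1, 2), (2, 3), (3, 1)])
print(with_isolated_node)
```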

Additional information

I'm using version 2023.7.4 of graph-theory with Python 3.10 on Windows.

A more sophisticated flow problem example?

Check if this could belong in examples. @04t02

from collections import defaultdict

from graph import Graph


def calculate_max_traffic(graph, unit_quantities):
    max_traffic = defaultdict(int)

    for in_node, out_nodes in unit_quantities.items():
        for out_node, quantity in out_nodes.items():
            _, path = graph.shortest_path(in_node, out_node)

            path_steps = list(zip(path[:-1], path[1:]))

            for (first_node, second_node) in path_steps:
                max_traffic[(first_node, second_node)] += quantity

    return max_traffic


# Example usage:
graph_edges = ["in_1", "out_1", "in_2", "out_2", "in_3", "out_3", "in_4", "out_4"]
graph_edges_pairs = list(zip(graph_edges[:-1], graph_edges[1:]))
graph_edges_pairs.append((graph_edges[-1], graph_edges[0]))

g = Graph(from_list=graph_edges_pairs)
unit_quantities = {"in_1": {"out_1": 10, "out_2": 5, "out_3": 5, "out_4": 5},  # 25
                   "in_2": {"out_2": 10, "out_3": 5, "out_4": 5, "out_1": 5},  # 25
                   "in_3": {"out_3": 5, "out_4": 5, "out_1": 5, "out_2": 10},  # 25
                   "in_4": {"out_4": 5, "out_1": 5, "out_2": 10, "out_3": 5}}  # 25

max_traffic = calculate_max_traffic(g, unit_quantities)

for (first_node, second_node), quantity in max_traffic.items():
    print(f"{first_node} -> {second_node}: {quantity}")

revision of the transshipment problem

https://en.wikipedia.org/wiki/Transshipment_problem
