
graph-theory's People

Contributors

04t02, fiendish, qasim-at-tci, root-11

graph-theory's Issues

Missing path in all_paths

Given the following graph:

g = graph.Graph()
g.add_node('a')
g.add_node('b')
g.add_node('c')
g.add_node('d')
g.add_node('e')

g.add_edge('a', 'b', bidirectional=True)
g.add_edge('b', 'c', bidirectional=True)
g.add_edge('b', 'd', bidirectional=True)
g.add_edge('c', 'd', bidirectional=True)
g.add_edge('c', 'e', bidirectional=True)
g.add_edge('d', 'e', bidirectional=True)

A call to graph.all_paths(g, 'e', 'a') returns:
[['e', 'c', 'b', 'a'], ['e', 'd', 'b', 'a'], ['e', 'c', 'd', 'b', 'a']]

This is missing ['e', 'd', 'c', 'b', 'a']
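For reference, a minimal depth-first enumeration of simple paths on the same graph is sketched below. It uses a plain adjacency dict rather than graph.Graph, and the helper name all_simple_paths is made up for this illustration; it is not the library's all_paths implementation.

```python
def all_simple_paths(adjacency, start, end):
    """Enumerate every path from start to end that visits no node twice."""
    paths = []

    def extend(path, seen):
        node = path[-1]
        if node == end:
            paths.append(list(path))
            return
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                path.append(neighbour)
                extend(path, seen)
                # Backtrack so the neighbour can appear in other paths.
                path.pop()
                seen.remove(neighbour)

    extend([start], {start})
    return paths


# The bidirectional edges from the report, written out as an adjacency mapping.
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "d"), ("c", "e"), ("d", "e")]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, []).append(v)
    adjacency.setdefault(v, []).append(u)

expected_paths = all_simple_paths(adjacency, "e", "a")
for p in expected_paths:
    print(p)
```

This enumeration finds four paths, including ['e', 'd', 'c', 'b', 'a'].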

More examples of assignment problems

Here's a list:

    - Assignment problems (AP)
      - Explain that AP is a special case of GAP in which every agent's capacity and every task's size equal 1.
      - Explain why the Hungarian algorithm underperforms relative to the alternating iterative auction.

    - The Knapsack problem (code done, tutorial missing)
        - Cutting stock problem (https://en.wikipedia.org/wiki/Cutting_stock_problem)
        - 3D bin packing problem

    - Maximum flow (code done, tutorial missing)
    - Minimum costs (code done?)
    - Assignment problem with allowed groups

    - Quadratic assignment problem (https://en.wikipedia.org/wiki/Quadratic_assignment_problem)
      and Facility location problem (https://en.wikipedia.org/wiki/Facility_location_problem)
        - Explain that the problem resembles that of the assignment problem, except that the
          cost function is expressed in terms of quadratic inequalities, hence the name.
        - Example: The problem is to assign all facilities to different locations with the
          goal of minimizing the sum of the distances multiplied by the corresponding flows.

          Hint: Use XY-graph for solving the problem.
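As a starting point for the quadratic assignment tutorial, the objective described above (sum of flow times distance over all facility pairs) could be sketched with a brute-force search. This uses plain lists rather than the XY-graph from the hint, the function name solve_qap_brute_force is hypothetical, and the flow and distance numbers are made up for illustration. It is only practical for a handful of facilities.

```python
from itertools import permutations


def solve_qap_brute_force(flow, distance):
    """Try every assignment of facilities to locations.

    assignment[i] is the location given to facility i. The cost is the sum
    over all facility pairs of flow[i][j] * distance between their locations.
    """
    n = len(flow)
    best_cost, best_assignment = None, None
    for assignment in permutations(range(n)):
        cost = sum(
            flow[i][j] * distance[assignment[i]][assignment[j]]
            for i in range(n)
            for j in range(n)
        )
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment


# Made-up symmetric data: 3 facilities, 3 locations.
flow = [[0, 5, 1],
        [5, 0, 2],
        [1, 2, 0]]
distance = [[0, 1, 4],
            [1, 0, 2],
            [4, 2, 0]]

best_cost, best_assignment = solve_qap_brute_force(flow, distance)
print(best_cost, best_assignment)
```

In this data the heaviest flow (facilities 0 and 1) ends up on the closest pair of locations, which is the intuition the tutorial would need to convey.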

Deleting a node doesn't fully eliminate it from the graph

It appears that when you delete a node, certain traces of it are left behind in the graph's internal bookkeeping structures.

Consider this simple graph:

g = graph.Graph()
g.add_edge(1, 3)
g.add_edge(2, 3)

If we now try to do a topological sort on it by iteratively finding the nodes with in_degree=0:

the_sort = []

zero_in = g.nodes(in_degree=0)
while zero_in:
    for n in zero_in:
        the_sort.append(n)
        g.del_node(n)
    zero_in = g.nodes(in_degree=0)
    print(f"remaining nodes: {len(g.nodes())},"
          f" in_degree=0 nodes: {len(g.nodes(in_degree=0))}")

...this loop never terminates. len(g.nodes()) reports 0 remaining nodes, yet len(g.nodes(in_degree=0)) still
reports nodes with in_degree=0. Looking at the code, the problem appears to be the accounting in del_edge():
self._in_degree[node2] is decremented, but the dict entry should probably be deleted once both it and
self._out_degree[node2] have reached zero. However, since del_edge() is a public method, a floating node with no
inbound or outbound edges may be perfectly acceptable, in which case something else is needed to decide when to
remove a deleted node's entry from these two dicts.
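To illustrate the bookkeeping I would expect, here is a toy graph class where del_node() removes the node's own degree entry, while del-style updates on its neighbours only decrement counters. TinyGraph and its attributes are invented for this sketch and do not mirror graph-theory's internals.

```python
class TinyGraph:
    """Toy directed graph; NOT the graph-theory implementation."""

    def __init__(self):
        self._edges = {}      # node -> set of successor nodes
        self._in_degree = {}  # node -> number of inbound edges

    def add_edge(self, a, b):
        for n in (a, b):
            self._edges.setdefault(n, set())
            self._in_degree.setdefault(n, 0)
        if b not in self._edges[a]:
            self._edges[a].add(b)
            self._in_degree[b] += 1

    def del_node(self, n):
        # Outbound edges of n disappear: successors lose one inbound edge each.
        for successor in self._edges.pop(n, set()):
            self._in_degree[successor] -= 1
        # Inbound edges of n disappear from the other nodes' successor sets.
        for successors in self._edges.values():
            successors.discard(n)
        # The node itself leaves the degree bookkeeping entirely.
        self._in_degree.pop(n, None)

    def nodes(self, in_degree=None):
        if in_degree is None:
            return list(self._edges)
        return [n for n, d in self._in_degree.items() if d == in_degree]


g = TinyGraph()
g.add_edge(1, 3)
g.add_edge(2, 3)

the_sort = []
zero_in = g.nodes(in_degree=0)
while zero_in:
    for n in zero_in:
        the_sort.append(n)
        g.del_node(n)
    zero_in = g.nodes(in_degree=0)

print(the_sort)
```

With the entry removed in del_node(), the loop from the report terminates, and a floating node left behind by a plain edge deletion would still be allowed.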

`__eq__` (equality) can return false for two identical graphs if an edge was added and removed in one of them

If one graph had an edge added and then removed again, comparing it to an identical graph can return false. This occurs if the added and removed edge is outgoing from a node that had no outgoing edges before.

To reproduce

Python 3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)] on win32
>>> from graph import Graph
>>> g1 = Graph(from_list=[(1,2),(2,3)])
>>> g2 = g1.copy()
>>> g1 == g2
True
>>> g1.add_edge(3,1)
>>> g1.del_edge(3,1)
>>> g1 == g2
False
>>> g1.edges() == g2.edges()
True
>>> g1.nodes() == g2.nodes()
True

Cause

As far as I can tell, this bug occurs because __eq__ compares the two graphs' private edge variables _edges instead of calling the public edge getter edges().
When an edge from a node that did not have any outgoing edges before is added and removed, that node remains as a key in the graph's internal edge variable _edges. Two dictionaries are unequal if one has a key the other has not, even if all shared values are equal, thus the two graphs are unequal.
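The dictionary behavior can be shown without the library. The sketch below mimics _edges with a plain defaultdict(dict) and compares it raw and after dropping empty entries. The helper normalized is hypothetical and only one possible way to make the comparison robust; isolated nodes would still have to be compared separately, e.g. via nodes().

```python
from collections import defaultdict


def normalized(edges):
    """Drop source nodes whose mapping of outgoing edges is empty."""
    return {src: dict(targets) for src, targets in edges.items() if targets}


# Stand-ins for the _edges of two graphs built from [(1, 2), (2, 3)].
edges_1 = defaultdict(dict, {1: {2: 1}, 2: {3: 1}})
edges_2 = defaultdict(dict, {1: {2: 1}, 2: {3: 1}})

edges_1[3][1] = 1  # add edge 3 -> 1
del edges_1[3][1]  # remove it again; key 3 stays behind with an empty dict

raw_equal = edges_1 == edges_2
normalized_equal = normalized(edges_1) == normalized(edges_2)
print(raw_equal, normalized_equal)
```

The raw comparison fails because of the leftover key, while the normalized one treats both edge sets as equal.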

Other examples

Because copy() uses the public edge getter edges(), this bug can lead to some very confusing behavior:

Python 3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)] on win32
>>> from graph import Graph
>>> g1 = Graph(from_list=[(1,2),(2,3)])
>>> g1.add_edge(3,1)
>>> g1.del_edge(3,1)
>>> g2 = g1.copy()
>>> g1 == g2
False

Additional information

I'm using version 2023.7.5 of graph-theory with Python 3.10 on Windows.

`__eq__` (equality) can in some cases return false for two identical graphs

In some cases the private _edges of a graph can gain an empty dict as an entry. This leads to incorrect behavior when comparing graphs.
This bug is very similar to #40.

To reproduce

Python 3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)] on win32
>>> from graph import Graph
>>> g1 = Graph(from_list=[(1,2),(2,3)])
>>> g2 = Graph(from_list=[(1,2),(2,3)])
>>> g1.edge(3,1)
>>> g1 == g2
False
>>> g1._edges
defaultdict(<class 'dict'>, {1: {2: 1}, 2: {3: 1}, 3: {}})
>>> g2._edges
defaultdict(<class 'dict'>, {1: {2: 1}, 2: {3: 1}})

Cause

The equality function of graphs compares the private variable _edges of both graphs and returns false if they are not equal.
The variable _edges is a defaultdict(dict), so accessing it with a key that doesn't exist creates an empty dict for that key. If that happens in only one of two identical graphs, they are no longer equal.
Such an access, and thus the creation of a new entry, can occur when edge(node1, node2) is called with a node1 that has no outgoing edges, or sometimes, depending on the graph, when is_connected(n1, n2) is called. Those are the two examples I found; there could be more.
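The side effect is standard defaultdict behavior and can be reproduced without the library. The sketch below contrasts an indexing read with a read through .get(), which does not trigger the default factory. The variable names are made up, and whether the library's lookups should be changed this way is for the maintainers to decide.

```python
from collections import defaultdict

# Stand-in for _edges of a graph built from [(1, 2), (2, 3)].
edges_indexed = defaultdict(dict, {1: {2: 1}, 2: {3: 1}})
value_indexed = edges_indexed[3].get(1)  # indexing creates edges_indexed[3] = {}
keys_after_index = sorted(edges_indexed)

edges_get = defaultdict(dict, {1: {2: 1}, 2: {3: 1}})
value_get = edges_get.get(3, {}).get(1)  # .get() leaves the mapping untouched
keys_after_get = sorted(edges_get)

print(keys_after_index, keys_after_get)
```

Both reads return None for the missing edge, but only the indexing read leaves a new key behind.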

Other problems

Just like with #40, this also causes strange behavior when copying graphs, because copy() uses the public edge getter edges() instead of the private variable _edges.

Python 3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)] on win32
>>> from graph import Graph
>>> g1 = Graph(from_list=[(1,2),(2,3)])
>>> g1.edge(3,1)
>>> g2 = g1.copy()
>>> g1 == g2
False
>>> g1._edges
defaultdict(<class 'dict'>, {1: {2: 1}, 2: {3: 1}, 3: {}})
>>> g2._edges  
defaultdict(<class 'dict'>, {1: {2: 1}, 2: {3: 1}})

Additional information

I'm using version 2023.7.6 of graph-theory with Python 3.10 on Windows.

is_connected can probably be merged with breadth_first_search

In the performance-agnostic case, is_connected(graph, start, end) is just bool(breadth_first_search(graph, start, end)), though BFS could optimize a few things further by e.g. taking an argument to control whether the found path gets reconstructed or not.
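A rough sketch of what such a merged function might look like, written against a plain adjacency dict. The names bfs_sketch, is_connected_sketch and the reconstruct parameter are assumptions for illustration, not the library's signatures.

```python
from collections import deque


def bfs_sketch(adjacency, start, end, reconstruct=True):
    """Breadth-first search from start to end.

    reconstruct=True: return a fewest-edges path as a list, or [] if none.
    reconstruct=False: return True/False and skip the parent bookkeeping.
    """
    if start == end:
        return [start] if reconstruct else True
    parents = {start: None} if reconstruct else None
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour in seen:
                continue
            seen.add(neighbour)
            if reconstruct:
                parents[neighbour] = node
            if neighbour == end:
                if not reconstruct:
                    return True
                # Walk the parent links back to the start.
                path = [end]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return [] if reconstruct else False


def is_connected_sketch(adjacency, start, end):
    return bfs_sketch(adjacency, start, end, reconstruct=False)


adjacency = {1: [2], 2: [3], 3: [], 4: [1]}
print(bfs_sketch(adjacency, 1, 3), is_connected_sketch(adjacency, 3, 1))
```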

branch and bound algorithm for TSP isn't solid.

To recreate:

import time

# random_xy_graph is assumed to be imported from the graph-theory package.


def test_random_graph_3_bnb():
    for i in range(8,15):
        d = None
        for j in range(3):
            g = random_xy_graph(i, x_max=800, y_max=400)  # a fully connected graph.
            start = time.process_time()
            d1, t1 = g.solve_tsp('bnb')  # tsp_branch_and_bound(g)
            d2, t2 = g.solve_tsp('greedy')  # tsp_greedy(g)
            assert d1 <= d2, (d1, d2, g.edges())
            if d is None:
                d = d1
            else:
                assert d == d1, (d, d1)

            end = time.process_time()
            print(i, j, end-start)

Traceback (most recent call last):
  File "C:/Users/madsenbj/AppData/Roaming/JetBrains/PyCharm2020.2/scratches/scratch_7.py", line 54, in <module>
    test_random_graph_3_bnb()
  File "C:/Users/madsenbj/AppData/Roaming/JetBrains/PyCharm2020.2/scratches/scratch_7.py", line 42, in test_random_graph_3_bnb
    assert d1 <= d2, (d1, d2, g.edges())
AssertionError: (2293.897719652855, 2004.6354644817718, [((655, 58), (559, 45), 96.87620966986684), ((655, 58), (229, 72), 426.2299848673249), ((655, 58), (26, 380), 706.6293229126569), ((655, 58), (693, 380), 324.2344830520036), ((655, 58), (605, 217), 166.67633305301626), ((655, 58), (755, 53), 100.12492197250393), ((655, 58), (282, 126), 379.14772846477666), ((559, 45), (26, 380), 629.5347488423495), ((559, 45), (605, 217), 178.04493814764857), ((559, 45), (693, 380), 360.8060420780118), ((559, 45), (655, 58), 96.87620966986684), ((559, 45), (229, 72), 331.1027030998086), ((559, 45), (282, 126), 288.60006930006097), ((559, 45), (755, 53), 196.16319736382766), ((26, 380), (693, 380), 667.0), ((26, 380), (559, 45), 629.5347488423495), ((26, 380), (229, 72), 368.8807395351511), ((26, 380), (655, 58), 706.6293229126569), ((26, 380), (755, 53), 798.9806005154318), ((26, 380), (282, 126), 360.6272313622475), ((26, 380), (605, 217), 601.5064421932653), ((693, 380), (755, 53), 332.8257802514703), ((693, 380), (605, 217), 185.23768515072737), ((693, 380), (229, 72), 556.9201019895044), ((693, 380), (559, 45), 360.8060420780118), ((693, 380), (282, 126), 483.15318481823135), ((693, 380), (655, 58), 324.2344830520036), ((693, 380), (26, 380), 667.0), ((229, 72), (605, 217), 402.9900743194552), ((229, 72), (755, 53), 526.3430440311718), ((229, 72), (26, 380), 368.8807395351511), ((229, 72), (693, 380), 556.9201019895044), ((229, 72), (655, 58), 426.2299848673249), ((229, 72), (559, 45), 331.1027030998086), ((229, 72), (282, 126), 75.66372975210778), ((605, 217), (655, 58), 166.67633305301626), ((605, 217), (229, 72), 402.9900743194552), ((605, 217), (26, 380), 601.5064421932653), ((605, 217), (755, 53), 222.25210910135362), ((605, 217), (559, 45), 178.04493814764857), ((605, 217), (282, 126), 335.574134879314), ((605, 217), (693, 380), 185.23768515072737), ((755, 53), (229, 72), 526.3430440311718), ((755, 53), (693, 380), 332.8257802514703), ((755, 53), (559, 45), 
196.16319736382766), ((755, 53), (605, 217), 222.25210910135362), ((755, 53), (655, 58), 100.12492197250393), ((755, 53), (282, 126), 478.60004178854814), ((755, 53), (26, 380), 798.9806005154318), ((282, 126), (26, 380), 360.6272313622475), ((282, 126), (229, 72), 75.66372975210778), ((282, 126), (755, 53), 478.60004178854814), ((282, 126), (693, 380), 483.15318481823135), ((282, 126), (655, 58), 379.14772846477666), ((282, 126), (559, 45), 288.60006930006097), ((282, 126), (605, 217), 335.574134879314)])
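For graphs this small, an exhaustive search gives an independent reference value that an exact solver should match and that no heuristic can beat. The sketch below assumes a closed tour over Euclidean points; tsp_brute_force is a hypothetical helper, not part of graph-theory.

```python
from itertools import permutations
from math import dist


def tsp_brute_force(points):
    """Exact shortest closed tour by trying every ordering (small inputs only)."""
    first, rest = points[0], points[1:]
    best_length, best_tour = float("inf"), None
    # Fixing the first point removes rotations of the same tour.
    for order in permutations(rest):
        tour = (first, *order)
        length = sum(
            dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour))
        )
        if length < best_length:
            best_length, best_tour = length, tour
    return best_length, list(best_tour)


# A 3 x 4 rectangle: the shortest closed tour is its perimeter.
points = [(0, 0), (3, 0), (3, 4), (0, 4)]
best_length, best_tour = tsp_brute_force(points)
print(best_length, best_tour)
```

The same function could be fed the node coordinates from the failing graph above to see which of the two reported distances is closer to the optimum.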



Feature: Option: Add regions to graphs

Minimal illustration of the problem

image
(classic graph vs multi-graph)

Assume G is a binary tree with a root and 2 levels of bifurcation resulting in $2^{2}$ leaves with randomized weights on the edges.

Assume that all search starts at the root and ends by identifying the route to a leaf using BFS to determine the shortest path.

Problem: Due to the symmetric nature of the graph, shortest path BFS will practically visit every node every time a search is performed.

Proposition 1: If (!) G is redesigned such that the graph holds information about what can be found below each bifurcation point, only 10 nodes need to be visited. This is ideal from a search perspective, but the memory overhead is problematic as it requires the graph to store all leaves at all bifurcation levels: ~10x more memory. A second problem with this approach is that it only works for DAGs.

Proposition 2: If a partition of G can be declared as another graph G', and BFS and shortest-path search can query G' as to whether it contains or has a route to the target node, then the search can be accelerated:

  1. If the target node is in G' and BFS sees G' as a single node in G, then the destination node has been found.
  2. If the target node is NOT in G', BFS can eliminate the search through G' altogether.

For the binary tree this means that, with G defined as $G_{1}' + G_{2}' = G_{1.1}' + G_{1.2}' + G_{2.1}' + G_{2.2}' + \dots$, a BFS or shortest-path search will require only $2*10$ recursive queries akin to "is target in G'".

The reason for 2*10 is because at each recursive step the binary partition will have at least one failure.
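Proposition 2 could be prototyped roughly as below: a BFS that asks each region "is target in G'?" once and refuses to enter regions that answer no. The function bfs_with_regions and the representation of regions as plain sets are assumptions for this sketch, and the pruning is only safe when no useful route passes through a skipped region, as with subtrees of a tree.

```python
from collections import deque


def bfs_with_regions(adjacency, start, target, regions):
    """BFS that skips every region not containing the target.

    regions: iterable of sets of nodes (each one a G').
    Returns (found, number_of_nodes_visited).
    """
    # One "is target in G'?" query per region.
    closed_regions = [region for region in regions if target not in region]

    seen = {start}
    queue = deque([start])
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        if node == target:
            return True, visited
        for neighbour in adjacency.get(node, ()):
            if neighbour in seen:
                continue
            if any(neighbour in region for region in closed_regions):
                continue  # the whole region is eliminated from the search
            seen.add(neighbour)
            queue.append(neighbour)
    return False, visited


# Binary tree with 2 levels of bifurcation and 4 leaves.
tree = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
regions = [{"a", "a1", "a2"}, {"b", "b1", "b2"}]

with_regions = bfs_with_regions(tree, "r", "b2", regions)
without_regions = bfs_with_regions(tree, "r", "b2", [])
print(with_regions, without_regions)
```

On this tiny tree the pruned search visits 4 nodes instead of 7; nesting regions recursively would be needed to get the per-level saving described above.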

Edge cases:

For non-trees, such as road networks, which may be partitioned using the "AA", "A", "B", ... road network classification, each branch leads to a $G_{n}'$. Knowing the probability of reaching the target (for example from (lat, lon) distance) helps to accelerate the search. If such information isn't available, for example in information networks, the better method is to partition by proximity, e.g. in clusters of $G/2$ nodes. The search must then treat each G' as a node that either has or has not been visited.

Make examples with explanations in .ipynb

I'm adding this issue because I've received a request to add an examples section to the GitHub repo, so that the more sophisticated examples are explained using Jupyter notebooks, with images and visualisations. The first idea is to link the table of functions in the readme.md directly to the examples:

image

(cut for brevity)
image

An example could be:
image
which explains how the method works.

Suggestions are welcome

Feature request: minimum cut function

Hello,
I know that NetworkX has a direct function, minimum_cut(), that implements the max-flow min-cut algorithm. It works well with small graphs but fails for large graphs.

I checked graph-theory and found that it has test_flow_problem.py but I couldn't find a direct minimum_cut() function like NetworkX.

This is a feature request to add this functionality in graph-theory.
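To make the request concrete, here is a rough sketch of the behavior I have in mind, using the Edmonds-Karp method on a dict of dicts and reading the cut off the final residual graph. The name minimum_cut_sketch and the input format are assumptions for illustration; they are neither graph-theory's nor NetworkX's API.

```python
from collections import deque


def minimum_cut_sketch(capacity, source, sink):
    """Return (cut_value, (source_side, sink_side)) for a directed graph.

    capacity: dict of dicts, capacity[u][v] = capacity of edge u -> v.
    """
    if source == sink:
        raise ValueError("source and sink must differ")

    # Residual capacities, including zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, targets in capacity.items():
        for v, c in targets.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)

    def find_parents():
        """BFS over edges with spare capacity; returns the parent map."""
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        return parents

    cut_value = 0
    while True:
        parents = find_parents()
        if sink not in parents:
            break  # no augmenting path left: the flow is maximal
        # Bottleneck along the path found, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parents[v] is not None:
            bottleneck = min(bottleneck, residual[parents[v]][v])
            v = parents[v]
        v = sink
        while parents[v] is not None:
            u = parents[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        cut_value += bottleneck

    # Nodes still reachable from the source form the source side of the cut.
    source_side = set(parents)
    return cut_value, (source_side, set(residual) - source_side)


capacity = {"s": {"a": 10, "b": 10}, "a": {"t": 4}, "b": {"t": 3}}
cut_value, (source_side, sink_side) = minimum_cut_sketch(capacity, "s", "t")
print(cut_value, source_side, sink_side)
```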

`has_cycles` always returns false if the graph is disconnected

The function has_cycles always returns false if the graph is disconnected, including when a cycle exists.

If this is the intended behavior, the documentation should be modified to clearly convey that.

To reproduce

Python 3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)] on win32
>>> from graph import Graph
>>> g = Graph(from_list=[(1,2),(2,3),(3,1)])
>>> g.has_cycles()
True
>>> g.add_node(4)
>>> g.has_cycles()
False

Additional information

I'm using version 2023.7.4 of graph-theory with Python 3.10 on Windows.
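The behavior I expected can be sketched with a depth-first search that is restarted from every unvisited node, so that disconnected parts are covered as well. This works on a plain adjacency dict; has_cycle_sketch is a made-up name and not the library's has_cycles.

```python
def has_cycle_sketch(adjacency):
    """True if the directed graph contains a cycle, connected or not."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    nodes = set(adjacency)
    for targets in adjacency.values():
        nodes.update(targets)
    colour = {n: WHITE for n in nodes}

    for root in nodes:  # restart in every component
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(adjacency.get(root, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if colour[nxt] == GREY:
                    return True  # edge back into the current path
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(adjacency.get(nxt, ()))))
                    break
            else:
                # All successors handled: the node is finished.
                colour[node] = BLACK
                stack.pop()
    return False


cyclic_with_isolated_node = {1: [2], 2: [3], 3: [1], 4: []}
acyclic_with_isolated_node = {1: [2], 2: [3], 4: []}
print(has_cycle_sketch(cyclic_with_isolated_node),
      has_cycle_sketch(acyclic_with_isolated_node))
```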

A more sophisticated flow problem example?

Check if this could belong in examples. @04t02

from collections import defaultdict

from graph import Graph


def calculate_max_traffic(graph, unit_quantities):
    max_traffic = defaultdict(int)

    for in_node, out_nodes in unit_quantities.items():
        for out_node, quantity in out_nodes.items():
            _, path = graph.shortest_path(in_node, out_node)

            path_steps = list(zip(path[:-1], path[1:]))

            for (first_node, second_node) in path_steps:
                max_traffic[(first_node, second_node)] += quantity

    return max_traffic


# Example usage:
graph_edges = ["in_1", "out_1", "in_2", "out_2", "in_3", "out_3", "in_4", "out_4"]
graph_edges_pairs = list(zip(graph_edges[:-1], graph_edges[1:]))
graph_edges_pairs.append((graph_edges[-1], graph_edges[0]))

g = Graph(from_list=graph_edges_pairs)
unit_quantities = {"in_1": {"out_1": 10, "out_2": 5, "out_3": 5, "out_4": 5},  # 25
                   "in_2": {"out_2": 10, "out_3": 5, "out_4": 5, "out_1": 5},  # 25
                   "in_3": {"out_3": 5, "out_4": 5, "out_1": 5, "out_2": 10},  # 25
                   "in_4": {"out_4": 5, "out_1": 5, "out_2": 10, "out_3": 5}}  # 25

max_traffic = calculate_max_traffic(g, unit_quantities)

for (first_node, second_node), quantity in max_traffic.items():
    print(f"{first_node} -> {second_node}: {quantity}")
