Comments (12)

jaron-lee avatar jaron-lee commented on September 27, 2024 1

Apologies for taking a while to get back to this. This would be worth talking about actually, if you have some time tomorrow after the usual meeting. Or happy to discuss another time if that doesn't work.

Regarding the proposal of DUGraph + all variants versus DBCUGraph, I am actually not super sure myself. I do think not supporting the explicit graph types ADMG or CPDAG would be weird interface-wise, but perhaps that is the right way to think about this if we don't implement any algorithms inside the graphs themselves.

For 1., maybe we can lazily evaluate the validity of the graph (e.g. algorithms don't assume it's a valid instance of the class)? I think you're right that e.g. checking validity at each step is probably not going to be efficient or very tedious to

I think 2. makes the most sense design wise. Perhaps we can talk about potential API designs?

from pywhy-graphs.

adam2392 avatar adam2392 commented on September 27, 2024

In a more general case, we also have the case of an ADMG and a MAG. In the case of a MAG, the difference is that at most one edge is allowed between any pair of nodes, and that there are no primitive inducing paths between non-adjacent nodes.

jaron-lee avatar jaron-lee commented on September 27, 2024

So the way I would imagine it is we have a base directed-undirected edge graph class, and then constructors for each type of graph that enforce whatever requirements the specific graph requires. I'll take a stab at this some time this week.

adam2392 avatar adam2392 commented on September 27, 2024

So the way I would imagine it is we have a base directed-undirected edge graph class, and then constructors for each type of graph that enforce whatever requirements the specific graph requires. I'll take a stab at this some time this week.

Do we necessarily need a constructor for each type of graph? Perhaps we just assume the user knows what they are doing and then provide a function to check validity? Or is that too expensive...

Do you think we should also have a base directed-undirected-bidirected-circle graph class then as well? This could lead to a PAG, ADMG (causal DAG w/ bidirectional edges) and a MAG? We just need functional checks to make sure the corresponding PAG, ADMG and MAG are valid?

jaron-lee avatar jaron-lee commented on September 27, 2024

This is sort of how ananke is designed - we have a base graph class that has all the edge types and then every kind of graph we actually use is a subclass of that. If we had a directed-undirected-bidirected-circle graph class then everything could just be constructed from that class I suppose. I imagine that for e.g. ADMGs you would just set the undirected and circle parts to empty graphs by default?

jaron-lee avatar jaron-lee commented on September 27, 2024

@adam2392 I just reviewed the PAG code and there is a function is_valid_mec_graph that appears to enforce the validity check in the PAG class. Is that understanding correct?

I believe this only enforces the validity when adding edges. Probably poorly named. I think we can refactor these names once we decide on a proper class for:

* mixed-edge graphs that support Chain Graphs, CPDAG, PDAGs and their extensions

* mixed-edge graphs that support ADMG/DAGs and their extensions

* mixed-edge graphs that support PAGs and their extensions

What are some good generic names for the directed/undirected graph and the directed/bidirected graph and the directed/bidirected/circle/undirected graph?

Referencing discussion in #67.

As far as names we could always go DUGraph, DBGraph, DBCUGraph. I am still of the opinion that we should create an API so that users don't have to call these graph types directly, but that we have better organization and less duplication of code in the implementation. Best of both worlds, I suppose. Consequently the naming shouldn't be that important.

So we have 4 types of edges: directed, bidirected, circle, and undirected. We could in theory create all "4 choose k" subtypes for k > 0 (starting with the three that you mentioned above). But if we are creating e.g. a directed-bidirected graph, and then setting the bidirected argument to be empty, it seems somewhat arbitrary? Why would we not take the directed-bidirected-circle-undirected graph, and then just set all but directed to be empty graphs?
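To make the "4 choose k" count concrete, here is a small sketch that enumerates every non-empty combination of the four edge types. The `EDGE_TYPES` tuple and the `subtypes` name are only illustrative and not part of pywhy-graphs.

```python
from itertools import combinations

# Illustrative labels for the four edge types discussed above.
EDGE_TYPES = ("directed", "bidirected", "circle", "undirected")

# Every non-empty subset of edge types, i.e. "4 choose k" for k = 1..4.
subtypes = [
    combo
    for k in range(1, len(EDGE_TYPES) + 1)
    for combo in combinations(EDGE_TYPES, k)
]

print(len(subtypes))  # 4 + 6 + 4 + 1 = 15
```

That is 15 possible base classes, which is part of why a single class with all four edge types looks attractive.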

It seems like we could create the one DBCUGraph with all four edge types and then for each kind of graph we actually want to consider (PAG, ADMG, DAG, CPDAG etc.) we create a wrapper around DBCUGraph that zeroes out the appropriate edge graphs.

If we are seriously following the networkx style of detaching the graphical representation from the operations on the graph (including checks such as validity) then this implementation should suffice, seeing as there should not be much need for any inheritance behaviors.
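A rough sketch of what the "one class plus wrappers" idea could look like, using plain Python sets instead of the real graph backend. The `DBCUGraph` dataclass, `make_admg`, and `make_cpdag` here are hypothetical stand-ins, not the actual pywhy-graphs API, and no validity is checked at construction time.

```python
from dataclasses import dataclass, field


@dataclass
class DBCUGraph:
    """Hypothetical container: one edge set per edge type."""

    directed: set = field(default_factory=set)
    bidirected: set = field(default_factory=set)
    circle: set = field(default_factory=set)
    undirected: set = field(default_factory=set)

    def edge_types(self):
        """Names of the edge types that currently hold at least one edge."""
        return {name for name, edges in vars(self).items() if edges}


def make_admg(directed=(), bidirected=()):
    """ADMG wrapper: the circle and undirected edge sets stay empty."""
    return DBCUGraph(directed=set(directed), bidirected=set(bidirected))


def make_cpdag(directed=(), undirected=()):
    """CPDAG wrapper: the bidirected and circle edge sets stay empty."""
    return DBCUGraph(directed=set(directed), undirected=set(undirected))


admg = make_admg(directed=[("x", "y")], bidirected=[("y", "z")])
cpdag = make_cpdag(directed=[("a", "b")], undirected=[("b", "c")])
print(admg.edge_types(), cpdag.edge_types())
```

The wrappers only decide which edge sets are left empty; whether the result is actually a valid ADMG or CPDAG would be a separate function call.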

jaron-lee avatar jaron-lee commented on September 27, 2024

I propose to start with the following, and will open a PR/RFC to this effect:

  • A new DiUnGraph (will take suggestions on names) that contains directed and undirected edge types
  • Provide constructors for CPDAG/PDAG and ChainGraph (that construct a valid DiUnGraph of each type)
  • Implement functions that check validity of each of these graphs.

Future proposals:

  • Create DiBiGraph for ADMGs
  • DAGs should use DiGraph. There should be nothing in ADMG that we require for DAGs
  • DiBiCiUnGraph (will take suggestions on this name...) for PAGs and extensions.

adam2392 avatar adam2392 commented on September 27, 2024

As far as names we could always go DUGraph, DBGraph, DBCUGraph. I am still of the opinion that we should create an API so that users don't have to call these graph types directly, but that we have better organization and less duplication of code in the implementation. Best of both worlds, I suppose. Consequently the naming shouldn't be that important.

It seems like we could create the one DBCUGraph with all four edge types and then for each kind of graph we actually want to consider (PAG, ADMG, DAG, CPDAG etc.) we create a wrapper around DBCUGraph that zeroes out the appropriate edge graphs.

To summarize and clarify, is your proposal open to the idea of either:

  1. a series of base graph classes: DUGraph, DBGraph, DBCUGraph
  2. one base graph class DBCUGraph

? I don't have a great reason to support one over the other currently :p. But see below on some cons of supporting explicitly the class types ADMG, PAG, DAG, CPDAG, etc. The graph classes are not guaranteed without a check to be a "valid ADMG/PAG/CPDAG/etc.". This is an issue that was raised in #67.

If we are seriously following the networkx style of detaching the graphical representation from the operations on the graph (including checks such as validity) then this implementation should suffice, seeing as there should not be much need for any inheritance behaviors.

The reason the networkx design is "nice" imo (but not ideal) is that checking for validity of a certain "type of graph" is quite expensive. E.g. if we want to ensure that a constructed nx.DiGraph is a DAG, then running nx.is_directed_acyclic_graph(self) every single time an edge is added/removed is not good (note this is what they do in pgmpy's implementation, which is why I'm not a fan). The same logic extends to our cases.
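To illustrate the cost argument, here is a minimal stdlib-only sketch that compares re-checking acyclicity after every edge insertion with checking once at the end. `is_acyclic`, `build_eager`, and `build_lazy` are made-up names for this example; with networkx, `nx.is_directed_acyclic_graph` would play the role of `is_acyclic`.

```python
from collections import defaultdict


def is_acyclic(edges):
    """Kahn's algorithm: True if the directed edges contain no cycle."""
    succ = defaultdict(set)
    indeg = defaultdict(int)
    nodes = set()
    for u, v in edges:
        nodes.update((u, v))
        if v not in succ[u]:
            succ[u].add(v)
            indeg[v] += 1
    ready = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while ready:
        n = ready.pop()
        seen += 1
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                ready.append(m)
    # Nodes on a cycle never reach in-degree zero, so they are never counted.
    return seen == len(nodes)


def build_eager(edges):
    """Re-check acyclicity after every insertion; returns the number of checks."""
    added, checks = [], 0
    for e in edges:
        added.append(e)
        checks += 1
        if not is_acyclic(added):
            raise ValueError(f"edge {e} creates a cycle")
    return checks


def build_lazy(edges):
    """Insert everything, then check once; returns the number of checks."""
    added = list(edges)
    if not is_acyclic(added):
        raise ValueError("graph contains a cycle")
    return 1


chain = [(i, i + 1) for i in range(200)]
eager_checks = build_eager(chain)
lazy_checks = build_lazy(chain)
print(eager_checks, lazy_checks)
```

Each check walks the whole graph, so the eager variant does one full traversal per inserted edge while the lazy variant does a single traversal in total.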

adam2392 avatar adam2392 commented on September 27, 2024

As an aside, seeing the implementation of CG and CPDAG as a subclass of DUGraph, I'm pondering the tradeoffs here:

  1. Support explicit causal graph classes: Can we guarantee at all stages of construction and modification of the graph that the graph is "valid"? Can we do this efficiently?
  2. Don't support explicit causal graph classes: Can we make sure our documentation is clear and an API is very clear on how to construct various causal graphs and optionally check their validity?

If we can't arrive at a consensus here, we can discuss next week, or something?

adam2392 avatar adam2392 commented on September 27, 2024

cc: @robertness @bloebp @kunwuz

adam2392 avatar adam2392 commented on September 27, 2024

After discussion with @jaron-lee, we have decided on the following strategy:

  • Implement 4 private abstractions on top of MixedEdgeGraph that enable all causal-type graphs: DiBiGraph, DiUnGraph, DiBiUnGraph, DiBiUnCiGraph
  • Still support specific causal graphs and keep the API as is... for now, while noting in all the documentation that VALIDITY IS NOT GUARANTEED unless users call specific functions that check it, e.g. is_valid_cpdag(...) and is_valid_pag(...), which are still to be implemented

In the longer-term, we can consider removing all the specific causal graph class implementations, so that way it's more like networkx, but in the short-term, we can just rely on this approach to be internally lean, and rely on user-feedback to guide this decision.
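As a sketch of what an opt-in validity function could look like, here is a hypothetical `is_valid_admg` that works on plain edge lists. The name and signature are assumptions made for illustration (the thread only names is_valid_cpdag and is_valid_pag, which are not implemented yet). It checks only that the directed part is acyclic and that no bidirected edge is a self-loop.

```python
from graphlib import CycleError, TopologicalSorter


def is_valid_admg(directed, bidirected):
    """Hypothetical check: directed edges are acyclic, no bidirected self-loops."""
    if any(u == v for u, v in bidirected):
        return False
    # graphlib expects a mapping from each node to its predecessors.
    preds = {}
    for u, v in directed:
        preds.setdefault(v, set()).add(u)
        preds.setdefault(u, set())
    try:
        TopologicalSorter(preds).prepare()
    except CycleError:
        return False
    return True


ok = is_valid_admg(directed=[("x", "y"), ("y", "z")], bidirected=[("x", "z")])
bad = is_valid_admg(directed=[("x", "y"), ("y", "x")], bidirected=[])
print(ok, bad)
```

The graph object itself never calls this; users would call it only when they need the guarantee.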

Feel free to add anything I missed.

jaron-lee avatar jaron-lee commented on September 27, 2024

Check bnlearn, and Meek for the CPDAG definition.
