Comments (12)
Apologies for taking a while to get back to this. This would be worth talking about actually, if you have some time tomorrow after the usual meeting. Or happy to discuss another time if that doesn't work.
Regarding the proposal of DUGraph + all variants versus DBCUGraph, I am actually not super sure myself. I do think not supporting explicit graph types such as ADMG or CPDAG would be weird interface-wise, but perhaps that is the right way to think about this if we don't implement any algorithms inside the graphs themselves.
For 1., maybe we can lazily evaluate the validity of the graph (e.g. algorithms don't assume it's a valid instance of the class)? I think you're right that e.g. checking validity at each step is probably not going to be efficient or very tedious to
I think 2. makes the most sense design wise. Perhaps we can talk about potential API designs?
from pywhy-graphs.
More generally, we also have the case of an ADMG versus a MAG. In a MAG, the difference is that at most one edge is allowed between any pair of nodes, and there are no primitive inducing paths between non-adjacent nodes.
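To make the "one edge per pair" condition concrete, here is a minimal sketch in plain Python. The function name and the dict-of-edge-lists representation are assumptions made for illustration and are not part of pywhy-graphs. It covers only the edge-count condition and does not check for primitive inducing paths.

```python
def pairs_with_multiple_edges(edges_by_type):
    """Return node pairs that are joined by more than one edge.

    `edges_by_type` is a hypothetical representation: it maps an edge-type
    name (e.g. "directed", "bidirected") to an iterable of (u, v) tuples.
    """
    seen = {}
    for edge_type, edges in edges_by_type.items():
        for u, v in edges:
            # Ignore orientation: we only care which two nodes are joined.
            seen.setdefault(frozenset((u, v)), []).append(edge_type)
    return {pair: types for pair, types in seen.items() if len(types) > 1}


# x and y are joined by both a directed and a bidirected edge.
admg_like = {"directed": [("x", "y")], "bidirected": [("x", "y"), ("y", "z")]}
violations = pairs_with_multiple_edges(admg_like)

# Every pair of nodes is joined by at most one edge.
mag_like = {"directed": [("x", "y")], "bidirected": [("y", "z")]}
no_violations = pairs_with_multiple_edges(mag_like)
```

The first graph is fine as an ADMG but would fail this MAG condition; the second passes it.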
So the way I would imagine it is we have a base directed-undirected edge graph class, and then constructors for each type of graph that enforce whatever requirements the specific graph requires. I'll take a stab at this some time this week.
> So the way I would imagine it is we have a base directed-undirected edge graph class, and then constructors for each type of graph that enforce whatever requirements the specific graph requires. I'll take a stab at this some time this week.
Do we need a constructor for each type of graph necessarily? Perhaps just assume user knows what they are doing and then provide a function to check validity? Or is that too expensive...
Do you think we should also have a base directed-undirected-bidirected-circle graph class? This could lead to a PAG, an ADMG (causal DAG with bidirected edges) and a MAG. We would just need functional checks to make sure the corresponding PAG, ADMG and MAG are valid?
This is sort of how ananke is designed - we have a base graph class that has all the edge types and then every kind of graph we actually use is a subclass of that. If we had a directed-undirected-bidirected-circle graph class then everything could just be constructed from that class I suppose. I imagine that for e.g. ADMGs you would just set the undirected and circle parts to empty graphs by default?
@adam2392 I just reviewed the PAG code and there is a function `is_valid_mec_graph` that appears to enforce the validity check in the `PAG` class. Is that understanding correct?
I believe this only enforces the validity when adding edges. Probably poorly named. I think we can refactor these names once we decide on a proper class for:
- mixed-edge graphs that support Chain Graphs, CPDAG, PDAGs and their extensions
- mixed-edge graphs that support ADMG/DAGs and their extensions
- mixed-edge graphs that support PAGs and their extensions
What are some good generic names for the directed/undirected graph and the directed/bidirected graph and the directed/bidirected/circle/undirected graph?
Referencing discussion in #67.
As far as names we could always go DUGraph, DBGraph, DBCUGraph. I am still of the opinion that we should create an API so that users don't have to call these graph types directly, but that we have better organization and less duplication of code in the implementation. Best of both worlds, I suppose. Consequently the naming shouldn't be that important.
So we have 4 types of edges: directed, bidirected, circle, and undirected. We could in theory create all "4 choose k" subtypes for k > 0 (starting with the three that you mentioned above). But if we are creating e.g. a directed-bidirected graph and then setting the bidirected argument to be empty, it seems somewhat arbitrary? Why would we not take the directed-bidirected-circle-undirected graph and then just set all but directed to be empty graphs?
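For a sense of scale, the "4 choose k" subtypes can be enumerated with the standard library. This is only an illustration of the counting argument; the edge-type labels are plain strings chosen for this example.

```python
from itertools import combinations

EDGE_TYPES = ("directed", "bidirected", "circle", "undirected")

# All non-empty subsets of the four edge types: C(4,1) + C(4,2) + C(4,3) + C(4,4).
subtypes = [
    combo
    for k in range(1, len(EDGE_TYPES) + 1)
    for combo in combinations(EDGE_TYPES, k)
]

print(len(subtypes))  # 15
```

Fifteen candidate base classes is a lot to maintain when only a handful correspond to graph types that are actually used.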
It seems like we could create the one DBCUGraph with all four edge types and then for each kind of graph we actually want to consider (PAG, ADMG, DAG, CPDAG etc.) we create a wrapper around DBCUGraph that zeroes out the appropriate edge graphs.
If we are seriously following the networkx style of detaching the graphical representation from the operations on the graph (including checks such as validity) then this implementation should suffice, seeing as there should not be much need for any inheritance behaviors.
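A rough sketch of what that could look like, in plain Python. The class body, the `make_admg` and `make_cpdag` helpers, and the set-of-tuples storage are all hypothetical and only meant to show the shape of the idea; none of this is existing pywhy-graphs API.

```python
class DBCUGraph:
    """Hypothetical container holding one edge set per edge type."""

    EDGE_TYPES = ("directed", "bidirected", "circle", "undirected")

    def __init__(self, **edges_by_type):
        unknown = set(edges_by_type) - set(self.EDGE_TYPES)
        if unknown:
            raise ValueError(f"unknown edge types: {sorted(unknown)}")
        # Edge types that are not passed in default to an empty edge set.
        self.edges = {t: set(edges_by_type.get(t, ())) for t in self.EDGE_TYPES}

    def edge_types_in_use(self):
        return tuple(t for t in self.EDGE_TYPES if self.edges[t])


def make_admg(directed=(), bidirected=()):
    # Circle and undirected edge sets stay empty. No validity check is run.
    return DBCUGraph(directed=directed, bidirected=bidirected)


def make_cpdag(directed=(), undirected=()):
    # Bidirected and circle edge sets stay empty. No validity check is run.
    return DBCUGraph(directed=directed, undirected=undirected)


admg = make_admg(directed=[("x", "y")], bidirected=[("y", "z")])
print(admg.edge_types_in_use())  # ('directed', 'bidirected')
```

The wrappers only decide which edge sets are populated; whether the result is actually a valid ADMG or CPDAG would be left to separate functions.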
I propose to start with the following, and will open a PR/RFC to this effect:
- A new DiUnGraph (will take suggestions on names) that contains directed and undirected edge types
- Provide constructors for CPDAG/PDAG and ChainGraph (that construct a valid DiUnGraph of each type)
- Implement functions that check validity of each of these graphs.
Future proposals:
- Create DiBiGraph for ADMGs
- DAGs should use DiGraph. There should be nothing in ADMG that we require for DAGs
- DiBiCiUnGraph (will take suggestions on this name...) for PAGs and extensions.
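As a sketch of the "constructor plus separate validity function" split, assuming a plain dict of edge lists rather than any real graph class: all names below are hypothetical. The check covers only two simplified necessary conditions (no node pair joined by both a directed and an undirected edge, and an acyclic directed part); it is not a complete CPDAG or chain graph validity test.

```python
from graphlib import CycleError, TopologicalSorter


def directed_part_is_acyclic(directed_edges):
    """Return True if the directed edges (u, v), read as u -> v, contain no cycle."""
    predecessors = {}
    for u, v in directed_edges:
        predecessors.setdefault(u, set())
        predecessors.setdefault(v, set()).add(u)
    try:
        # prepare() raises CycleError when the dependency graph has a cycle.
        TopologicalSorter(predecessors).prepare()
    except CycleError:
        return False
    return True


def is_plausible_pdag(directed_edges, undirected_edges):
    """Simplified necessary conditions only (an assumption of this sketch)."""
    directed_pairs = {frozenset(e) for e in directed_edges}
    undirected_pairs = {frozenset(e) for e in undirected_edges}
    if directed_pairs & undirected_pairs:
        return False
    return directed_part_is_acyclic(directed_edges)


def make_pdag(directed_edges=(), undirected_edges=(), check=True):
    """Hypothetical constructor: build the edge lists, optionally validating once."""
    directed, undirected = list(directed_edges), list(undirected_edges)
    if check and not is_plausible_pdag(directed, undirected):
        raise ValueError("edges do not satisfy the simplified PDAG conditions")
    return {"directed": directed, "undirected": undirected}


graph = make_pdag([("a", "b")], [("b", "c")])
```

Passing `check=False` corresponds to the "assume the user knows what they are doing" option, with `is_plausible_pdag` available to call later.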
> As far as names we could always go DUGraph, DBGraph, DBCUGraph. I am still of the opinion that we should create an API so that users don't have to call these graph types directly, but that we have better organization and less duplication of code in the implementation. Best of both worlds, I suppose. Consequently the naming shouldn't be that important.
> It seems like we could create the one DBCUGraph with all four edge types and then for each kind of graph we actually want to consider (PAG, ADMG, DAG, CPDAG etc.) we create a wrapper around DBCUGraph that zeroes out the appropriate edge graphs.
To summarize and clarify, is your proposal open to either of the following?
- a series of base graph classes: DUGraph, DBGraph, DBCUGraph
- one base graph class DBCUGraph
I don't have a great reason to support one over the other currently :p. But see below on some cons of explicitly supporting the class types ADMG, PAG, DAG, CPDAG, etc. Without a check, the graph classes are not guaranteed to be a "valid ADMG/PAG/CPDAG/etc.". This is an issue that was raised in #67.
> If we are seriously following the networkx style of detaching the graphical representation from the operations on the graph (including checks such as validity) then this implementation should suffice, seeing as there should not be much need for any inheritance behaviors.
The reason the networkx design is "nice" imo (but not ideal) is that checking for validity of a certain "type of graph" is quite expensive. E.g. if we want to ensure that a constructed `nx.DiGraph` is a DAG, then running `nx.is_directed_acyclic_graph(self)` every single time an edge is added/removed is not good (note this is what they do in pgmpy's implementation, which is why I'm not a fan). The same logic extends to our cases.
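To illustrate the difference, here is a small sketch using networkx. `EagerDAG` is a hypothetical subclass written for this comparison; it is not how pgmpy or pywhy-graphs is implemented. The only networkx calls used are `nx.DiGraph` and `nx.is_directed_acyclic_graph`.

```python
import networkx as nx


class EagerDAG(nx.DiGraph):
    """Hypothetical subclass that re-checks acyclicity after every add_edge call."""

    def add_edge(self, u, v, **attr):
        super().add_edge(u, v, **attr)
        # A full acyclicity check on every insertion: this is the expensive part.
        # Other mutators (e.g. add_edges_from) would need the same treatment.
        if not nx.is_directed_acyclic_graph(self):
            super().remove_edge(u, v)
            raise ValueError(f"adding {u!r} -> {v!r} would create a cycle")


eager = EagerDAG()
eager.add_edge("a", "b")
eager.add_edge("b", "c")

# networkx-style alternative: build freely, then validate once on demand.
lazy = nx.DiGraph()
lazy.add_edges_from([("a", "b"), ("b", "c")])
lazy_is_dag = nx.is_directed_acyclic_graph(lazy)
```

With the eager class, building a graph with many edges runs a full graph traversal per edge, whereas the lazy version pays for a single traversal when the caller asks for it.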
As an aside, seeing the implementation of CG and CPDAG as a subclass of DUGraph, I'm pondering the tradeoffs here:
- Support explicit causal graph classes: Can we guarantee at all stages of construction and modification of the graph that the graph is "valid"? Can we do this efficiently?
- Don't support explicit causal graph classes: Can we make sure our documentation is clear and an API is very clear on how to construct various causal graphs and optionally check their validity?
If we can't arrive at a consensus here, we can discuss next week, or something?
cc: @robertness @bloebp @kunwuz
After discussion with @jaron-lee, we have decided on the following strategy:
- Implement 4 private abstractions on top of `MixedEdgeGraph` that enable all causal-type graphs: DiBiGraph, DiUnGraph, DiBiUnGraph, DiBiUnCiGraph
- Still support specific causal graphs and keep the API as is... for now, while noting in all the documentation that VALIDITY IS NOT GUARANTEED unless users call specific functions that check it, e.g. `is_valid_cpdag(...)`, `is_valid_pag(...)`, which are still to be implemented
In the longer-term, we can consider removing all the specific causal graph class implementations, so that way it's more like networkx, but in the short-term, we can just rely on this approach to be internally lean, and rely on user-feedback to guide this decision.
Feel free to add anything I missed.
Check bnlearn, and Meek for the CPDAG definition.