Comments (3)
+1 for the tracking idea, that seems very useful. I think having one place to put all the causal assumptions is reasonable and would help an analyst keep track of what they are assuming in a particular problem.
Another con regarding the "melting pot" of assumptions is that it would be quite difficult to know in general if all of the causal assumptions specified are even compatible or consistent with one another. I guess this would fall on the analyst to ensure that they have a valid assumption set.
from dodiscover.
Tracking usage is a nice idea, @adam2392. I like it.
Does it make sense to make the Context class less structured and give a common interface to every assumption? I'm not sure how to explain it without code or pseudocode, so excuse my pseudocode syntax. What do you think of:
BaseAssumption {
AssertedOrConjectured; // some assumptions might be asserted up front by the human, others might be conjectured by an analysis procedure and require attention and testing later
IsUsed; // was used in the current analysis
TestResults; // we might have results of one or more tests (against data, other sources of experimental knowledge, etc.)
SensitivityAnalysisResults; // we might have run sensitivity analyses
}
and then every assumption (about fixed edges, linearity, monotonicity, additivity, etc.) subclasses BaseAssumption:
LinearityAssumption(BaseAssumption) {
IsLinear
}
Context {
DictionaryOfAllAssumptions
...
}
Then the Context class becomes a dictionary or bag of these assumptions (or a wrapper around a dictionary to help with bookkeeping). As we add new algorithms that make new assumptions, we can co-locate the code for the algorithm, its particular assumptions, the tests of those assumptions, etc., without having to modify the Context class and all of the code that uses it.
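To make the pseudocode above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the method names `add`, `use`, and `unused`, the field names, and the string keys are hypothetical and are not part of dodiscover's actual `Context` API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class BaseAssumption:
    """Bookkeeping shared by every assumption (all names are hypothetical)."""

    asserted: bool = True  # True: asserted by the analyst; False: conjectured by a procedure
    is_used: bool = False  # set once an analysis actually relies on the assumption
    test_results: List[Any] = field(default_factory=list)
    sensitivity_results: List[Any] = field(default_factory=list)


@dataclass
class LinearityAssumption(BaseAssumption):
    is_linear: bool = True


class Context:
    """Thin wrapper around a dictionary of assumptions keyed by name."""

    def __init__(self) -> None:
        self._assumptions: Dict[str, BaseAssumption] = {}

    def add(self, name: str, assumption: BaseAssumption) -> None:
        self._assumptions[name] = assumption

    def use(self, name: str) -> BaseAssumption:
        """Fetch an assumption and record that the analysis relied on it."""
        assumption = self._assumptions[name]
        assumption.is_used = True
        return assumption

    def unused(self) -> List[str]:
        """Names of assumptions that no analysis step has requested yet."""
        return [n for n, a in self._assumptions.items() if not a.is_used]


context = Context()
context.add("linearity", LinearityAssumption(is_linear=True))
context.add("conjectured_linearity", LinearityAssumption(asserted=False))
context.use("linearity")
print(context.unused())  # ['conjectured_linearity']
```

Because `Context` only knows about `BaseAssumption`, a new algorithm could ship its own subclass without any change to `Context` itself.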
@jaron-lee good point about consistency. Maybe we can add a consistency test function to each assumption that inspects the rest of the context for contradictions. Most assumptions would probably start off with an empty test, but if we see people misusing them, we will have a place to add code to catch those common errors, at least.
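A rough sketch of what such a per-assumption consistency hook might look like. The classes `FixedEdges` and `ExcludedEdges`, the method `check_consistency`, and the helper `find_contradictions` are hypothetical names chosen for illustration, not existing dodiscover code.

```python
from typing import Dict, List


class BaseAssumption:
    def check_consistency(self, others: Dict[str, "BaseAssumption"]) -> List[str]:
        """Return descriptions of contradictions; the default test is empty."""
        return []


class FixedEdges(BaseAssumption):
    def __init__(self, edges):
        self.edges = set(edges)


class ExcludedEdges(BaseAssumption):
    def __init__(self, edges):
        self.edges = set(edges)

    def check_consistency(self, others):
        # An edge cannot be both required and forbidden.
        problems = []
        for name, other in others.items():
            if isinstance(other, FixedEdges):
                for edge in sorted(self.edges & other.edges):
                    problems.append(f"edge {edge} is excluded here but fixed in '{name}'")
        return problems


def find_contradictions(assumptions: Dict[str, BaseAssumption]) -> List[str]:
    """Let every assumption inspect all the others and collect the complaints."""
    problems = []
    for name, assumption in assumptions.items():
        others = {k: v for k, v in assumptions.items() if k != name}
        problems.extend(assumption.check_consistency(others))
    return problems


assumptions = {
    "fixed": FixedEdges([("x", "y")]),
    "excluded": ExcludedEdges([("x", "y"), ("y", "z")]),
}
print(find_contradictions(assumptions))
```

Only `ExcludedEdges` overrides the hook here, so the conflict is reported once rather than from both sides.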
Also, if assumptions aren't used by an algorithm, we might be able to use some of them as validation checks afterwards. For example, if the user asserts a set of fixed edge constraints and a particular CD algorithm ignores them, then we can have a refutation that checks whether the CD algorithm actually found those fixed edges and warns if that assumption/assertion is being violated.
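A minimal sketch of that kind of after-the-fact refutation, assuming edges are represented as ordered `(source, target)` tuples. The function name `refute_fixed_edges` and the example variable names are hypothetical.

```python
import warnings
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def refute_fixed_edges(fixed_edges: Iterable[Edge], learned_edges: Iterable[Edge]) -> List[Edge]:
    """Return the asserted edges that the learned graph does not contain.

    Edges are treated as directed, so ("a", "b") does not match ("b", "a").
    """
    learned: Set[Edge] = set(learned_edges)
    missing = sorted(e for e in set(fixed_edges) if e not in learned)
    if missing:
        warnings.warn(f"Asserted edges missing from the learned graph: {missing}")
    return missing


fixed = [("smoking", "cancer"), ("age", "cancer")]
learned = [("smoking", "cancer"), ("cancer", "age")]
missing = refute_fixed_edges(fixed, learned)
```

Returning the missing edges as well as warning lets the caller decide whether a violation should stop the analysis or only be reported.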
Another con regarding the "melting pot" of assumptions is that it would be quite difficult to know in general if all of the causal assumptions specified are even compatible or consistent with one another. I guess this would fall on the analyst to ensure that they have a valid assumption set.
Agreed. I suppose that clearly updating the documentation, adding tests between common assumptions, and providing a user-friendly API (i.e. being able to pretty-print all the causal assumptions for easy inspection) is as good as we can get.
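As an illustration of the pretty-printing idea, here is a small sketch that assumes assumptions are stored as dataclasses in a dictionary keyed by name. The `describe` helper and the field names are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class LinearityAssumption:
    is_linear: bool = True
    asserted: bool = True
    is_used: bool = False


def describe(assumptions: dict) -> str:
    """Render one line per assumption, sorted by name, listing every field."""
    lines = []
    for name, assumption in sorted(assumptions.items()):
        details = ", ".join(
            f"{f.name}={getattr(assumption, f.name)!r}" for f in fields(assumption)
        )
        lines.append(f"{name}: {type(assumption).__name__}({details})")
    return "\n".join(lines)


print(describe({"linearity": LinearityAssumption()}))
# linearity: LinearityAssumption(is_linear=True, asserted=True, is_used=False)
```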
@jaron-lee good point about consistency. Maybe we can add a consistency test function to each assumption that inspects the rest of the context for contradictions. Most assumptions would probably start off with an empty test, but if we see people misusing them, we will have a place to add code to catch those common errors, at least.
True, I guess there is no free lunch here. We have to lose out on something. I think transparency and clear documentation are key if we go this route. Perhaps we can discuss this on Monday if the others get a chance to see this.
Also, if assumptions aren't used by an algorithm, we might be able to use some of them as validation checks afterwards. For example, if the user asserts a set of fixed edge constraints and a particular CD algorithm ignores them, then we can have a refutation that checks whether the CD algorithm actually found those fixed edges and warns if that assumption/assertion is being violated.
+1