tobiasmadsen / dgraph
Discrete factor graphs in R
Using reingold.tilford as the layout for the igraph plot only works for trees, not for disconnected graphs.
This is a well-known problem with Rcpp Modules; see the rcpp-devel thread:
http://lists.r-forge.r-project.org/pipermail/rcpp-devel/2014-June/007758.html
The easiest way to reproduce it is:
varDim <- 2
facPot <- list(matrix(0.5,1,2))
facNbs <- list(1)
mydfg <- dfg(varDim, facPot, facNbs)
save.image(".RData")
mydfg <- 2
gctorture()
load(".RData")
mydfg$dfgmodule$resetFactorPotentials( list(matrix(0.5,1,2)) )
Provide facScores argument directly instead of through the foreground model
Enable comparison of two factor graphs with similar structure but different mappings of potentials. Comparisons are performed in tailSaddle, tailIS, and kl.
Refactor the check of similar structure and generate the smallest structure that captures both maps.
Use the potential generators in the optimize functions.
The call to sapply gives a vector of length ncol(data) for a data.frame, but a vector of length nrow(data)*ncol(data) for a matrix:
if( ! suppressWarnings( all( sapply(data, max, na.rm = T) <= dfg$varDim) ) )
stop("Data outside range")
We find the saddlepoint by Newton-Raphson. However, because kappa prime is monotone, we can fall back on bisection if the Newton-Raphson procedure diverges.
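The fallback described above can be sketched in C++ as follows. This is a minimal illustration, not the package's implementation: the function name solveSaddlepoint, its parameters, and the assumption that a bracketing interval [lo, hi] is known are all hypothetical. Because kappa prime is increasing, every evaluation tells us on which side of the root we are, so the bracket can be shrunk each iteration and any Newton step that leaves it is replaced by a bisection step.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Solve kp(theta) = t for an increasing function kp on [lo, hi].
// kp  : kappa prime (first derivative of the cumulant generating function)
// kpp : kappa double prime, used for the Newton step
// All names are hypothetical; nothing here is taken from the dgRaph sources.
double solveSaddlepoint(const std::function<double(double)>& kp,
                        const std::function<double(double)>& kpp,
                        double t, double lo, double hi,
                        double tol = 1e-10, int maxIter = 200) {
    assert(lo < hi);
    assert(kp(lo) <= t && t <= kp(hi));  // the root must be bracketed

    double theta = 0.5 * (lo + hi);
    for (int i = 0; i < maxIter; ++i) {
        const double g = kp(theta) - t;
        if (std::fabs(g) < tol) break;

        // Monotonicity: the sign of g says which half contains the root.
        if (g > 0) hi = theta; else lo = theta;

        // Default to a bisection step; use the Newton step only if it
        // stays strictly inside the current bracket (NaN fails the test).
        double next = 0.5 * (lo + hi);
        const double slope = kpp(theta);
        if (slope > 0) {
            const double newton = theta - g / slope;
            if (newton > lo && newton < hi) next = newton;
        }
        theta = next;
    }
    return theta;
}
```

For a well-behaved kappa prime the loop takes pure Newton steps; the bisection branch only triggers when a step would overshoot the bracket, which is the diverging case mentioned above.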
For continuous distributions, some probabilities might be zero (numeric underflow) in the foreground while non-zero in the background, causing a spurious infinite Kullback-Leibler divergence.
The copy function was made to ensure that each dfg object had its own underlying module. The module is now built as needed (see #8), so the normal assignment can be used instead.
We get an underflow if a variable has a large number of neighbors. Messages are currently normalized only once, after a new message has been computed; they should also be normalized during the computation.
library(dgRaph)
N <- 1000
varDim <- rep(100, N+1)
facPot <- list(matrix(0.01, 100, 100))
facNbs <- lapply(1:N, function(i){c(1,i+1)})
potMap <- rep(1, N)
mydfg <- dfg(varDim, facPot, facNbs, potMap)
data <- data.frame(matrix(rep(1,N+1),1,N+1))
likelihood(data, dfg = mydfg) # Produces NaN
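One way to normalize during the computation is to rescale the running product after every incoming message and keep the logarithm of the removed factors separately. The sketch below is an assumption about how this could look, not the package's code: ScaledProduct and multiplyMessages are hypothetical names, and messages are represented as plain vectors.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Elementwise product of messages, stored as a normalized vector plus the
// log of the factor that was divided out:
//   true product = values * exp(logScale)
// Hypothetical helper; not part of the dgRaph API.
struct ScaledProduct {
    std::vector<double> values;
    double logScale;
};

ScaledProduct multiplyMessages(const std::vector<std::vector<double>>& messages,
                               std::size_t dim) {
    ScaledProduct out{std::vector<double>(dim, 1.0), 0.0};
    for (const std::vector<double>& msg : messages) {
        assert(msg.size() == dim);
        double sum = 0.0;
        for (std::size_t k = 0; k < dim; ++k) {
            out.values[k] *= msg[k];
            sum += out.values[k];
        }
        // Rescale after every multiplication so the entries never drift
        // towards zero. If sum is 0 the product is genuinely zero and is
        // left as it is.
        if (sum > 0.0) {
            for (double& v : out.values) v /= sum;
            out.logScale += std::log(sum);
        }
    }
    return out;
}
```

With the sizes from the example above (1000 messages of dimension 100, every entry 0.01), the plain product 0.01^1000 underflows to 0 in double precision, while logScale stays finite at about 999 * log(0.01) and the normalized entries remain 0.01.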
The choice of alpha is hard for the user to decide beforehand. Some ideas could be
Tail approximation with importance sampling (and possibly also the saddlepoint approximation) gives wrong results with unnormalized foreground and/or background models.
Upon installation on a CentOS server:
g++ -I/com/extra/R/3.1.0/lib64/R/include -DNDEBUG -DDEBUG -I/usr/local/include -I"/home/tm/R/x86_64-unknown-linux-gnu-library/3.1/Rcpp/include" -I"/home/tm/R/x86_64-unknown-linux-gnu-library/3.1/BH/include" -fpic -g -O2 -c DiscreteFactorGraph.cpp -o DiscreteFactorGraph.o
In file included from /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/array:35,
from DiscreteFactorGraph.h:21,
from DiscreteFactorGraph.cpp:6:
/usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/c++0x_warning.h:31:2: error: #error This file requires compiler and library support for the upcoming ISO C++ standard, C++0x. This support is currently experimental, and must be enabled with the -std=c++0x or -std=gnu++0x compiler options.
In file included from DiscreteFactorGraph.cpp:6:
DiscreteFactorGraph.h:148: error: ISO C++ forbids declaration of 'array' with no type
DiscreteFactorGraph.h:148: error: invalid use of '::'
DiscreteFactorGraph.h:148: error: expected ';' before '<' token
DiscreteFactorGraph.h:149: error: ISO C++ forbids declaration of 'array' with no type
DiscreteFactorGraph.h:149: error: invalid use of '::'
DiscreteFactorGraph.h:149: error: expected ';' before '<' token
DiscreteFactorGraph.cpp:835: error: expected constructor, destructor, or type conversion before '<' token
DiscreteFactorGraph.cpp:1207: error: expected '}' at end of input
This can be solved by adding --std=c++0x to PKG_CPPFLAGS in src/Makevars.
Could use bootstrapping?
With commit d636c40, the option to calculate the normalising constant without rerunning the sum-product algorithm was removed. This was done primarily because those functions would fail on disconnected graphs. At the same time, we want to delegate the decision of whether sumProduct needs to be run to the sum-product algorithm itself.
Possible solution:
Have a private boolean indicating whether the internal state (potentials) has changed since sumProduct was last run.
Complications:
The MaxSum algorithm uses the same set of messages for its calculations.
Identify other scenarios where sumProduct needs to be re-run.
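The proposal and its first complication can be sketched together by replacing the boolean with a small state enum that records which algorithm last wrote the shared messages. Everything below is hypothetical: CachedGraph is a stand-in class, and the "algorithms" are placeholders (a sum and a maximum over the potentials) that only serve to show when a re-run is triggered.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for the factor graph class; not the dgRaph API.
class CachedGraph {
public:
    explicit CachedGraph(std::vector<double> potentials)
        : potentials_(std::move(potentials)) {
        assert(!potentials_.empty());
    }

    // Any change to the potentials invalidates the cached messages.
    void resetPotentials(std::vector<double> potentials) {
        assert(!potentials.empty());
        potentials_ = std::move(potentials);
        state_ = MessageState::Stale;
    }

    // Re-runs sum-product only if the messages are not current.
    double normalizingConstant() {
        if (state_ != MessageState::SumProduct) runSumProduct();
        return normConst_;
    }

    // Max-sum overwrites the shared messages, so it also changes the state.
    double maxValue() {
        if (state_ != MessageState::MaxSum) runMaxSum();
        return maxValue_;
    }

    int sumProductRuns() const { return sumProductRuns_; }

private:
    enum class MessageState { Stale, SumProduct, MaxSum };

    void runSumProduct() {
        // Placeholder for the real sum-product pass.
        normConst_ = std::accumulate(potentials_.begin(), potentials_.end(), 0.0);
        ++sumProductRuns_;
        state_ = MessageState::SumProduct;
    }

    void runMaxSum() {
        // Placeholder for the real max-sum pass.
        maxValue_ = *std::max_element(potentials_.begin(), potentials_.end());
        state_ = MessageState::MaxSum;
    }

    std::vector<double> potentials_;
    MessageState state_ = MessageState::Stale;
    double normConst_ = 0.0;
    double maxValue_ = 0.0;
    int sumProductRuns_ = 0;
};
```

Calling normalizingConstant() twice in a row runs the placeholder sum-product pass once; calling resetPotentials() or maxValue() in between forces a re-run. Other invalidating scenarios would each need to set the state to Stale in the same way.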
The saddlepoint approximation method generates random NAs. Reproducible example to come.
Our simulate hides stats::simulate.
library(dgRaph)
dat <- data.frame(x = 1:50, y = rnorm(50,1:50,2))
mylm <- lm(formula = y ~ x, data = dat)
simulate(mylm, nsim = 10) # Does not work
If soft evidence input using dataList does not have the correct length, we might see segfaults. More to come...
Call checkInterrupt in long running loops