mailund / admixture_graph
Module for analysing admixture graphs
Not all journals/proceedings accept colour figures -- and those that do often charge for them -- so it would be good if we could make plots of the data fits in gray scale or black and white.
Hi, looking at the source it seems that the Cholesky factor passed as the "concentration" argument in fit_graph
should be the upper, not lower, triangular factor. Can you please confirm? Thanks.
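For reference when checking this: base R's chol() returns the upper triangular factor U with t(U) %*% U equal to the input, so the lower factor is its transpose. The sketch below uses a made-up 2x2 matrix S, not anything taken from the package, and it does not settle which factor fit_graph expects.

```r
# A small symmetric positive-definite matrix, invented for illustration.
S <- matrix(c(4, 2,
              2, 3), nrow = 2, byrow = TRUE)

U <- chol(S)   # upper triangular: t(U) %*% U reproduces S
L <- t(U)      # lower triangular: L %*% t(L) reproduces S

upper_ok <- isTRUE(all.equal(t(U) %*% U, S))
lower_ok <- isTRUE(all.equal(L %*% t(L), S))

# Multiplying in the wrong order gives a different matrix,
# so the two factors are not interchangeable.
swapped_differs <- !isTRUE(all.equal(U %*% t(U), S))
```

If the code consuming the factor assumes one orientation and receives the other, the result is silently a different matrix rather than an error, which is why the orientation is worth confirming.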
Hi Thomas,
I started using your R package; thanks for making admixture easier.
Sadly, when I try fitting a graph to my f4 data I get the message:
"Something went wrong, trying again."
and I have no idea what's wrong. It prints this dozens of times and then ends with the error:
"Error: C stack usage 7969328 is too close to the limit".
For example, I take one of your graphs and replace the leaf names:
temp <- vector_to_graph(graphs_5_1[9, ])
new_names <- list(B = "OUT", A = "EU1650", D = "AN1950", C = "CH1850", E = "EU1750")
temp <- rename_nodes(temp, new_names)
And then I try the following f4 statistics:
    W   X      Y      Z      D       Z.value
1   OUT CH1850 AN1950 EU1650  0.0053   0.485
2   OUT CH1850 AN1950 EU1750  0.0987  11.552
3   OUT CH1850 AN1950 EU1850  0.0816   9.945
4   OUT CH1850 AN1950 EU1950  0.0359   4.395
5   OUT CH1850 AN1950 CH1950  0.0321   2.732
6   OUT CH1850 AN1950 Admix  -0.0062  -0.697
7   OUT CH1850 EU1650 AN1950 -0.0053  -0.485
8   OUT CH1850 EU1650 EU1750  0.1057   9.860
9   OUT CH1850 EU1650 EU1850  0.0849   8.794
10  OUT CH1850 EU1650 EU1950  0.0314   2.801
11  OUT CH1850 EU1650 CH1950  0.0287   2.123
12  OUT CH1850 EU1650 Admix  -0.0149  -1.260
To fit
f4fit <- fit_graph(f4ex, temp)
I don't know what I am missing; any pointers are welcome.
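One thing visible in the report: the statistics mention populations (EU1850, EU1950, CH1950, Admix) that are not among the five leaves of the renamed graph. Other reports on this page call filter_on_leaves to drop such rows before fitting. The base R sketch below shows the same idea on a few of the rows; f4_rows and graph_leaves are names made up for the example, and whether this resolves the error is not confirmed here.

```r
# Leaves of the renamed graph, taken from the rename_nodes() call in the report.
graph_leaves <- c("OUT", "EU1650", "AN1950", "CH1850", "EU1750")

# A few of the reported rows (f4_rows is a hypothetical variable name).
f4_rows <- data.frame(
  W = "OUT", X = "CH1850",
  Y = c("AN1950", "AN1950", "AN1950", "EU1650"),
  Z = c("EU1650", "EU1750", "EU1850", "Admix"),
  D = c(0.0053, 0.0987, 0.0816, -0.0149),
  Z.value = c(0.485, 11.552, 9.945, -1.260),
  stringsAsFactors = FALSE
)

# Keep a row only if all four populations are leaves of the graph.
in_graph <- f4_rows$W %in% graph_leaves & f4_rows$X %in% graph_leaves &
  f4_rows$Y %in% graph_leaves & f4_rows$Z %in% graph_leaves
f4_kept <- f4_rows[in_graph, ]
```

Here the rows involving EU1850 and Admix are dropped and the other two are kept.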
@KalleLeppala, when I run this code on the baboon data I get an error in the fitting code. Some months ago I could fit the graph but apparently not any longer. Do you have any idea what is going on?
leaves <- c("rheMac2","P.cynocephalus", "P.ursinus", "P.kindae", "P.hamadryas", "P.anubis", "P.papio")
inner_nodes <- c("R", "x", "y", "z", "w", "anc.kindae", "anc.ursinus", "anc.hamadryas", "anc.anubis", "anc.papio")
edges <- parent_edges(c(
  edge("rheMac2", "R"),
  edge("x", "R"),
  edge("y", "x"),
  edge("z", "w"),
  edge("w", "x"),
  edge("P.kindae", "anc.kindae"),
  admixture_edge("anc.kindae", "anc.ursinus", "w"),
  edge("anc.ursinus", "y"),
  edge("P.cynocephalus", "y"),
  edge("P.ursinus", "anc.ursinus"),
  edge("P.hamadryas", "anc.hamadryas"),
  edge("anc.hamadryas", "z"),
  edge("P.papio", "anc.papio"),
  edge("anc.papio", "z"),
  edge("P.anubis", "anc.anubis"),
  admixture_edge("anc.anubis", "anc.hamadryas", "anc.papio")
))
admixtures <- admixture_proportions(c(
  admix_props("anc.kindae", "anc.ursinus", "w", "a"),
  admix_props("anc.anubis", "anc.hamadryas", "anc.papio", "b")
))
kindae_admixed_graph_u3_anubis_papio_hamadryas <- agraph(leaves, inner_nodes, edges, admixtures)
plot(kindae_admixed_graph_u3_anubis_papio_hamadryas, show_admixture_labels = TRUE)
posf4 %>% filter_on_leaves(kindae_admixed_graph_u3_anubis_papio_hamadryas) %>%
  fit_graph(kindae_admixed_graph_u3_anubis_papio_hamadryas, options) -> fit_kindae_admixed_graph_u3_anubis_papio_hamadryas
There are test files and test data in the top-level directory, which devtools::check() doesn't like.
After a bit of cleanup, move the tests to the test directory and the data to the data directory.
This is probably a bug that was introduced when I fixed the left/right drawing of admixture edges. Right now it means that it flips the admixture proportions when plotting.
Given a covariance matrix, rooted in an arbitrary leaf, as in Felsenstein chapter 23, fit the edge lengths to it. This just means fitting F2 statistics, essentially, and shouldn't be much different from what we do now with F4 and F3 statistics.
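As a sketch of why this reduces to F2 statistics: assuming the matrix C holds covariances of allele-frequency differences measured from the chosen root leaf, the F2 statistic between two other leaves i and j is C[i, i] + C[j, j] - 2 * C[i, j]. The function name covariance_to_f2 and the example matrix are made up for illustration.

```r
# Convert a covariance matrix rooted in a reference leaf into pairwise F2 values,
# using F2(i, j) = C[i, i] + C[j, j] - 2 * C[i, j].
covariance_to_f2 <- function(C) {
  v <- diag(C)
  f2 <- outer(v, v, "+") - 2 * C
  dimnames(f2) <- dimnames(C)
  f2
}

# Invented covariances for two leaves A and B relative to the root leaf.
C <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE,
            dimnames = list(c("A", "B"), c("A", "B")))
f2 <- covariance_to_f2(C)
```

The diagonal of C gives F2 between each leaf and the root leaf directly, so the full set of F2 statistics can be recovered from C and fitted like the other statistics.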
There shouldn't really be a need to provide the same information twice, but right now we duplicate all admixture edge specifications to provide them first for the edges and then for the admixture proportions.
Hello! I am trying to run fit_graph.
I have:
head(dstat)
W X Y Z D Z.value
1 gp1 gp2 gp3 gp4 0.0068 1.698
I have essentially all possible D statistics, and I use:
dsubsample <- filter_on_leaves(dstat, unadmix_graph)
graph_fit <- fit_graph(unadmix_graph, dsubsample)
Even though I filter on the graph's leaves, as suggested in one of the issues, I still get:
Something went wrong, trying again.
Error: C stack usage 7969252 is too close to the limit
Any clue why?
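One difference from the other reports on this page: there the data frame of statistics is the first argument to fit_graph and the graph is the second, while the call above passes them the other way round. Whether that is the cause here is not confirmed. A cheap guard before fitting, sketched in base R with a hypothetical helper name, is to check that the object passed as data really is a table of statistics:

```r
# Hypothetical helper: TRUE if x looks like a table of f4/D statistics,
# i.e. a data frame with the columns shown in head(dstat) above.
looks_like_statistics <- function(x) {
  is.data.frame(x) && all(c("W", "X", "Y", "Z", "D") %in% names(x))
}

# The row shown in the report.
dstat_example <- data.frame(W = "gp1", X = "gp2", Y = "gp3", Z = "gp4",
                            D = 0.0068, Z.value = 1.698)

# A plain list used as a stand-in for a graph object.
not_statistics <- list(leaves = c("gp1", "gp2", "gp3", "gp4"))

data_check <- looks_like_statistics(dstat_example)    # the data frame passes
graph_check <- looks_like_statistics(not_statistics)  # the stand-in does not
```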
Export to Patterson's qpGraph format. An example can be found at https://github.com/DReichLab/AdmixTools/blob/master/examples.qpGraph/gr1x
I have found possible errors when exporting graphs that involve admixture to qpGraph format. If I do the following:
export_to_qpGraph(f = "test.qpgraph",graph = vector_to_graph(graphs_3_1[1,]))
The test.qpgraph text will look like this:
root R
label A A
label B B
label C C
edge edge_inner_C_A inner_C A
edge edge_R_B R B
edge edge_admix_p1_C admix_p1 C
edge edge_inner_p1_inner_C inner_p1 inner_C
edge edge_R_inner_p1 R inner_p1
edge edge_inner_C_left_admix_p1 inner_C_left admix_p1
edge edge_inner_p1_right_admix_p1 inner_p1_right admix_p1
admix admix_p1 inner_C_left inner_p1_right 50 50
But there are two possible mistakes:
If I correct it to this, it will be compatible with qpGraph:
root R
label A A
label B B
label C C
edge edge_inner_C_A inner_C A
edge edge_R_B R B
edge edge_admix_p1_C admix_p1 C
edge edge_inner_p1_inner_C inner_p1 inner_C
edge edge_R_inner_p1 R inner_p1
edge edge_inner_p1_right_admix_p1 inner_p1 inner_p1_right
admix admix_p1 inner_C inner_p1_right 50 50
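A structural check can flag this kind of problem before running qpGraph. The sketch below assumes that an edge line reads "edge <name> <parent> <child>" and an admix line reads "admix <child> <parent1> <parent2> <weights>", which is how the two listings above appear to be laid out. dangling_parents is a hypothetical helper, not part of the package. It reports nodes that are used as parents but are neither the root nor the child of any edge or admix line.

```r
dangling_parents <- function(lines) {
  fields <- strsplit(trimws(lines), "[[:space:]]+")
  kind <- vapply(fields, function(f) f[1], character(1))
  edges <- fields[which(kind == "edge")]
  admix <- fields[which(kind == "admix")]
  root <- unlist(lapply(fields[which(kind == "root")], function(f) f[2]))
  parents <- c(unlist(lapply(edges, function(f) f[3])),
               unlist(lapply(admix, function(f) f[3:4])))
  children <- c(unlist(lapply(edges, function(f) f[4])),
                unlist(lapply(admix, function(f) f[2])))
  setdiff(unique(parents), c(children, root))
}

# The exported file from the report.
exported <- c(
  "root R", "label A A", "label B B", "label C C",
  "edge edge_inner_C_A inner_C A",
  "edge edge_R_B R B",
  "edge edge_admix_p1_C admix_p1 C",
  "edge edge_inner_p1_inner_C inner_p1 inner_C",
  "edge edge_R_inner_p1 R inner_p1",
  "edge edge_inner_C_left_admix_p1 inner_C_left admix_p1",
  "edge edge_inner_p1_right_admix_p1 inner_p1_right admix_p1",
  "admix admix_p1 inner_C_left inner_p1_right 50 50"
)

# The hand-corrected file from the report.
corrected <- c(
  "root R", "label A A", "label B B", "label C C",
  "edge edge_inner_C_A inner_C A",
  "edge edge_R_B R B",
  "edge edge_admix_p1_C admix_p1 C",
  "edge edge_inner_p1_inner_C inner_p1 inner_C",
  "edge edge_R_inner_p1 R inner_p1",
  "edge edge_inner_p1_right_admix_p1 inner_p1 inner_p1_right",
  "admix admix_p1 inner_C inner_p1_right 50 50"
)

dangling_exported <- dangling_parents(exported)
dangling_corrected <- dangling_parents(corrected)
```

On the exported file the check reports inner_C_left and inner_p1_right, and on the corrected file it reports nothing. It only tests connectivity, so it says nothing about whether qpGraph accepts the file in other respects.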
The generic plot function doesn't need to know "the order of leaves" but the graph plotting function does. This breaks the interface for that function. Find another way of providing that info.
Sometimes you end up with a multiplication by NA, and the resulting expression ends with "*".
I added an admixture to a graph using add_an_admixture2, which gave me a new list of graphs. But I found redundancy in this list; see below:
example_graph <- vector_to_graph(graphs_3_1[1, ])
example_list_3 <- add_an_admixture2(example_graph, "B")
plot(example_list_3[[1]])
plot(example_list_3[[2]])
The 1st and 2nd graphs look exactly the same. I compared their graph$parents data frames, and they were also identical. So are they redundant? If so, is there a way to keep only the unique graphs in this list?
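A minimal sketch of one way to drop exact duplicates, assuming each graph object carries a parents element as described above. unique_by_parents is a hypothetical helper, and the stand-in objects below are plain lists rather than real graph objects.

```r
# Keep the first graph of each group whose `parents` elements are identical.
unique_by_parents <- function(graphs) {
  kept <- list()
  for (g in graphs) {
    seen <- any(vapply(kept,
                       function(k) identical(k$parents, g$parents),
                       logical(1)))
    if (!seen) kept[[length(kept) + 1]] <- g
  }
  kept
}

# Stand-in objects: the first two share the same parents matrix.
p1 <- matrix(c(FALSE, TRUE, FALSE, FALSE), nrow = 2)
p2 <- matrix(c(FALSE, FALSE, TRUE, FALSE), nrow = 2)
graphs <- list(list(parents = p1), list(parents = p1), list(parents = p2))
unique_graphs <- unique_by_parents(graphs)
```

This only removes graphs whose parents elements match exactly. Two graphs that are the same up to renaming of inner nodes would both be kept.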
Make plots showing how various values for admixture proportions affect the fit.
When plotting, the admixture edges tend to cross because the ancestors are picked based on their order in the nodes list, not on whether they are to the left or to the right. That should be easy enough to fix.
When the iteration count is exceeded we get a warning but there is no way to increase the iteration count. Shouldn't there be an option for doing that?
Pull out all the trees embedded in a graph.
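One way to sketch this: every admixture node has two parents, and choosing one parent for each admixture node gives a tree, so a graph with k admixture nodes embeds 2^k trees (not necessarily all distinct in topology). The base R sketch below works on a plain edge list with child and parent columns rather than on the package's graph objects. embedded_trees is a hypothetical helper, and it does not remove the inner nodes left with a single child.

```r
# edges: data frame with character columns `child` and `parent`;
# an admixture node appears as `child` in two rows.
embedded_trees <- function(edges) {
  n_parents <- table(edges$child)
  admixed <- names(n_parents)[n_parents > 1]
  tree_edges <- edges[!(edges$child %in% admixed), , drop = FALSE]
  if (length(admixed) == 0) return(list(tree_edges))

  # One column per admixture node, one row per combination of parent choices.
  choices <- lapply(admixed, function(node) edges$parent[edges$child == node])
  names(choices) <- admixed
  combos <- expand.grid(choices, stringsAsFactors = FALSE)

  lapply(seq_len(nrow(combos)), function(i) {
    picked <- data.frame(
      child = admixed,
      parent = unlist(combos[i, , drop = FALSE], use.names = FALSE),
      stringsAsFactors = FALSE
    )
    rbind(tree_edges, picked)
  })
}

# Invented example: leaf C hangs off admixture node m, whose parents are x and y.
edges <- data.frame(
  child  = c("A", "B", "C", "x", "y", "m", "m"),
  parent = c("x", "y", "m", "R", "R", "x", "y"),
  stringsAsFactors = FALSE
)
trees <- embedded_trees(edges)
```

For this example there is one admixture node, so two trees come out: one where m descends from x and one where it descends from y.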