mailund / admixture_graph Goto Github PK

Module for analysing admixture graphs

R 100.00%

admixture_graph's Issues

Black and white fitting plots

Not all journals and proceedings accept colour figures, and those that do often charge for them, so it would be good if we could plot the data fits in grayscale or black and white.

orientation of Cholesky factor

Hi, looking at the source it seems that the Cholesky factor passed as the "concentration" argument in fit_graph should be the upper, not lower, triangular factor. Can you please confirm? Thanks.
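
For reference, base R's chol() returns the upper triangular factor, so the two orientations can be told apart with a quick check like the one below. The matrix S holds made-up values, and the snippet only illustrates the two conventions; which one fit_graph expects is the open question here.

```r
# A small symmetric positive definite matrix (made-up values for illustration).
S <- matrix(c(4, 2, 2, 3), nrow = 2)

U <- chol(S)  # base R returns the upper triangular factor
L <- t(U)     # the lower triangular factor is its transpose

# Upper convention: S = t(U) %*% U. Lower convention: S = L %*% t(L).
upper_ok <- isTRUE(all.equal(t(U) %*% U, S))
lower_ok <- isTRUE(all.equal(L %*% t(L), S))

# The entry below the diagonal of U is zero, which shows its orientation.
below_diagonal_is_zero <- U[2, 1] == 0
```

If a factor comes from another source, checking which triangle is zero in the same way shows which orientation it has.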

Fitting graph, "Something went wrong, trying again."

Hi Thomas,

I started using your R package; thanks for making admixture easier.

Sadly, when I try fitting a graph to my f4 data, it prints:
"Something went wrong, trying again."
and I have no idea what's wrong. It prints this tens of times and then ends with the error:
"Error: C stack usage 7969328 is too close to the limit".

For example, I am using one of your graphs and replacing the leaf names:

  temp <- vector_to_graph(graphs_5_1[9, ])
  new_names <- list(B = "OUT", A = "EU1650", D = "AN1950", C = "CH1850", E = "EU1750")
  temp <- rename_nodes(temp, new_names)

And then I try the following F4 statistics:

     W      X      Y      Z       D Z.value
1  OUT CH1850 AN1950 EU1650  0.0053   0.485
2  OUT CH1850 AN1950 EU1750  0.0987  11.552
3  OUT CH1850 AN1950 EU1850  0.0816   9.945
4  OUT CH1850 AN1950 EU1950  0.0359   4.395
5  OUT CH1850 AN1950 CH1950  0.0321   2.732
6  OUT CH1850 AN1950  Admix -0.0062  -0.697
7  OUT CH1850 EU1650 AN1950 -0.0053  -0.485
8  OUT CH1850 EU1650 EU1750  0.1057   9.860
9  OUT CH1850 EU1650 EU1850  0.0849   8.794
10 OUT CH1850 EU1650 EU1950  0.0314   2.801
11 OUT CH1850 EU1650 CH1950  0.0287   2.123
12 OUT CH1850 EU1650  Admix -0.0149  -1.260

To fit

f4fit <- fit_graph(f4ex, temp)

I don't know what I am missing; any pointers are welcome.
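
One thing that may be worth checking: the table mentions populations (EU1850, EU1950, CH1950, Admix) that are not among the five renamed leaves. Another report on this page uses filter_on_leaves to drop such rows before fitting. The base R sketch below shows the same idea on the first three rows of the table. keep_rows_on_leaves is a hypothetical helper, not part of the package, and whether the extra populations are actually the cause of the error is an assumption.

```r
# Leaves of the renamed graph, as listed in new_names.
graph_leaves <- c("OUT", "EU1650", "AN1950", "CH1850", "EU1750")

# First three rows of the reported table.
f4ex <- data.frame(
  W = c("OUT", "OUT", "OUT"),
  X = c("CH1850", "CH1850", "CH1850"),
  Y = c("AN1950", "AN1950", "AN1950"),
  Z = c("EU1650", "EU1750", "EU1850"),
  D = c(0.0053, 0.0987, 0.0816),
  Z.value = c(0.485, 11.552, 9.945),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only rows whose four populations are all graph leaves.
keep_rows_on_leaves <- function(stats, leaves) {
  on_graph <- stats$W %in% leaves & stats$X %in% leaves &
    stats$Y %in% leaves & stats$Z %in% leaves
  stats[on_graph, , drop = FALSE]
}

f4_on_graph <- keep_rows_on_leaves(f4ex, graph_leaves)
```

Here the third row is dropped because EU1850 is not a leaf of the graph.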

Bug with fitting

@KalleLeppala, when I run this code on the baboon data I get an error in the fitting code. Some months ago I could fit the graph but apparently not any longer. Do you have any idea what is going on?

leaves <- c("rheMac2","P.cynocephalus",  "P.ursinus", "P.kindae", "P.hamadryas", "P.anubis", "P.papio")
inner_nodes <- c("R", "x", "y", "z", "w", "anc.kindae", "anc.ursinus", "anc.hamadryas", "anc.anubis", "anc.papio")

edges <- parent_edges(c(
  edge("rheMac2", "R"),
  edge("x", "R"),
  edge("y", "x"),
  edge("z", "w"),edge("w", "x"),
  edge("P.kindae", "anc.kindae"), 
  admixture_edge("anc.kindae", "anc.ursinus", "w"), edge("anc.ursinus", "y"),
  edge("P.cynocephalus", "y"), 
  edge("P.ursinus", "anc.ursinus"), 
  edge("P.hamadryas", "anc.hamadryas"), edge("anc.hamadryas", "z"),
  edge("P.papio", "anc.papio"), edge("anc.papio", "z"),
  edge("P.anubis", "anc.anubis"),
  admixture_edge("anc.anubis", "anc.hamadryas", "anc.papio")
))
admixtures <- admixture_proportions(c(
  admix_props("anc.kindae", "anc.ursinus", "w", "a"),
  admix_props("anc.anubis", "anc.hamadryas", "anc.papio", "b")
))

kindae_admixed_graph_u3_anubis_papio_hamadryas <- agraph(leaves, inner_nodes, edges, admixtures)
plot(kindae_admixed_graph_u3_anubis_papio_hamadryas, show_admixture_labels = TRUE)

posf4 %>% filter_on_leaves(kindae_admixed_graph_u3_anubis_papio_hamadryas) %>%
  fit_graph(kindae_admixed_graph_u3_anubis_papio_hamadryas, options) -> fit_kindae_admixed_graph_u3_anubis_papio_hamadryas

Get rid of the top-level files

There are test files and test data at the top level directory which devtools::check() doesn't like.

Move tests to the test directory and data to the data directory after a bit of cleanup.

Sometimes the admixture proportions are plotted along the wrong edge

This is probably a bug that was introduced when I fixed the left/right drawing of admixture edges. Right now it means that it flips the admixture proportions when plotting.

Add a function that fits a graph to a covariance matrix

Given a covariance matrix, rooted in an arbitrary leaf, as in Felsenstein chapter 23, fit the edge lengths to it. This just means fitting F2 statistics, essentially, and shouldn't be much different from what we do now with F4 and F3 statistics.
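
A minimal sketch of what this could look like for a tree without admixture, using ordinary least squares. The tree, the edge names, and the edge lengths are made up for illustration, and this is not how the package fits graphs. It only shows that each covariance entry, rooted in leaf O, is a sum of edge lengths, so the edges can be recovered from a linear system.

```r
# Hypothetical tree: outgroup O and leaf A attach to inner node x,
# leaves B and C attach to inner node y, and x connects to y.
# Edges: e1 = O-x, e2 = x-A, e3 = x-y, e4 = y-B, e5 = y-C.
# Rooted in O, the covariance of two leaves is the length of the
# path shared by their two paths from O.
design <- rbind(
  AA = c(1, 1, 0, 0, 0),
  BB = c(1, 0, 1, 1, 0),
  CC = c(1, 0, 1, 0, 1),
  AB = c(1, 0, 0, 0, 0),
  AC = c(1, 0, 0, 0, 0),
  BC = c(1, 0, 1, 0, 0)
)
colnames(design) <- paste0("e", 1:5)

# Made-up edge lengths and the covariance entries they imply.
true_lengths <- c(0.1, 0.2, 0.3, 0.4, 0.5)
observed <- setNames(as.vector(design %*% true_lengths), rownames(design))

# Least-squares fit of the edge lengths (six equations, five unknowns).
fitted_lengths <- qr.solve(design, observed)

# The link to F2: F2(A, B) = Cov(A, A) + Cov(B, B) - 2 * Cov(A, B),
# which is the path length between A and B (e2 + e3 + e4).
f2_AB <- observed[["AA"]] + observed[["BB"]] - 2 * observed[["AB"]]
```

With admixture, the entries would also depend on the admixture proportions, so the system is no longer linear in all parameters.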

Extract the admixture proportions parameter from the edges specification

There shouldn't really be a need to provide the same information twice, but right now we duplicate all admixture edge specifications to provide them first for the edges and then for the admixture proportions.
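
A rough sketch of the idea, using plain data frames rather than the package's own objects. The column names and the two derived tables are hypothetical. The node and parameter names are borrowed from the baboon graph in another report on this page.

```r
# Hypothetical single specification: one record per admixture event.
admixture_events <- data.frame(
  child   = c("anc.kindae", "anc.anubis"),
  parent1 = c("anc.ursinus", "anc.hamadryas"),
  parent2 = c("w", "anc.papio"),
  prop    = c("a", "b"),
  stringsAsFactors = FALSE
)

# Edge view derived from the records: two parent edges per admixed node.
admixture_edges <- data.frame(
  child  = rep(admixture_events$child, times = 2),
  parent = c(admixture_events$parent1, admixture_events$parent2),
  stringsAsFactors = FALSE
)

# Proportion view derived from the same records, so nothing is typed twice.
admixture_props <- admixture_events[, c("child", "parent1", "parent2", "prop")]
```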

Something went wrong, trying again

Hello! I am trying to run fit_graph.

I have:

 head(dstat)
     W    X    Y    Z       D Z.value
1 gp1 gp2 gp3 gp4  0.0068   1.698

I essentially have all possible dstats.

and I use:


dsubsample <- filter_on_leaves(dstat, unadmix_graph)

graph_fit <- fit_graph(unadmix_graph, dsubsample )

Even after filtering as suggested in one of the other issues, I still get:

Something went wrong, trying again.
Error: C stack usage  7969252 is too close to the limit

Any clue why?
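
One detail that may be worth checking: the other reports on this page call fit_graph with the statistics first and the graph second, as in fit_graph(f4ex, temp), whereas the call here passes the graph first. Assuming that order is what fit_graph expects, a small guard like the one below would catch a swapped call before fitting. check_f_statistics is a hypothetical helper, not part of the package, and the required column names are taken from the head(dstat) output.

```r
# Hypothetical guard: verify that an object looks like a table of statistics.
check_f_statistics <- function(x) {
  required <- c("W", "X", "Y", "Z", "D")
  if (!is.data.frame(x)) {
    stop("expected a data frame of statistics; were the arguments swapped?")
  }
  missing_cols <- setdiff(required, names(x))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# The row shown in the report passes the check.
dstat <- data.frame(W = "gp1", X = "gp2", Y = "gp3", Z = "gp4",
                    D = 0.0068, Z.value = 1.698)
data_ok <- check_f_statistics(dstat)

# A stand-in for a graph object (not a data frame) is rejected.
not_data <- list(leaves = c("gp1", "gp2", "gp3", "gp4"))
swapped_message <- tryCatch(
  check_f_statistics(not_data),
  error = function(e) conditionMessage(e)
)
```

Under the same assumption, the call would read fit_graph(dsubsample, unadmix_graph).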

Export to qpGraph format

Export to Patterson's qpGraph format. An example can be found at https://github.com/DReichLab/AdmixTools/blob/master/examples.qpGraph/gr1x

possible mistakes in export_to_qpGraph

I have found possible errors when exporting graphs to the qpGraph format when the graphs involve admixture. If I do the following:

export_to_qpGraph(f = "test.qpgraph",graph = vector_to_graph(graphs_3_1[1,]))

The test.qpgraph text will look like this:

root R

label A A
label B B
label C C

edge edge_inner_C_A inner_C A
edge edge_R_B R B
edge edge_admix_p1_C admix_p1 C
edge edge_inner_p1_inner_C inner_p1 inner_C
edge edge_R_inner_p1 R inner_p1
edge edge_inner_C_left_admix_p1 inner_C_left admix_p1
edge edge_inner_p1_right_admix_p1 inner_p1_right admix_p1

admix admix_p1 inner_C_left inner_p1_right 50 50

But there are two possible mistakes:

1. We do not need edges to represent the admixture; it is conveyed by the "admix" line.
2. We are missing an edge connecting "inner_p1" to the node "inner_p1_right" that contributes to "admix_p1".

If I correct it to this, it will be compatible with qpGraph:

root R

label A A
label B B
label C C

edge edge_inner_C_A inner_C A
edge edge_R_B R B
edge edge_admix_p1_C admix_p1 C
edge edge_inner_p1_inner_C inner_p1 inner_C
edge edge_R_inner_p1 R inner_p1
edge edge_inner_p1_right_admix_p1 inner_p1 inner_p1_right

admix admix_p1 inner_C inner_p1_right 50 50

Fix plot() interface

The generic plot function doesn't need to know "the order of leaves" but the graph plotting function does. This breaks the interface for that function. Find another way of providing that info.

Canonise expressions doesn't work

Sometimes you end up with multiplication with NA and ending with "*".

redundant graphs with add_an_admixture2

I added an admixture to a graph using add_an_admixture2, getting a new list of graphs. But I found redundancy in this list, see below:

example_graph <- vector_to_graph(graphs_3_1[1, ])
example_list_3 <- add_an_admixture2(example_graph, "B")
plot(example_list_3[[1]])
plot(example_list_3[[2]])

The 1st and 2nd graphs look exactly the same. I compared their graph$parents data frames, and they were also identical. So are they redundant? If so, is there a way to keep only the unique graphs in this list?
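
Until the package offers something for this, one possible workaround is to filter the list by comparing the parents component of each graph, as in the sketch below. keep_unique_graphs is a hypothetical helper, not part of the package, and the mock graphs are plain lists standing in for real graph objects. It only detects graphs whose parents components are exactly identical, so graphs that differ merely in the naming of inner nodes would still be kept.

```r
# Hypothetical helper: keep the first graph of each group that shares
# an identical 'parents' component.
keep_unique_graphs <- function(graphs) {
  unique_graphs <- list()
  for (g in graphs) {
    seen <- any(vapply(
      unique_graphs,
      function(u) identical(u$parents, g$parents),
      logical(1)
    ))
    if (!seen) {
      unique_graphs[[length(unique_graphs) + 1]] <- g
    }
  }
  unique_graphs
}

# Mock graphs for illustration: g1 and g2 share the same parents matrix.
g1 <- list(parents = matrix(c(FALSE, TRUE, FALSE, FALSE), nrow = 2))
g2 <- list(parents = matrix(c(FALSE, TRUE, FALSE, FALSE), nrow = 2))
g3 <- list(parents = matrix(c(FALSE, FALSE, TRUE, FALSE), nrow = 2))

unique_list <- keep_unique_graphs(list(g1, g2, g3))
```

With the list from the example, the call would be keep_unique_graphs(example_list_3).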
