hdng / clonevol

Inferring and visualizing clonal evolution in multi-sample cancer sequencing

License: GNU General Public License v3.0

R 66.70% Shell 33.30%

clonevol's Issues

Monoclonal Fishplot Failure

Hello,
My data are similar to the clonevol example dataset, but I am not able to create a fish plot like the one shown in the README.
The problem is that infer.clonal.models() rescales low VAF values so that they sum to 100% (see figure).

How can I change the behaviour of infer.clonal.models() so that I can create a fish plot as in the example?

Thanks in advance.

error while running infer.clonal.models()

Hello,
I get the following error when running infer.clonal.models(). Can you please help me fix this? I have attached the input data.txt file for your reference.

x <- read.table('data.txt', header = TRUE, sep = '\t')  # attached input; the tab-separated format is an assumption

vaf.col.names <- grep('.vaf', colnames(x), value=T)
sample.names <- gsub('.vaf', '', vaf.col.names)
x[, sample.names] <- x[, vaf.col.names]
vaf.col.names <- sample.names
sample.groups <- c("G","D","B","R")
names(sample.groups) <- vaf.col.names
x <- x[order(x$cluster),]
clone.colors <- NULL
pp <- variant.box.plot(x,
                       cluster.col.name = 'cluster',
                       show.cluster.size = FALSE,
                       cluster.size.text.color = 'blue',
                       vaf.col.names = vaf.col.names,
                       vaf.limits = 70,
                       sample.title.size = 20,
                       violin = FALSE,
                       box = FALSE,
                       jitter = TRUE,
                       jitter.shape = 1,
                       jitter.color = clone.colors,
                       jitter.size = 3,
                       jitter.alpha = 1,
                       jitter.center.method = 'median',
                       jitter.center.size = 1,
                       jitter.center.color = 'darkgray',
                       jitter.center.display.value = 'none',
                       highlight = 'is.driver',
                       highlight.note.col.name = 'gene',
                       highlight.note.size = 2,
                       highlight.shape =16,
                       order.by.total.vaf = FALSE
)

> y = infer.clonal.models( variants=x,
+                         cluster.col.name = 'cluster',
+                         vaf.col.names = vaf.col.names,
+                         sample.groups = sample.groups,
+                         subclonal.test = 'bootstrap',
+                         subclonal.test.model = 'non-parametric',
+                         num.boots = 1000,
+                         #founding.cluster = '1',
+                         #cluster.center = 'mean',
+                         #ignore.clusters = NULL,
+                         clone.colors = clone.colors,
+                         min.cluster.vaf = 0.01,
+                         sum.p = 0.05,
+                         alpha = 0.05,
+                         ignore.clusters=T)
Sample 1: G <-- G
Sample 2: D <-- D
Sample 3: B <-- B
Sample 4: R <-- R
Using monoclonal model
Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
G : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 1,2,3 
User ignored clusters:   
G : 3 clonal architecture model(s) found

D : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 1,3 
User ignored clusters:   
D : 1 clonal architecture model(s) found

B : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 2,3 
User ignored clusters:   
B : 1 clonal architecture model(s) found

R : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 1,2 
User ignored clusters:   
R : 1 clonal architecture model(s) found

Finding consensus models across samples...
Found  3 consensus model(s)
Generating consensus clonal evolution trees across samples...
Error in merge.clone.trees(m, samples = samples, sample.groups, merge.similar.samples = merge.similar.samples) : 
  ERROR: Something wrong. No clones left after filter. They might have been excluded.

Thanks,
Gunjan

Error: 'createFishPlotObjects' is not an exported object from 'namespace:clonevol'

I installed clonevol from GitHub and want to use it with fishplot:

> clonevol::createFishPlotObjects
Error: 'createFishPlotObjects' is not an exported object from 'namespace:clonevol'

Thanks.
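
A quick way to see whether a function is exported by an installed package is sketched below. To keep the sketch runnable without clonevol, it is demonstrated on the base `stats` package; replacing `pkg` with `'clonevol'` would run the same check against whatever version is installed. Whether that version contains `createFishPlotObjects` at all is not something this sketch can tell in advance.

```r
# Demonstrated on the base 'stats' package so the sketch runs without clonevol.
pkg <- 'stats'

# Names that '::' can reach.
exported <- getNamespaceExports(pkg)
print('median' %in% exported)

# The function name from the error message; for 'stats' this is FALSE.
print('createFishPlotObjects' %in% exported)

# Everything defined in the namespace, exported or not.
all.objects <- ls(asNamespace(pkg), all.names = TRUE)
internal <- setdiff(all.objects, exported)
print(head(sort(internal)))
```

If a name appears in `all.objects` but not in `exported`, it exists in the package but can only be reached with `:::`; if it appears in neither, the installed version does not define it.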

estimate.clone.vaf

I think I'm missing something really basic here, but I can't get this function to work: R does not find it.

Error in estimate.clone.vaf(clonevol_model$variants, vaf.col.names = vaf.col.names) :
could not find function "estimate.clone.vaf"

Thanks in advance!

Input file from PhyloWGS

In Figure 1, it is written that output from PhyloWGS can be visualized in ClonEvol.

But which PhyloWGS output file(s) are needed?

Errors while plotting the tree

Hi Ha,

Archive.zip
Thanks for distributing the software. I appreciate your generosity.
I generated the input for ClonEvol using PyClone (attached here) and ran the scripts following your instructions up to step 4, shown below, without any errors (the RData file is also attached).

y = infer.clonal.models(variants = x,
cluster.col.name = 'cluster',
vaf.col.names = vaf.col.names,
sample.groups = sample.groups,
cancer.initiation.model='monoclonal',
subclonal.test = 'bootstrap',
subclonal.test.model = 'non-parametric',
num.boots = 1000,
founding.cluster = 1,
cluster.center = 'mean',
ignore.clusters = NULL,
clone.colors = clone.colors,
min.cluster.vaf = 0.01,
sum.p = 0.05,
alpha = 0.05)
Sample 1: YLR107.P <-- YLR107.P
Sample 2: YLR107.IR1 <-- YLR107.IR1
Sample 3: YLR107.IR <-- YLR107.IR
Sample 4: YLR107.P1 <-- YLR107.P1
Sample 5: YLR107.IR2 <-- YLR107.IR2
Using monoclonal model
Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
YLR107.P : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 5,0
YLR107.P : 6 clonal architecture model(s) found
YLR107.IR1 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 2,0,5,4
YLR107.IR1 : 1 clonal architecture model(s) found
YLR107.IR : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 4,0,3
YLR107.IR : 1 clonal architecture model(s) found
YLR107.P1 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 4,5,3
YLR107.P1 : 1 clonal architecture model(s) found
YLR107.IR2 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 2,0,4,5
YLR107.IR2 : 1 clonal architecture model(s) found
Finding consensus models across samples...
Found 1 consensus model(s)
Generating consensus clonal evolution trees across samples...
Found 1 consensus model(s)
Pruning consensus clonal evolution trees....
Seeding aware pruning is: off
Number of unique pruned consensus trees: 1
Scoring models...

When I then tried to plot the trees, I got the errors that other users have been reporting, and I could not fix them even after following your suggestions.

plot.clonal.models(y,
box.plot = TRUE,
fancy.boxplot = TRUE,
fancy.variant.boxplot.highlight = 'is.driver',
fancy.variant.boxplot.highlight.shape = 21,
fancy.variant.boxplot.highlight.fill.color = 'red',
fancy.variant.boxplot.highlight.color = 'black',
fancy.variant.boxplot.highlight.note.col.name = 'gene',
fancy.variant.boxplot.highlight.note.color = 'blue',
fancy.variant.boxplot.highlight.note.size = 2,
fancy.variant.boxplot.jitter.alpha = 1,
fancy.variant.boxplot.jitter.center.color = 'grey50',
fancy.variant.boxplot.base_size = 12,
fancy.variant.boxplot.plot.margin = 1,
fancy.variant.boxplot.vaf.suffix = '.VAF',
clone.shape = 'bell',
bell.event = TRUE,
bell.event.label.color = 'blue',
bell.event.label.angle = 60,
clone.time.step.scale = 1,
bell.curve.step = 2,
merged.tree.plot = TRUE,
tree.node.label.split.character = NULL,
tree.node.shape = 'circle',
tree.node.size = 30,
tree.node.text.size = 0.5,
merged.tree.node.size.scale = 1.25,
merged.tree.node.text.size.scale = 2.5,
merged.tree.cell.frac.ci = FALSE,
merged.tree.clone.as.branch = TRUE,
mtcab.event.sep.char = ',',
mtcab.branch.text.size = 1,
mtcab.branch.width = 0.75,
mtcab.node.size = 3,
mtcab.node.label.size = 1,
mtcab.node.text.size = 1.5,
cell.plot = TRUE,
num.cells = 100,
cell.border.size = 0.25,
cell.border.color = 'black',
clone.grouping = 'horizontal',
scale.monoclonal.cell.frac = TRUE,
show.score = FALSE,
cell.frac.ci = TRUE,
disable.cell.frac = FALSE,
out.dir = 'output',
out.format = 'pdf',
overwrite.output = TRUE,
width = 8,
height = 4,
panel.widths = c(3,4,2,4,2))
Error in x0[i] <- x1[which(x$branches == parent)] :
replacement has length zero
pdf('trees.pdf', width = 3, height = 5, useDingbats = FALSE)
plot.all.trees.clone.as.branch(y, branch.width = 0.5,
node.size = 1, node.label.size = 0.5)
Error in x0[i] <- x1[which(x$branches == parent)] :
replacement has length zero
dev.off()

I would appreciate it if you could look into my data and suggest how to fix this.
Please let me know if you need additional information.

Best,
Jungmin Choi

Pyclone, cellular_prevalence, variant_allele_frequency

Dear Ha X. Dang,

Thanks for the wonderful tool. As you said, clonevol can use the results of PyClone as its input.

The PyClone results include "cellular_prevalence" and "variant_allele_frequency". Are these two values the CCF and VAF that clonevol expects as input?

Looking forward to your reply, thanks very much!

Best,

Zorro
2018.06.02

error input from ClonEvol for fishplot

Hello,
I get the following error when running generateFishplotInputs(). Can you please help me fix this? I have attached the input data.txt file for your reference.

f = generateFishplotInputs(results=y)
Error in `$<-.data.frame`(`*tmp*`, "parent", value = numeric(0)) :
replacement has 0 rows, data has 1

this is my code:

library(clonevol)
library(fishplot)
# attached input file: ZHQ.ClonEvol.input.txt
x <- read.table("ZHQ.ClonEvol.input.txt",header = T,sep = "\t")
#x <- aml1$variants

#shorten vaf column names as they will be
vaf.col.names <- grep('.vaf', colnames(x), value=T)
sample.names <- gsub('.vaf', '', vaf.col.names)

x[, sample.names] <- x[, vaf.col.names]
vaf.col.names <- sample.names

#prepare sample grouping
#sample.groups <- c('P', 'R');
sample.groups <- vaf.col.names
names(sample.groups) <- vaf.col.names

#setup the order of clusters to display in various plots (later)
x <- x[order(x$cluster),]

clone.colors <- c('#999793', '#8d4891', '#f8e356', '#fe9536', '#d7352e')
y = infer.clonal.models(variants = x,
cluster.col.name = 'cluster',
vaf.col.names = vaf.col.names,
sample.groups = sample.groups,
cancer.initiation.model='monoclonal',
subclonal.test = 'bootstrap',
subclonal.test.model = 'non-parametric',
num.boots = 1000,
founding.cluster = 1,
cluster.center = 'mean',
ignore.clusters = NULL,
clone.colors = clone.colors,
min.cluster.vaf = 0.01,
#min probability that CCF(clone) is non-negative
sum.p = 0.05,
#alpha level in confidence interval estimate for CCF(clone)
alpha = 0.05)

y <- transfer.events.to.consensus.trees(y,
x[x$is.driver,],
cluster.col.name = 'cluster',
event.col.name = 'gene')

y <- convert.consensus.tree.clone.to.branch(y, branch.scale = 'sqrt')

f = generateFishplotInputs(results=y)
fishes = createFishPlotObjects(f)
#plot with fishplot
pdf('fish.pdf', width=8, height=5)
for (i in 1:length(fishes)){
fish = layoutClones(fishes[[i]])
fish = setCol(fish,f$clonevol.clone.colors)
fishPlot(fish,shape="spline", title.btm="Patient", cex.title=0.5,
vlines=seq(1, length(vaf.col.names)), vlab=vaf.col.names, pad.left=0.5)
}
dev <- dev.off()

CCF values in AML1 dataset

Hi,
I ran ClonEvol following the tutorial and I have a question about the cancer cell fraction (CCF) values in the aml1 dataset. I understand that the CCF is the fraction of tumor cells carrying the mutation, and that for diploid heterozygous variants the CCF can be calculated as twice the VAF.
I noticed that the values in the P.ccf column range from 0 to 129.58.
Is a CCF value greater than 100 meaningful? I am not familiar with this type of calculation (yet), and it would be very helpful if you could explain it to me.
Thank you !
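
The relationship mentioned in the question can be written out as a small sketch. It only illustrates the stated assumption (diploid, heterozygous variants, VAF in percent). The helper name `vaf.to.ccf` and the VAF values are made up and are not taken from the aml1 data or from clonevol; the last value is chosen so that its doubled value equals the 129.58 quoted above.

```r
# Hypothetical helper: CCF estimate for diploid heterozygous variants,
# assuming the VAF is given in percent (0-100).
vaf.to.ccf <- function(vaf.percent) {
  2 * vaf.percent
}

# Made-up VAFs in percent; the last one is above 50%.
vaf <- c(10, 25, 50, 64.79)
ccf <- vaf.to.ccf(vaf)
print(ccf)

# Under this formula, any measured VAF above 50% gives a CCF estimate above 100%.
print(ccf > 100)
```

So under the simple doubling rule, an estimate above 100 arises whenever a measured VAF exceeds 50%, which can happen, for example, through sampling noise or when the diploid heterozygous assumption does not hold for that variant.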

Issue infer.clonal.models function

Dear Ha X. Dang,

I am analysing four different spatial samples from a cancer. I was able to run clonevol up to infer.clonal.models(), but it could not find any consensus models across samples.
Could you help me find out what I did wrong, or what I could do to obtain proper models?

---- Below you will find the code used ----

I ran Sciclone and I obtained a file with the following header --> chr st sci_EPC3205_01.ref sci_EPC3205_01.var sci_EPC3205_01.vaf sci_EPC3205_01.cn sci_EPC3205_01.cleancn sci_EPC3205_01.depth sci_EPC3205_02.ref sci_EPC3205_02.var sci_EPC3205_02.vaf sci_EPC3205_02.cn sci_EPC3205_02.cleancn sci_EPC3205_02.depth sci_EPC3205_03.ref sci_EPC3205_03.var sci_EPC3205_03.vaf sci_EPC3205_03.cn sci_EPC3205_03.cleancn sci_EPC3205_03.depth sci_EPC3205_04.ref sci_EPC3205_04.var sci_EPC3205_04.vaf sci_EPC3205_04.cn sci_EPC3205_04.cleancn sci_EPC3205_04.depth adequateDepth cluster cluster.prob.1 cluster.prob.2 cluster.prob.3 cluster.prob.4 cluster.prob.5 cluster.prob.6 cluster.prob.7 cluster.prob.8 cluster.prob.9

library(clonevol)

library(devtools)

sample1 <- "EPC3205_01"

sample2 <-"EPC3205_02"

sample3 <- "EPC3205_03"

sample4 <- "EPC3205_04"

iinformation <- "EPC3205_minlen20"  # output file from sciclone (clusters)

patient <- "EPC3205"

numcolorsqtt <- 9

v1 <- read.delim2(file=iinformation, header=TRUE, sep="\t", dec=".", stringsAsFactors = FALSE)
# Keep the file as a data frame for easier handling; delete rows containing NAs
ff<- v1
delete.na <- function(DF, n=0) {
DF[rowSums(is.na(DF)) <= n,]
}
ffy <- delete.na(ff)
x<-ffy

vaf.col.names <- grep('.vaf', colnames(x), value=T)

sample.names <- gsub('.vaf', '', vaf.col.names)

x[, sample.names] <- x[, vaf.col.names]

vaf.col.names <- sample.names

sample.groups <- c(sample1, sample2, sample3 , sample4)

names(sample.groups) <- vaf.col.names

x <- x[order(x$cluster),]

library("colorspace")

clone.colors <- sequential_hcl(numcolorsqtt,palette = "Blue-Yellow")

pdf(paste0(patient,'_cluster_minlength_20.pdf'), useDingbats = FALSE, title='cluster_minlength_20')

pp <- plot.variant.clusters(x,
cluster.col.name = 'cluster',
show.cluster.size = FALSE,
cluster.size.text.color = 'blue',
vaf.col.names = vaf.col.names,
vaf.limits = 70,
sample.title.size = 8,
violin = FALSE,
box = T,
jitter = TRUE,
jitter.shape = 1,
jitter.color = clone.colors,
jitter.size = 1,
jitter.alpha = 1,
jitter.center.method = 'median',
jitter.center.size = 1,
jitter.center.color = 'darkgray',
jitter.center.display.value = 'none',
highlight.shape = 21,
highlight.color = 'blue',
highlight.fill.color = 'green',
highlight.note.col.name = 'gene',
highlight.note.size = 2,
order.by.total.vaf = FALSE)

dev.off()

-------------------- console output --------------------

Warning messages:

1: fun.y is deprecated. Use fun instead.

2: fun.y is deprecated. Use fun instead.

3: fun.y is deprecated. Use fun instead.

4: fun.y is deprecated. Use fun instead.

5: Removed 15 rows containing non-finite values (stat_summary).

6: Removed 15 rows containing non-finite values (stat_boxplot).

7: Removed 15 rows containing missing values (geom_point).

8: Removed 1 rows containing non-finite values (stat_summary).

9: Removed 1 rows containing non-finite values (stat_boxplot).

10: Removed 1 rows containing missing values (geom_point).

plot.pairwise(x, col.names = vaf.col.names, out.prefix = paste0(patient,'_variants_minlength20.plot'), colors = clone.colors)

-------------------- console output --------------------

Warning messages:

1: Removed 15 rows containing missing values (geom_point).

2: Removed 15 rows containing missing values (geom_point).

3: Removed 16 rows containing missing values (geom_point).

4: Removed 1 rows containing missing values (geom_point).

5: Removed 1 rows containing missing values (geom_point).

pdf(paste0(patient,'_flow_minlength_20.pdf'),width=10, height=5, useDingbats=FALSE, title='flow_minlength_20.pdf')

plot.cluster.flow(x, vaf.col.names = vaf.col.names,
sample.names = c(sample1, sample2, sample3 , sample4), colors = clone.colors)

dev.off()

y = infer.clonal.models(variants = x, cluster.col.name = 'cluster', vaf.col.names = vaf.col.names, sample.groups = sample.groups, vaf.in.percent = TRUE, cancer.initiation.model='monoclonal', subclonal.test = 'bootstrap', subclonal.test.model = 'non-parametric', num.boots = 1000, founding.cluster = 1,cluster.center = 'mean', ignore.clusters = NULL,merge.similar.samples = TRUE, clone.colors = clone.colors, min.cluster.vaf = 0.01,seeding.aware.tree.pruning = FALSE, sum.p = 0.05,alpha = 0.05)

-------------------- console output --------------------

Sample 1: sci_EPC3205_01 <-- sci_EPC3205_01

Sample 2: sci_EPC3205_02 <-- sci_EPC3205_02

Sample 3: sci_EPC3205_03 <-- sci_EPC3205_03

Sample 4: sci_EPC3205_04 <-- sci_EPC3205_04

Using monoclonal model

Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
sci_EPC3205_01 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 5,9
sci_EPC3205_01 : 48 clonal architecture model(s) found

sci_EPC3205_02 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 2,7
sci_EPC3205_02 : 48 clonal architecture model(s) found

sci_EPC3205_03 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 3,6
sci_EPC3205_03 : 44 clonal architecture model(s) found

sci_EPC3205_04 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 4,7,9
sci_EPC3205_04 : 84 clonal architecture model(s) found

Finding consensus models across samples...
Found 0 consensus model(s)
Found 0 consensus model(s)
Scoring models...
Pruning consensus clonal evolution trees....
Seeding aware pruning is: off
Number of unique pruned consensus trees: 0

Also, I noticed that the plots have gene names. Do I have to supply this information during the sciclone step, or is there a way to add it now?
Moreover, are the gene names a list of detected driver genes, or a list of all variants located within genes?

Thank you very much for your time and help,
Michelle.

ERROR: No clonal models for sample

Hello,
I get the following error when running infer.clonal.models(). Can you please help me fix this?

ERROR: No clonal models for sample: .
Check data or remove this sample, then re-run.

The input includes information for 4 samples.

Data visualization when inferring a polyclonal model

Dear ClonEvol team & community
I have some issues when visualizing ClonEvol results. Could you please give me some suggestions for solving the problems below? I have already attached my results in the forum.

How can I remove the gene name of cluster 0 when inferring a polyclonal model? (The box plot on the left)
What does the length of each cluster's branch in the branch-based tree mean? Also, when inferring a polyclonal model, is the length of cluster 0 (the blue line) meaningful? For example, does the length reflect cancer initiation time or frequency, or is it just a placeholder line needed to draw the model? (The branch-based tree on the right)

Thank you in advance for your help

infer.clonal.models - bug with alternate cluster column names

In the function infer.clonal.models(), there is the option to specify cluster.col.name (default = 'cluster'), which I've been naming 'cluster_id'. But in the code where infer.clonal.models() calls on generate.boot, cluster.col.name isn't specified and then defaults to 'cluster':

        else if (subclonal.test == "bootstrap") {
            if (is.null(boot)) {
                boot = generate.boot(variants, vaf.col.names = vaf.col.names, 
                  depth.col.names = depth.col.names, vaf.in.percent = vaf.in.percent, 
                  num.boots = num.boots, bootstrap.model = subclonal.test.model, 
                  cluster.center.method = cluster.center, weighted = weighted, 
                  random.seed = random.seed)

Leading to this error:

Error in generate.boot(variants, vaf.col.names = vaf.col.names, depth.col.names = depth.col.names,  : 
  Input error: cluster column name does not appear in variants file.

I think the cluster.col.name=cluster.col.name option is missing from generate.boot()? Changing my column name to the default 'cluster' fixes this error.
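
The pattern described above can be reproduced with a toy sketch. The functions `inner.boot`, `outer.broken` and `outer.fixed` are hypothetical stand-ins written for illustration; they are not clonevol's code, and only the error message text is copied from the report.

```r
# Toy stand-in for the inner function: it has its own default column name.
inner.boot <- function(variants, cluster.col.name = 'cluster') {
  if (!(cluster.col.name %in% colnames(variants))) {
    stop('Input error: cluster column name does not appear in variants file.')
  }
  table(variants[[cluster.col.name]])
}

# Bug pattern: the outer function accepts cluster.col.name but does not pass it on,
# so the inner function silently falls back to its default 'cluster'.
outer.broken <- function(variants, cluster.col.name = 'cluster') {
  inner.boot(variants)
}

# Fix pattern: forward the argument explicitly.
outer.fixed <- function(variants, cluster.col.name = 'cluster') {
  inner.boot(variants, cluster.col.name = cluster.col.name)
}

# Made-up variants table with a non-default cluster column name.
v <- data.frame(cluster_id = c(1, 1, 2), vaf = c(40, 42, 20))

broken <- tryCatch(outer.broken(v, cluster.col.name = 'cluster_id'),
                   error = function(e) conditionMessage(e))
fixed <- outer.fixed(v, cluster.col.name = 'cluster_id')
print(broken)
print(fixed)
```

Until the argument is forwarded in the package itself, renaming the column to the default 'cluster', as described above, sidesteps the problem.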

Thanks for a very useful package!

Error in Generating consensus clonal evolution trees across samples

Hello,
When I run infer.clonal.models, an error occurred.

y = infer.clonal.models(variants = data,
cluster.col.name = 'cluster',
vaf.col.names = vaf.col.names,
ccf.col.names = ccf.col.names,
sample.groups = sample.groups,
cancer.initiation.model='monoclonal',
subclonal.test = 'bootstrap',
subclonal.test.model = 'non-parametric',
num.boots = 1000,
random.seed = 1234567,
founding.cluster = 1,
cluster.center = 'median',
ignore.clusters = c(2,3,4),
clone.colors = clone.colors,
min.cluster.vaf = NULL,
sum.p = 0.05,
alpha = 0.05)

Finding consensus models across samples...
Found 1 consensus model(s)
Generating consensus clonal evolution trees across samples...
Error in aggregate.data.frame(mf[1L], mf[-1L], FUN = FUN, ...) :
no rows to aggregate

Here is the input data file for infer.clonal.models, where vaf.col.names=c("M1.vaf", "M2.vaf", "P.vaf"), ccf.col.names=c("M1.ccf", "M2.ccf", "P.ccf"), sample.groups=c("M","M","P"), and names(sample.groups)=c("M1","M2","P").
P is the primary tumor; M1 and M2 are both metastatic tumors. P and M are from the same patient.
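
For reference, the setup described above could be written as follows. This is a sketch that uses only the names given in the post. The final comparison is an assumption about what might be worth checking, not a documented clonevol requirement.

```r
# Column names as described in the post.
vaf.col.names <- c('M1.vaf', 'M2.vaf', 'P.vaf')
ccf.col.names <- c('M1.ccf', 'M2.ccf', 'P.ccf')

# One group label per sample: both metastases share the group 'M'.
sample.groups <- c('M', 'M', 'P')
names(sample.groups) <- c('M1', 'M2', 'P')
print(sample.groups)

# Compare the group names with the VAF column names, with and without the suffix.
stripped <- sub('\\.vaf$', '', vaf.col.names)
print(identical(names(sample.groups), vaf.col.names))
print(identical(names(sample.groups), stripped))
```

The other reports on this page set names(sample.groups) to the same strings as vaf.col.names, so the difference between "M1" and "M1.vaf" here may be worth a look.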

example_data.txt

Thank you very much!

Question about convert.consensus.tree.clone.to.branch error

Hello

I am trying to use clonevol on my data. infer.clonal.models ran successfully, but I then got an error from convert.consensus.tree.clone.to.branch. The error message is:

Error in `$<-.data.frame`(`*tmp*`, "samples.with.nonzero.cell.frac", value = character(0)) :
replacement has 0 rows, data has 4

The following is my command (sorry for the bad layout):

library(clonevol)
loci11<- read.table("02844190/02844190_c1_2.tsv",header = TRUE)
loci13 <- loci11[!is.na(loci11$cluster), ]  # keep only variants with a cluster assignment
colnames(loci13)[3]<-"variant_allele_frequency"
vaf.col.names<- "P"
loci13["P"]<-loci13$variant_allele_frequency

y = infer.clonal.models(variants = loci13,
cluster.col.name = 'cluster',
vaf.col.names = vaf.col.names,
# sample.groups = sample.groups,
subclonal.test = 'bootstrap',
subclonal.test.model = 'non-parametric',
num.boots = 1000,
founding.cluster = '1',
cluster.center = 'mean',
ignore.clusters = NULL,
clone.colors = clone.colors,
min.cluster.vaf = 0.01,
sum.p = 0.05,
alpha = 0.05)

y <- convert.consensus.tree.clone.to.branch(y, branch.scale = "sqrt")

plot.clonal.models(y,
# box plot parameters
# box.plot = TRUE,
# fancy.boxplot = FALSE,
# fancy.variant.boxplot.highlight = 'is.driver',
# fancy.variant.boxplot.highlight.shape = 21,
# fancy.variant.boxplot.highlight.fill.color = 'red',
# fancy.variant.boxplot.highlight.color = 'black',
# fancy.variant.boxplot.highlight.note.col.name = 'gene',
# fancy.variant.boxplot.highlight.note.color = 'blue',
# fancy.variant.boxplot.highlight.note.size = 2,
# fancy.variant.boxplot.jitter.alpha = 1,
# fancy.variant.boxplot.jitter.center.color = 'grey50',
# fancy.variant.boxplot.base_size = 12,
# fancy.variant.boxplot.plot.margin = 1,
# fancy.variant.boxplot.vaf.suffix = '.VAF',
# bell plot parameters
clone.shape = 'bell',
bell.event = TRUE,
bell.event.label.color = 'blue',
bell.event.label.angle = 60,
clone.time.step.scale = 1,
bell.curve.step = 2,
# node-based consensus tree parameters
merged.tree.plot = TRUE,
tree.node.label.split.character = NULL,
tree.node.shape = 'circle',
tree.node.size = 30,
tree.node.text.size = 0.5,
merged.tree.node.size.scale = 1.25,
merged.tree.node.text.size.scale = 2.5,
merged.tree.cell.frac.ci = FALSE,
# branch-based consensus tree parameters
merged.tree.clone.as.branch = TRUE,
mtcab.event.sep.char = ',',
mtcab.branch.text.size = 1,
mtcab.branch.width = 0.75,
mtcab.node.size = 3,
mtcab.node.label.size = 1,
mtcab.node.text.size = 1.5,
# cellular population parameters
cell.plot = TRUE,
num.cells = 100,
cell.border.size = 0.25,
cell.border.color = 'black',
clone.grouping = 'horizontal',
#meta-parameters
scale.monoclonal.cell.frac = TRUE,
show.score = FALSE,
cell.frac.ci = TRUE,
disable.cell.frac = FALSE,
# output figure parameters
out.dir = '02844190_1',
out.format = 'pdf',
overwrite.output = TRUE,
width = 8,
height = 4,
# vector of width scales for each panel from left to right
panel.widths = c(4,1,3,1))

My data:
02844190.zip

Can you help me to find out why it failed? Many thanks.

NA error

Hi!

I've recently used SciClone on a primary/relapse sample pair. In the output table, there are some NA values which, as far as I know, correspond to mutations that are not shared between samples.

Here is an example of the data frame:

    chr       st primary.ref primary.var primary.vaf primary.cn
100   1 56790773           0           0        0.00         NA
101   1 57427557          39          18       31.58          2
102   1 58035059          22           9       29.03          2
    primary.cleancn primary.depth relapse.ref relapse.var relapse.vaf
100              NA             0          47          19       28.79
101               2            57          42          18       30.00
102               2            31          29           8       21.62
    relapse.cn relapse.cleancn relapse.depth adequateDepth cluster
100          2               2            66             0      NA
101          2               2            60             1       2
102          2               2            37             1       2
    cluster.prob.1 cluster.prob.2
100             NA             NA
101   0.0156438858      0.9843561
102   0.0002460844      0.9997539

If I try to use infer.clonal.models with these results as:

>df
    cluster primary.vaf primary.depth relapse.vaf relapse.depth
100      NA        0.00             0       28.79            66
101       2       31.58            57       30.00            60
102       2       29.03            31       21.62            37

x <- infer.clonal.models(variants=df,
                         cluster.col.name="cluster",
                         vaf.col.names=vaf.col.names,
                         subclonal.test="none",
                         subclonal.test.model="none",
                         cluster.center="mean",
                         model = 'monoclonal',
                         vaf.in.percent = TRUE,
                         founding.cluster=1,
                         min.cluster.vaf=0.01,
                         p.value.cutoff=0.05)

I got the following error:

Sample 1: primary.vaf <-- primary.vaf
Sample 2: relapse.vaf <-- relapse.vaf
Using monoclonal model
primary.vaf : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: NA,NA 
Error in if (v[i, ]$excluded) { : missing value where TRUE/FALSE needed

Therefore I removed NA:

> df <- na.omit(df)

I ran it again and got another error:

Sample 1: primary.vaf <-- primary.vaf
Sample 2: relapse.vaf <-- relapse.vaf
Using monoclonal model
primary.vaf : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters:  
primary.vaf : 1 clonal architecture model(s) found

relapse.vaf : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters:  
relapse.vaf : 1 clonal architecture model(s) found

Finding matched clonal architecture models across samples...
Found  1 compatible model(s)
Merging clonal evolution trees across samples...
Error in ci$sample.with.cell.frac.ci[cia$is.zero.cell.frac] = paste0("°",  : 
  NAs are not allowed in subscripted assignments

I would be grateful if you could help me solve this error.
Apart from that, after running SciClone, is it recommended to do the subclonal test with bootstrapping? What is the point of running clonevol with subclonal.test="bootstrap" and subclonal.test.model="non-parametric"?

If the idea is to run fishplot after clonevol, should I use the rescale.vaf function? How?

Thank you in advance!
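
As a side note on the NA-removal step in this report, here is a small sketch of dropping only the variants without a cluster assignment and then listing which cluster ids remain. The toy data frame copies the three rows printed in the post; everything else is illustrative and is not a diagnosis of the error.

```r
# Toy data frame mirroring the three rows printed in the post.
df <- data.frame(cluster = c(NA, 2, 2),
                 primary.vaf = c(0.00, 31.58, 29.03),
                 primary.depth = c(0, 57, 31),
                 relapse.vaf = c(28.79, 30.00, 21.62),
                 relapse.depth = c(66, 60, 37),
                 row.names = c('100', '101', '102'))

# Drop only the variants that have no cluster assignment.
df.clustered <- df[!is.na(df$cluster), ]
print(df.clustered)

# List the cluster ids that remain, e.g. to compare against founding.cluster.
remaining <- sort(unique(df.clustered$cluster))
print(remaining)
print(1 %in% remaining)
```

Only three rows are shown in the post, so the last line says nothing about the full data; on the real table the same check would show whether the cluster passed as founding.cluster is still present after filtering.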

Interpretations of results from infer.clonal.model

Hello,

I am a new user of the clonevol package, which provides a set of very useful analyses and visualizations for clonal evolution. I tried to find an explanation of the results from infer.clonal.model, which returns a long list. For example:

names(clone_mod_t6_7_8_9)
[1] "models" "matched" "num.matched.models" "num.pruned.trees" "variants"
[6] "params"

I guess that "models" is about the individual samples and "matched" is about the merged models across samples. Where could I find an explanation of the elements of "matched"?

names(clone_mod_t6_7_8_9$matched)
[1] "index" "merged.trees" "merged.traces"
[4] "scores" "probs" "clone.ccf.pvalues"
[7] "trimmed.merged.trees" "trimmed.merged.trees.map"

For example, "sample.with.cell.frac.ci" or "sample.with.nonzero.cell.frac.ci" is labeled in the bell plot. What exactly do they mean?

Any help is highly appreciated.

Best,

Hui
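
Independently of what each element means, the shape of such a nested result can be explored with base R. The sketch below uses a toy list whose names mimic some of those printed above; its contents are empty placeholders, not real clonevol output.

```r
# Toy nested list; names mimic those printed in the post, contents are placeholders.
res <- list(models = list(),
            matched = list(index = data.frame(),
                           merged.trees = list(),
                           scores = data.frame()),
            num.matched.models = 1,
            variants = data.frame())

# Names at the top level and one level down.
print(names(res))
print(names(res$matched))

# str() with max.level limits how deep the summary goes.
str(res, max.level = 2)

# First class of each top-level element.
classes <- sapply(res, function(el) class(el)[1])
print(classes)
```

Running the same calls on the real result object would show which elements are data frames (and can be inspected with head()) and which are further lists.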

`show_guide` has been deprecated. Please use `show.legend` instead.

Hi,
I am getting a deprecation error (show_guide has been deprecated. Please use show.legend instead.) when I run the plot.variant.clusters function.

infer.clonal.models() no rows to aggregate

Dear Dang,

When I run infer.clonal.models() to infer the parent order, I run into a problem.
I read the old issues about this error, but none of the suggested methods worked.
===============Error:==================
Sample 1: Pre1 <-- Pre1
Sample 2: Pre2 <-- Pre2
Sample 3: Post <-- Post
Using monoclonal model
Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
Pre1 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 6
Pre1 : 10 clonal architecture model(s) found

Pre2 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 5
Pre2 : 22 clonal architecture model(s) found

Post : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 3,5
Post : 7 clonal architecture model(s) found

Finding consensus models across samples...
Found 4 consensus model(s)
Generating consensus clonal evolution trees across samples...
Error in aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...): no rows to aggregate
Traceback:

infer.clonal.models(variants = x, cluster.col.name = "cluster",
. vaf.col.names = vaf.col.names, sample.groups = sample.groups,
. sample.names = NULL, cancer.initiation.model = "monoclonal",
. subclonal.test = "bootstrap", subclonal.test.model = "non-parametric",
. num.boots = 1000, founding.cluster = 1, cluster.center = "median",
. ignore.clusters = NULL, clone.colors = clone.colors, min.cluster.vaf = 0.01,
. sum.p = 0.05, alpha = 0.05)
find.matched.models(vv, sample.names, sample.groups, merge.similar.samples = merge.similar.samples)
merge.clone.trees(m, samples = samples, sample.groups, merge.similar.samples = merge.similar.samples)
aggregate(sample.group ~ ., cgrp, paste, collapse = ",")
aggregate.formula(sample.group ~ ., cgrp, paste, collapse = ",")
aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...)
stop("no rows to aggregate")

==============my Code==============
library(clonevol)
x=read.table("cluster2_rmNA.xls",head=T,sep="\t")
head(x)
vaf.col.names <- grep('.vaf', colnames(x), value=T)
vaf.col.names
sample.names <- gsub('.vaf', '', vaf.col.names)
sample.names
x[, sample.names] <- x[, vaf.col.names]
head(x)
x <- x[order(x$cluster),]
head(x)
sample.groups <- c('Pre1','Pre2','Post')
vaf.col.names = c('Pre1','Pre2','Post')
sample.names <- c('Pre1','Pre2','Post')
y = infer.clonal.models(variants = x,
cluster.col.name = 'cluster',
vaf.col.names = vaf.col.names,
sample.groups = sample.groups,
sample.names = NULL,
cancer.initiation.model='monoclonal',
subclonal.test = 'bootstrap',
subclonal.test.model = 'non-parametric',
num.boots = 1000,
founding.cluster = 1,
cluster.center = 'median',
ignore.clusters = NULL,
clone.colors = clone.colors,
min.cluster.vaf = 0.01,
sum.p = 0.05,
alpha = 0.05)
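
One detail worth checking in the script above: `sample.groups` is a plain, unnamed vector, whereas other scripts in these issues name it with `names(sample.groups) <- vaf.col.names`. The base-R sketch below shows why that could matter. It is an assumption, not something confirmed from the ClonEvol source, that groups are looked up by sample name; `cgrp` is a hypothetical stand-in for the internal table named in the traceback.

```r
samples <- c("Pre1", "Pre2", "Post")

# Unnamed vector, as in the script above; named vector, as in other scripts.
unnamed.groups <- c("Pre1", "Pre2", "Post")
named.groups   <- setNames(unnamed.groups, samples)

# Looking up by sample name only works when the vector has names.
unnamed.lookup <- unname(unnamed.groups[samples])  # every element is NA
named.lookup   <- unname(named.groups[samples])

# Hypothetical stand-in for the clone/sample-group table being aggregated.
clones <- c("1", "1", "2")

# With all-NA groups, the formula method drops every row (na.action = na.omit)
# and aggregate() then stops.
cgrp.bad <- data.frame(clone = clones, sample.group = unnamed.lookup,
                       stringsAsFactors = FALSE)
msg <- tryCatch({
  aggregate(sample.group ~ ., cgrp.bad, paste, collapse = ",")
  "no error"
}, error = function(e) conditionMessage(e))

# With proper groups, the same call collapses groups per clone.
cgrp.ok <- data.frame(clone = clones, sample.group = named.lookup,
                      stringsAsFactors = FALSE)
ok <- aggregate(sample.group ~ ., cgrp.ok, paste, collapse = ",")

print(msg)
print(ok)
```

If this is indeed the cause, adding `names(sample.groups) <- vaf.col.names` before the call would be the first thing to try. Note also that `clone.colors` is never defined in the script above.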

===============My data is ===================
cluster gene Pre1.vaf Pre2.vaf Post.vaf
2 gene2 34.58 22.54 12.83
2 gene3 41.73 26.79 21.9
2 gene4 37.86 23.11 12.95
2 gene5 34.81 18.35 19.65
3 gene6 37.85 8.27 0
3 gene7 34.65 14.29 0
2 gene8 44.05 18.94 24.19
6 gene9 0 9.92 7.66
6 gene10 0 10.24 2.31
3 gene11 35.05 6.97 0
3 gene12 34.39 7.72 0
2 gene13 36.14 25.83 18.31
4 gene14 22.07 20.34 21.56
6 gene15 0 14.92 24.37
3 gene16 26.83 5.36 0
6 gene17 0 17.16 16.67
2 gene18 33.33 28.08 18.75
5 gene19 8.33 0 0
6 gene20 0 4.49 3.83
2 gene21 30.65 25.89 19.33
6 gene22 0 7.51 0.6
2 gene23 34.1 29.75 28.25
2 gene24 35.44 28.1 30.36
2 gene25 35.89 27.5 36.92
6 gene26 0 8.25 0
1 gene27 46.03 33.33 24.39
2 gene28 40.8 30.04 28.43
6 gene29 0 8.11 12.24
3 gene30 41 6.97 0
3 gene31 33.33 4.68 0
4 gene32 15.29 21.09 11.89
1 gene33 70.76 45.48 21.32
1 gene34 85.47 71.12 61.57
6 gene35 0 9.49 1.89
2 gene36 35.43 27.6 16.94
6 gene37 0 13.77 0
6 gene38 0 9 0
3 gene39 34.5 8.44 0
2 gene40 34.17 17.65 20.97
3 gene41 33.82 10.22 0
3 gene42 34.69 8.5 1.16
2 gene43 39.9 29.17 24.16
2 gene44 33.98 23.36 22.12
6 gene45 0 11.54 0
4 gene46 15.76 18.67 15.79
2 gene47 23.78 26.29 22
1 gene48 46.12 35.94 28.08
3 gene49 19.26 6.4 0
4 gene50 16.32 10.42 12.12
3 gene51 34.12 5.95 0
6 gene52 0 10.93 2.5
3 gene53 36.43 6.37 0
2 gene54 32.97 22.14 20.41
6 gene55 0.31 7.48 14.06
2 gene56 34.72 32.35 17.65
2 gene57 36.22 19.23 16.87
3 gene58 41.3 10.34 0
5 gene59 14.89 0 0
2 gene60 35.61 26.32 24.19
1 gene61 41.29 32.45 45.61
6 gene62 0 0 14.29
2 gene63 42.17 29.88 14.02
3 gene64 33.63 8.21 0
2 gene65 36.54 25.36 16.99
2 gene66 35.2 20.47 15.2
3 gene67 38.18 7.99 1.6
3 gene68 34.43 9.51 0
2 gene69 34.29 24.83 20.21
3 gene70 32.56 5.43 0
3 gene71 37.25 8.12 0.49
2 gene72 43.55 25 31.54
2 gene73 29.86 23.54 26.18
2 gene74 32.31 24.28 9.05
2 gene75 36.61 27.3 27.32
2 gene76 31.39 27.27 29.72
1 gene77 51.15 37.96 39.85
6 gene78 0 16.87 0
6 gene79 0 7.51 0
3 gene80 30.71 11.7 0
2 gene81 36.31 18.33 19.51
4 gene82 9.04 10.98 12.76
2 gene83 33.54 32.58 27.11
2 gene84 36.23 21.19 20.31
2 gene85 30.81 18.72 22.73
2 gene86 32.11 23.4 36.8
2 gene87 32.89 28.57 29.09
3 gene88 44.35 9.55 0.72
3 gene89 36.88 8.16 0
3 gene90 38.64 7.69 0
1 gene91 45.83 26.32 46.42
2 gene92 33.79 24.89 13.97
2 gene93 36.19 21.76 32.14
2 gene94 33.48 26.43 24.5
2 gene95 34.8 26.18 20.66
2 gene96 31.69 25 21.38
2 gene97 36.59 29.48 23.19
2 gene98 38.48 33.42 22.55
3 gene99 32.67 11.76 0
2 gene100 29.1 27.65 10.95
2 gene101 40.4 24.88 29.94
2 gene102 35.43 24.12 27.78
2 gene103 40.34 31.47 33.82
3 gene104 43.43 3.03 0.27
6 gene105 0 7.45 3.17
2 gene106 36.24 30.67 15.53
2 gene107 43.43 25.7 14.38
2 gene108 38.15 26.7 15.89
2 gene109 37.05 30.28 29.21
2 gene110 40.19 25.24 17.2
2 gene111 35.35 23.1 19.02
6 gene112 0 7.23 23.08
2 gene113 32.42 22.02 18.64
2 gene114 28.33 24.4 20.98
2 gene115 34.13 18.9 18.15
3 gene116 39.09 8.56 0
2 gene117 39.01 25.07 19.05
3 gene118 40.4 7.55 0.4
2 gene119 34.57 23.68 38.15
6 gene120 0.49 12.32 4.2
2 gene121 30.67 24.16 13.27
2 gene122 32.26 24.66 25.27
2 gene123 34.65 23.43 14.09
6 gene124 0 11.6 8.06
2 gene125 31.78 20 19.87
2 gene126 30.08 29.67 25.17
2 gene127 41.04 22.64 16.44
2 gene128 38.18 25.4 15.96
5 gene129 14.46 0 0.36
2 gene130 33.68 26.42 13.47
2 gene131 38.17 33.26 26.13
2 gene132 33.33 21.47 22.73
2 gene133 44.14 27.06 25.71
2 gene134 32.74 19.82 24.82
2 gene135 35.04 25 20.85
2 gene136 34.3 25.29 20.42
2 gene137 39.11 25.14 18.69
2 gene138 32.62 24 12.9
6 gene139 0 6.22 13.62
6 gene140 0 0 11.59
2 gene141 32.83 22.29 12.14
3 gene142 34.19 5.92 0
2 gene143 32.2 28.49 21.1
5 gene144 11.14 0 0
3 gene145 29.3 6.63 0
5 gene146 12.46 0 0
2 gene147 30.91 28.74 19.11
6 gene148 0 8.15 2.33
2 gene149 30.63 29 17.8
2 gene150 32.9 28 25.74
2 gene151 30.66 36.36 28.8
2 gene152 35.71 19.88 23.45
2 gene153 32.89 27.85 20.28
3 gene154 29.32 9.33 0
2 gene155 35.18 23.47 26.24
2 gene156 34.1 27.74 32.72
1 gene157 54.76 33.7 25.95
1 gene158 57.6 46.41 27.19
1 gene159 59.24 29.26 33.13
1 gene160 35.92 40.91 24.37
3 gene161 59.69 12.26 0
2 gene162 37.06 26.97 20.74
2 gene163 30.32 26.44 27.23
3 gene164 38.52 6.48 0
2 gene165 37.62 31.05 26.42
2 gene166 30.37 26.27 15.71
1 gene167 66.06 34.07 24.77
1 gene168 53.57 42.53 23.68
4 gene169 22.12 13.22 20.16
2 gene170 27.64 20.14 23.89
2 gene171 27.52 25.11 23.6
2 gene172 37.46 25.76 12.07
3 gene173 32.23 8.13 0.29
2 gene174 35.44 23.14 15
3 gene175 38.57 8.92 3.95
6 gene176 0 6.28 0
2 gene177 33.97 23.34 16.22
2 gene178 36.44 26.36 19.25
3 gene179 35 9.45 0
2 gene180 37.5 26.05 24.92
2 gene181 32.18 25.3 7.83
4 gene182 13.16 14.22 22.57
2 gene183 33.2 27.81 23.36
6 gene184 0 8.38 19.26
2 gene185 31.82 25.53 21.7
2 gene186 36.63 30.74 35.42
2 gene187 35.03 21.61 12.25
3 gene188 31.74 15.85 0
6 gene189 0 9.17 0
6 gene190 0 11.41 0
2 gene191 38.49 32.36 28
2 gene192 34.01 30.83 32.88
3 gene193 36.67 5.45 0
2 gene194 29.19 18.59 6.09

Thanks,
Qiwei

errors in plotting the tree (again...)

Hello, hope you are having a relaxing holiday.

I am trying to run your software and have encountered the following error (embarrassingly, the same error I reported before), even after following your instructions. Would you be able to look into this?

input file

> x = read.table('loci.tsv_rearranged_v2.txt', header=T, stringsAsFactors=F, sep='\t')
> vaf.col.names = grep('.vaf', colnames(x), value=T)
> colnames(x) = gsub('.vaf', '', colnames(x))
> vaf.col.names = gsub('.vaf', '', vaf.col.names)
> x$cluster[x$cluster == 0] = max(x$cluster) + 1
> clone.colors = NULL
> pdf('box_v2.pdf', width = 10, height = 10, useDingbats = FALSE, title='')
> pp <- plot.variant.clusters(x,
     cluster.col.name = 'cluster',
     show.cluster.size = FALSE,
     cluster.size.text.color = 'blue',
     vaf.col.names = vaf.col.names,
     vaf.limits = 70,
     sample.title.size = 20,
     violin = FALSE,
     box = FALSE,
     jitter = TRUE,
     jitter.shape = 1,
     jitter.color = clone.colors,
     jitter.size = 1,
     jitter.alpha = 1,
     jitter.center.method = 'median',
     jitter.center.size = 1,
     jitter.center.color = 'darkgray',
     jitter.center.display.value = 'none',
     highlight = 'is.driver',
     highlight.shape = 21,
     highlight.color = 'blue',
     highlight.fill.color = 'green',
     highlight.note.col.name = 'gene',
     highlight.note.size = 2,
     order.by.total.vaf = FALSE)
> dev.off()
null device 
          1 
> 
> y = infer.clonal.models(variants = x,
     cluster.col.name = 'cluster',
     vaf.col.names = vaf.col.names,
     sample.groups = NULL,
     cancer.initiation.model='monoclonal',
     subclonal.test = 'bootstrap',
     subclonal.test.model = 'non-parametric',
     num.boots = 1000,
     founding.cluster = 5,
     cluster.center = 'mean',
     ignore.clusters = NULL,
     clone.colors = NULL,
     min.cluster.vaf = 0.01,
     sum.p = 0.05,
     alpha = 0.05)
Sample 1: LT <-- LT
Sample 2: LT.PRE <-- LT.PRE
Sample 3: ST <-- ST
Using monoclonal model
Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
LT : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 3,1 
LT : 2 clonal architecture model(s) found

LT.PRE : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 4,3,6 
LT.PRE : 1 clonal architecture model(s) found

ST : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 4,1 
ST : 2 clonal architecture model(s) found

Finding consensus models across samples...
Found  4 consensus model(s)
Generating consensus clonal evolution trees across samples...
Found 4 consensus model(s)
Scoring models...
Pruning consensus clonal evolution trees....
Seeding aware pruning is:  off
Number of unique pruned consensus trees: 1 
> y = convert.consensus.tree.clone.to.branch(y)
> 
> plot.clonal.models(y,
     box.plot = TRUE,
     fancy.boxplot = TRUE,
     fancy.variant.boxplot.highlight = 'is.driver',
     fancy.variant.boxplot.highlight.shape = 21,
     fancy.variant.boxplot.highlight.fill.color = 'red',
     fancy.variant.boxplot.highlight.color = 'black',
     fancy.variant.boxplot.highlight.note.col.name = 'gene',
     fancy.variant.boxplot.highlight.note.color = 'blue',
     fancy.variant.boxplot.highlight.note.size = 2,
     fancy.variant.boxplot.jitter.alpha = 1,
     fancy.variant.boxplot.jitter.center.color = 'grey50',
     fancy.variant.boxplot.base_size = 12,
     fancy.variant.boxplot.plot.margin = 1,
     fancy.variant.boxplot.vaf.suffix = '.VAF',
     clone.shape = 'bell',
     bell.event = TRUE,
     bell.event.label.color = 'blue',
     bell.event.label.angle = 60,
     clone.time.step.scale = 1,
     bell.curve.step = 2,
     merged.tree.plot = TRUE,
     tree.node.label.split.character = NULL,
     tree.node.shape = 'circle',
     tree.node.size = 30,
     tree.node.text.size = 0.5,
     merged.tree.node.size.scale = 1.25,
     merged.tree.node.text.size.scale = 2.5,
     merged.tree.cell.frac.ci = FALSE,
     merged.tree.clone.as.branch = TRUE,
     mtcab.event.sep.char = ',',
     mtcab.branch.text.size = 1,
     mtcab.branch.width = 5,
     mtcab.node.size = 3,
     mtcab.node.label.size = 1,
     mtcab.node.text.size = 1.5,
     cell.plot = TRUE, num.cells = 100,
     cell.border.size = 0.25,
     cell.border.color = 'black',
     clone.grouping = 'horizontal',
     scale.monoclonal.cell.frac = TRUE,
     show.score = FALSE,
     cell.frac.ci = TRUE,
     disable.cell.frac = FALSE,
     out.dir = 'output', out.format = 'pdf',
     overwrite.output = TRUE,
     width = 11, height = 7,
 panel.widths = c(3,4,2,6,6))
Error in x0[i] <- x1[which(x$branches == parent)] : 
  replacement has length zero

[Help..] Found 0 consensus model(s)

Hi, Dang, H. X.,

I'm trying to use ClonEvol, but I cannot get a consensus model when I run "infer.clonal.models".
In this example, I have 6 samples (3 primary, 3 metastasis). I ran PyClone and calculated VAF as
cellular_prevalence * 100 / 2, as recommended by the tutorial.
I filtered out clusters that had <5 variants and used the command below for infer.clonal.models.
I reviewed every issue here, but could not figure out why ClonEvol does not infer consensus models for this case.
I tried tuning every possible option, without success.
To be honest, I also tried some other cases (5~8 samples per case), but did not succeed in inferring a consensus clonal model for any of them.
Could you help me figure this out?
I have attached the data file, the code I used, and its output, along with the variant cluster plot.
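
For reference, the conversion described above can be written out as a short sketch. It assumes heterozygous variants in copy-neutral diploid regions, which is what the division by 2 implies; the prevalence values are made up for illustration.

```r
# Hypothetical PyClone cellular prevalence values (fractions between 0 and 1).
cellular_prevalence <- c(1.0, 0.6, 0.25)

# VAF in percent: half of the cells' alleles carry a heterozygous variant.
vaf <- 100 * cellular_prevalence / 2

print(vaf)
```

A variant present in every tumor cell therefore maps to a VAF of 50, which is the value the founding cluster would be expected to sit near under these assumptions.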

Thank you,

Young

====code=======

y = infer.clonal.models(variants = clonevol_012,
                        cluster.col.name = 'cluster',
                        vaf.col.names = vaf.col.names, ##for CCF, ccf.col.names could be used
                        sample.groups = NULL,
                        cancer.initiation.model='monoclonal',  ##could select "polyclonal"
                        subclonal.test = 'bootstrap',
                        subclonal.test.model = 'non-parametric', ## could be "beta-binomial"non-parametric
                        num.boots = 1000,
                        founding.cluster = 1, ## clusters of variants that are found in all samples,  drivers/founder clusters
                        cluster.center = 'mean',
                        ignore.clusters = NULL,
                        clone.colors = clone.colors,
                        ##seeding.aware.tree.pruning = TRUE, merge.similar.samples = TRUE,
                        min.cluster.vaf = 0.01,
                        score.model.by="prabability",
                        # min probability that CCF(clone) is non-negative
                        sum.p = 0.01,
                        # alpha level in confidence interval estimate for CCF(clone)
                        alpha=0.01)

======results===

Sample 1: M <-- M
Sample 2: Sm <-- Sm
Sample 3: S <-- S
Sample 4: Ov1 <-- Ov1
Sample 5: Ov2 <-- Ov2
Sample 6: Per <-- Per
Using monoclonal model
Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
M : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters:  
M : 40 clonal architecture model(s) found

Sm : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters:  
Sm : 32 clonal architecture model(s) found

S : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters:  
S : 7 clonal architecture model(s) found

Ov1 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 3 
Ov1 : 3 clonal architecture model(s) found

Ov2 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters:  
Ov2 : 16 clonal architecture model(s) found

Per : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters:  
Per : 3 clonal architecture model(s) found

Finding consensus models across samples...
Found  0 consensus model(s)
Found 0 consensus model(s)
Scoring models...
Pruning consensus clonal evolution trees....
Seeding aware pruning is:  off
Number of unique pruned consensus trees: 0

GCMET_012_clonevol.txt
012.pdf

ERROR: No clonal models for sample using clonevol

Hello,
I get the following error when running infer.clonal.models().

ERROR: No clonal models for sample: .
Check data or remove this sample, then re-run.

I have tried using more SNVs as input (about 430, and about 1500), but the error still appears.

CNVs also affect the VAFs (about 7 of 40 SNVs have a VAF close to 1). How can I solve this problem? I cannot remove these 7 SNVs because they are important.
I did, however, supply the CNV results when running sciClone.

I don't know how to choose the clonal model, so I tried both (polyclonal and monoclonal),
but the error still appears.

I am very confused
and hope for your reply and help.
Thank you.
Gao

question: specifying time-points & working with low VAFs

Hi. I'm trying to set up a sciClone-ClonEvol-fishplot workflow to derive the clonal evolution of my samples. My case study has 4 WES samples: a primary tumor (taken at time point 1) and three different regional recurrences (taken at time point 2), all from the same organ. Sequencing depth varies from 10X to 30X, and my VAFs top out at around 30. I ran sciClone on the samples, and when I feed its results into ClonEvol, the resulting model seems to assume that all samples are from the same time point. How can I specify the known time point of each sample so that ClonEvol can take it into account when inferring the model? Is that possible? Also, what do you recommend for low-purity tumors like these?

question about enumerating clonal architectures

Hello,

I ran into a problem when running clonevol on 6 samples whose variants were grouped into 11 clusters. It does not report any errors, but it has stayed in the state shown below for more than 24 hours. I wonder whether there is a limit on the number of clusters clonevol can handle.

There is no progress beyond the following output:

There were 14 warnings (use warnings() to see them)
null device
1
There were 14 warnings (use warnings() to see them)
null device
1
null device
1
Sample 1: TD25 <-- TD25
Sample 2: TD26 <-- TD26
Sample 3: TD27 <-- TD27
Sample 4: TD28 <-- TD28
Sample 5: TD29 <-- TD29
Sample 6: TD30 <-- TD30
Using monoclonal model
Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
TD25 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0
Non-positive VAF clusters

The code and input file are as follows:

x=read.table("/data/quantumclone/sci_clonevol_140_0123_no123.tsv", header=TRUE, sep="\t")
library(clonevol)

# preparation
# shorten vaf column names as they will be

vaf.col.names <- grep('.vaf', colnames(x), value=TRUE)
sample.names <- gsub('.vaf', '', vaf.col.names)
x[, sample.names] <- x[, vaf.col.names]
vaf.col.names <- sample.names

# prepare sample grouping

sample.groups <- c('1', '2', '3', '4', '5', '6');
names(sample.groups) <- vaf.col.names

# setup the order of clusters to display in various plots (later)

x <- x[order(x$cluster),]
clone.colors <- c('#999793', '#8d4891', '#f8e356', '#fe9536', '#d7352e', "#FF3030", '#FFE4E1', '#2F4F4F', '#191970', '#6495ED', '#7FFF00' )
#clone.colors <- NULL

pdf('box.pdf', width = 3, height = 3, useDingbats = FALSE, title='')
pp <- plot.variant.clusters(x,
cluster.col.name = 'cluster',
show.cluster.size = FALSE,
cluster.size.text.color = 'blue',
vaf.col.names = vaf.col.names,
vaf.limits = 70,
sample.title.size = 20,
violin = FALSE,
box = FALSE,
jitter = TRUE,
jitter.shape = 1,
jitter.color = clone.colors,
jitter.size = 3,
jitter.alpha = 1,
jitter.center.method = 'median',
jitter.center.size = 1,
jitter.center.color = 'darkgray',
jitter.center.display.value = 'none',
highlight = 'is.driver',
highlight.shape = 21,
highlight.color = 'blue',
highlight.fill.color = 'green',
highlight.note.col.name = 'gene',
highlight.note.size = 2,
order.by.total.vaf = FALSE)
dev.off()

# plot clusters pairwise-ly

plot.pairwise(x, col.names = vaf.col.names,
out.prefix = 'variants.pairwise.plot',
colors = clone.colors)

# plot mean/median of clusters across samples (cluster flow)

pdf('flow.pdf', width=3, height=3, useDingbats=FALSE, title='')
plot.cluster.flow(x, vaf.col.names = vaf.col.names,
sample.names = c('1', '2', '3', '4', '5', '6'),
colors = clone.colors)
dev.off()

# infer consensus clonal evolution trees

y = infer.clonal.models(variants = x,
cluster.col.name = 'cluster',
vaf.col.names = vaf.col.names,
sample.groups = sample.groups,
cancer.initiation.model='monoclonal',
subclonal.test = 'bootstrap',
subclonal.test.model = 'non-parametric',
num.boots = 1000,
founding.cluster = 1,
cluster.center = 'mean',
ignore.clusters = NULL,
clone.colors = clone.colors,
min.cluster.vaf = 0.00,
# min probability that CCF(clone) is non-negative
sum.p = 0.05,
# alpha level in confidence interval estimate for CCF(clone)
alpha = 0.05)

# map driver events onto the trees

y <- transfer.events.to.consensus.trees(y,
x[x$is.driver,],
cluster.col.name = 'cluster',
event.col.name = 'gene')

# prepare branch-based trees

y <- convert.consensus.tree.clone.to.branch(y, branch.scale = 'sqrt')

# plot variant clusters, bell plots, cell populations, and trees

plot.clonal.models(y,
# box plot parameters
box.plot = TRUE,
fancy.boxplot = TRUE,
fancy.variant.boxplot.highlight = 'is.driver',
fancy.variant.boxplot.highlight.shape = 21,
fancy.variant.boxplot.highlight.fill.color = 'red',
fancy.variant.boxplot.highlight.color = 'black',
fancy.variant.boxplot.highlight.note.col.name = 'gene',
fancy.variant.boxplot.highlight.note.color = 'blue',
fancy.variant.boxplot.highlight.note.size = 2,
fancy.variant.boxplot.jitter.alpha = 1,
fancy.variant.boxplot.jitter.center.color = 'grey50',
fancy.variant.boxplot.base_size = 12,
fancy.variant.boxplot.plot.margin = 1,
fancy.variant.boxplot.vaf.suffix = '.VAF',
# bell plot parameters
clone.shape = 'bell',
bell.event = TRUE,
bell.event.label.color = 'blue',
bell.event.label.angle = 60,
clone.time.step.scale = 1,
bell.curve.step = 2,
# node-based consensus tree parameters
merged.tree.plot = TRUE,
tree.node.label.split.character = NULL,
tree.node.shape = 'circle',
tree.node.size = 30,
tree.node.text.size = 0.5,
merged.tree.node.size.scale = 1.25,
merged.tree.node.text.size.scale = 2.5,
merged.tree.cell.frac.ci = FALSE,
# branch-based consensus tree parameters
merged.tree.clone.as.branch = TRUE,
mtcab.event.sep.char = ',',
mtcab.branch.text.size = 1,
mtcab.branch.width = 0.75,
mtcab.node.size = 3,
mtcab.node.label.size = 1,
mtcab.node.text.size = 1.5,
# cellular population parameters
cell.plot = TRUE,
num.cells = 100,
cell.border.size = 0.25,
cell.border.color = 'black',
clone.grouping = 'horizontal',
#meta-parameters
scale.monoclonal.cell.frac = TRUE,
show.score = FALSE,
cell.frac.ci = TRUE,
disable.cell.frac = FALSE,
# output figure parameters
out.dir = 'output',
out.format = 'pdf',
overwrite.output = TRUE,
width = 8,
height = 4,
# vector of width scales for each panel from left to right
panel.widths = c(3,4,2,4,2))

# plot trees only

pdf('trees.pdf', width = 3, height = 5, useDingbats = FALSE)
plot.all.trees.clone.as.branch(y, branch.width = 0.5,
node.size = 1, node.label.size = 0.5)
dev.off()

Inputfiles:
sci_clonevol_140_0123_no123.txt

I look forward to your help!

what if you do not have driver event and gene information?

Hi,

For my data, I only have CCF, VAF, cluster, and mutation location. I do not have an "is.driver" or "gene" column. Therefore, I cannot map driver events (transfer.events.to.consensus.trees) or convert to branch-based trees (convert.consensus.tree.clone.to.branch).

Then, in step 5 (Visualizing the results), what kind of "y" should I use to obtain results like figure 3 in the tutorial? I tried using the y from infer.clonal.models, but it does not give me results like figure 3.

Thank you for your help!

ERROR: No clonal models for sample, if founding.cluster = NULL

Hi, I am trying to run ClonEvol on sciClone results and I get "ERROR: No clonal models for sample" when founding.cluster = NULL. I looked through the code in the ClonEvol.R file, and I suspect the problem is that on line 2211 founding.cluster is converted from NULL to character(0) by founding.cluster = as.character(founding.cluster). It is then passed (as character(0)) to the enumerate.clones function on line 2292, which defeats the if(is.null(founding.cluster)) check on line 350 that should set the cluster with the highest VAF as the founding cluster.
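
The coercion described above is easy to confirm in base R. The sketch below first shows that `as.character(NULL)` is no longer `NULL`, then illustrates one possible guard. The helper names `is.unset` and `pick.founding.cluster` and the toy `cluster.vaf` values are hypothetical and are not part of ClonEvol.

```r
founding.cluster <- NULL
coerced <- as.character(founding.cluster)

print(is.null(coerced))  # FALSE: the value is now character(0), not NULL
print(length(coerced))   # 0

# Hypothetical guard that treats both NULL and zero-length input as "not set".
is.unset <- function(x) is.null(x) || length(x) == 0

# Toy cluster centers (made-up numbers), named by cluster ID.
cluster.vaf <- c("1" = 12.5, "2" = 45.0, "3" = 30.1)

# Hypothetical fallback: use the cluster with the highest VAF when unset.
pick.founding.cluster <- function(founding.cluster, cluster.vaf) {
  if (is.unset(founding.cluster)) {
    names(cluster.vaf)[which.max(cluster.vaf)]
  } else {
    as.character(founding.cluster)
  }
}

print(pick.founding.cluster(NULL, cluster.vaf))
print(pick.founding.cluster(coerced, cluster.vaf))
print(pick.founding.cluster(7, cluster.vaf))
```

Checking the length as well as `is.null()` makes the fallback robust to the coercion, wherever in the call chain it happens.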

plot.all.trees.clone.as.branch -bug with branches name

Hi, I found a bug in plot.all.trees.clone.as.branch(). When the founding clone is not 1, plotting the tree fails.

My code is as follows:

#The default founding clone is 7, rather than 1 
y = infer.clonal.models(variants = mutdata,
   cluster.col.name = cluster.col.name,
   ccf.col.names = vaf.col.names,
   cancer.initiation.model="monoclonal",
   subclonal.test = "bootstrap",
   subclonal.test.model = "non-parametric",
   num.boots = 2000,
   founding.cluster = 7)

plot.all.trees.clone.as.branch(y, branch.width = 0.5,
    node.size = 1, node.label.size = 0.5)

leading to this error.

Error in x0[i] <- x1[which(x$branches == parent)]

Solution is here:

The sub-function germinate of plot.all.trees.clone.as.branch removes the "Y" branch with the code below. The original code assumes that the first branch is "Y", but in some cases the "Y" branch is not the first one.

#The first branch is not "Y"
y$matched$merged.trees[[1]]$branches
[1] "1"   "2"   "3"   "20"  "200" "Y"  

#your original code of germinate()
    x <- list(trunk.height=x$length[1],
              branches=x$branches[-1],
              lengths=x$lengths[-1],
              branch.colors=x$branch.colors[-1],
              branch.border.colors=x$branch.border.colors[-1],
              branch.border.linetypes=x$branch.border.linetypes[-1],
              branch.border.widths=x$branch.border.widths[-1],
              node.colors=x$node.colors[-1],
              node.border.colors=x$node.border.colors[-1],
              node.border.widths=x$node.border.widths[-1],
              node.labels=x$node.labels[-1],
              node.texts=x$node.texts[-1],
              branch.texts=x$branch.texts[-1])

The following code fixes this error.

 #update code in germinate()   
id = which(x$branches == "Y")

    x <- list(trunk.height=x$length[1],
              branches=x$branches[-id],
              lengths=x$lengths[-id],
              branch.colors=x$branch.colors[-id],
              branch.border.colors=x$branch.border.colors[-id],
              branch.border.linetypes=x$branch.border.linetypes[-id],
              branch.border.widths=x$branch.border.widths[-id],
              node.colors=x$node.colors[-id],
              node.border.colors=x$node.border.colors[-id],
              node.border.widths=x$node.border.widths[-id],
              node.labels=x$node.labels[-id],
              node.texts=x$node.texts[-id],
              branch.texts=x$branch.texts[-id])
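
The difference between the two versions can be seen on plain vectors. This is a minimal sketch, not the package's code: `branches` is copied from the report above, while `lengths` holds made-up numbers. It also shows a caveat of negative indexing with `which()`: if nothing matches, every element is dropped, so a logical mask is the safer form.

```r
branches <- c("1", "2", "3", "20", "200", "Y")  # from the report above
lengths  <- c(3, 2, 2, 1, 1, 5)                 # hypothetical branch lengths

# Original approach: drop the first element, assuming it is "Y".
by.position <- branches[-1]   # wrongly drops "1" and keeps "Y"

# Proposed fix: drop the element that actually is "Y".
id <- which(branches == "Y")
by.match <- branches[-id]

# Caveat: when nothing matches, which() returns integer(0) and
# x[-integer(0)] selects nothing at all.
no.match <- which(branches == "Z")
dropped.everything <- branches[-no.match]

# A logical mask behaves sensibly in both situations.
keep <- branches != "Y"
safe <- branches[keep]

# The trunk is the "Y" branch, so its length is arguably lengths[!keep]
# rather than the first element (an assumption about the intended behavior).
trunk.height <- lengths[!keep]

print(by.position)
print(by.match)
print(dropped.everything)
print(safe)
print(trunk.height)
```

Note that the proposed fix above still reads the trunk height from the first element; if the trunk is meant to be the "Y" branch, that value may need the same index.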

Thanks for a very useful package!

Qingjian Chen

Sun Yat-Sen University Cancer Center (SYSUCC)

cancer.initiation.model parameter in infer.clonal.models()

Hi, thanks for the wonderful tool you developed!
Can you help me with an issue? Using your example data from aml1$variants and the code provided in the manual, everything worked except infer.clonal.models(): when I set the "cancer.initiation.model" parameter to "polyclonal", the following error occurred:

Finding consensus models across samples...
Found 1 consensus model(s)
Generating consensus clonal evolution trees across samples...
Found 1 consensus model(s)
Scoring models...
Error in `$<-.data.frame`(`*tmp*`, "clone.ccf.combined.p", value = c(0, :
  replacement has 6 rows, data has 5

I would really appreciate your help. Thanks again!
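
For readers unfamiliar with this message, it is what base R reports whenever a column of the wrong length is assigned to a data frame. The sketch below reproduces only that class of error on a toy data frame; it does not diagnose where the sixth value comes from inside ClonEvol, and the column values are made up.

```r
# Toy table with 5 rows, standing in for a per-clone table.
clones <- data.frame(clone = 1:5)

# Assigning 6 values to a 5-row data frame fails with the same message.
msg <- tryCatch({
  clones$clone.ccf.combined.p <- c(0, 0, 0, 0, 0, 0)
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```

In practice this points to a mismatch between the number of clones in the table and the number of p-values computed for them.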

tree plotting error

Hello:

I am trying to use clonevol on my data (13 samples and 8 clusters). infer.clonal.models ran successfully without any warning or error, but plot.clonal.models and plot.all.trees.clone.as.branch both fail with the same error:

plot.all.trees.clone.as.branch(y, branch.width = 0.5, node.size = 1, node.label.size = 0.5)
Error in x0[i] <- x1[which(x$branches == parent)] :
replacement has length zero

I extracted the branch information from the 6 resulting models:

(y$matched$merged.trees[[1]])$branches
[1] "100" "1" "10" "Y" "2" "3" "4"
(y$matched$merged.trees[[2]])$branches
[1] "101" "1" "10" "Y" "2" "3" "102"
(y$matched$merged.trees[[3]])$branches
[1] "101" "1" "10" "Y" "2" "102" "3"
(y$matched$merged.trees[[4]])$branches
[1] "101" "1" "10" "Y" "102" "2" "3"
(y$matched$merged.trees[[5]])$branches
[1] "101" "1" "10" "Y" "102" "2" "103"
(y$matched$merged.trees[[6]])$branches
[1] "101" "1" "10" "Y" "102" "103" "2"

Can you let me know why the tree plot failed? Thanks.
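
One thing the listings above have in common is that "Y" is never the first branch. A quick way to check this across all trees is sketched below, using the six branch vectors copied from the output above. With a real result the vectors would come from `y$matched$merged.trees[[i]]$branches`; the list `trees` here is just a stand-in.

```r
# Branch vectors copied from the six models listed above.
trees <- list(
  c("100", "1", "10", "Y", "2", "3", "4"),
  c("101", "1", "10", "Y", "2", "3", "102"),
  c("101", "1", "10", "Y", "2", "102", "3"),
  c("101", "1", "10", "Y", "102", "2", "3"),
  c("101", "1", "10", "Y", "102", "2", "103"),
  c("101", "1", "10", "Y", "102", "103", "2")
)

# Position of the "Y" branch in each tree.
y.position <- sapply(trees, function(b) which(b == "Y"))
print(y.position)

# TRUE for trees where code that assumes "Y" comes first would misbehave.
print(y.position != 1)
```

If the plotting code drops the first branch on the assumption that it is "Y", as another report in this list suggests, every one of these trees would be affected.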

Error in plot.clonal.models

I get the following output when running the plot.clonal.models pipeline:

Sample 1: Xvaf <-- Xvaf
Sample 2: dup <-- dup
Using monoclonal model
Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
Xvaf : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters:
Xvaf : 1 clonal architecture model(s) found

dup : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters:
dup : 1 clonal architecture model(s) found

Finding consensus models across samples...
Found 1 consensus model(s)
Generating consensus clonal evolution trees across samples...
Found 1 consensus model(s)
Scoring models...
Pruning consensus clonal evolution trees....
Seeding aware pruning is: off
Number of unique pruned consensus trees: 1
Error in polygon(xx, yy, col = fill.color[i], border = border.color[i], :
invalid line type: must be length 2, 4, 6 or 8
In addition: Warning messages:
1: Removed 1 rows containing non-finite values (stat_summary).
2: Removed 1 rows containing missing values (geom_point).
3: Removed 1 rows containing non-finite values (stat_summary).
4: Removed 1 rows containing missing values (geom_point).

I use the following command:

plot.clonal.models(y,
box.plot = TRUE,
fancy.boxplot = TRUE,
fancy.variant.boxplot.jitter.alpha = 1,
fancy.variant.boxplot.jitter.center.color = "grey50",
fancy.variant.boxplot.base_size = 12,
fancy.variant.boxplot.plot.margin = 1,
fancy.variant.boxplot.vaf.suffix = ".VAF",
clone.shape = "bell",
bell.event = FALSE,
clone.time.step.scale = 1,
bell.curve.step = 2,
merged.tree.plot = TRUE,
tree.node.label.split.character = NULL,
tree.node.shape = "circle",
tree.node.size = 30,
tree.node.text.size = 0.5,
merged.tree.node.size.scale = 1.25,
merged.tree.node.text.size.scale = 2.5,
merged.tree.cell.frac.ci = FALSE,
merged.tree.clone.as.branch = TRUE,
mtcab.branch.text.size = 2,
mtcab.branch.width = 0.75,
mtcab.node.size = 3,
mtcab.node.label.size = 1,
mtcab.node.text.size = 1.5,
cell.plot = TRUE,
num.cells = 100,
cell.border.size = 0.25,
cell.border.color = "black",
clone.grouping = "horizontal",
scale.monoclonal.cell.frac = TRUE,
show.score = FALSE,
cell.frac.ci = TRUE,
disable.cell.frac = FALSE,
out.dir = paste0("~/Documents/2019_09_06_Daniel/", filenames[i]),
out.format = "pdf",
overwrite.output = TRUE,
width = 8,
height = 4,
panel.widths = c(3, 4, 2, 4, 2))
dev.off()

versions:
clonEvol v0.99.11
R v. 3.6.1

cluster.counts

Hi there, can anyone help?
I have two things I'd like to add to my cluster.counts data file:

1. The cluster number assigned by ClonEvol (1, 2, 3, 4, etc.); currently my cluster.counts data table only has the PyClone cluster ID.
2. The median CCF per sample (currently there is just one median.ccf for each cluster).
[LM002_cluster.counts.xlsx]

Attached example of my cluster.counts file
(https://github.com/hdng/clonevol/files/2533239/LM002_cluster.counts.xlsx)

Here is my script:

pyclone.directory <- '/Users/amandafitzpatrick/Library/Mobile Documents/com~~apple~~CloudDocs/DOCUMENTS/E57 exome sequencing/2018-08-30_results_ascat_pyclone/pyclone'
output.directory <- '/Users/amandafitzpatrick/Library/Mobile Documents/com~~apple~~CloudDocs/DOCUMENTS/E57 exome sequencing/2018-08-30_results_ascat_pyclone'
sample.sheet.file <- 'sample_annotation.txt'

min.mutation.count <- 30
cancer.genes <- scan('/Users/amandafitzpatrick/Library/Mobile Documents/com~~apple~~CloudDocs/DOCUMENTS/E57 exome sequencing/2018-08-30_results/Exome Sequencing/COMBINED list Stratton plus Caldas.txt', what = character())
patient.id <- 'LM002'

loci.file <- file.path(pyclone.directory, patient.id, 'output', 'tables', 'annotated_loci.tsv')
loci <- read.table(loci.file, header = TRUE, sep = '\t', stringsAsFactors = FALSE)

sample.sheet <- read.table(sample.sheet.file, header = TRUE, sep = '\t', stringsAsFactors = FALSE)

clonevol.data <- loci %>%
mutate(
vaf = 100*cellular_prevalence/2,
is.driver = symbol %in% cancer.genes & 'exonic' == func & 'synonymous_SNV' != exonic_func
) %>%
select(mutation_id, cluster_id, sample_id, vaf, symbol, is.driver) %>%
spread(sample_id, vaf);

n.samples <- length( unique(loci$sample_id) )
if( 1 == n.samples ) stop('Need more than one sample for ClonEvol!')

cluster.counts <- loci %>%
group_by(cluster_id) %>%
summarize(
count = n()/n.samples,
min.ccf = min(cellular_prevalence),
median.ccf = median(cellular_prevalence),
mean.ccf = mean(cellular_prevalence)
) %>%
ungroup() %>%
filter(count >= min.mutation.count) %>%
arrange(-median.ccf)

recode.values <- 1:nrow(cluster.counts)
names(recode.values) <- as.character(cluster.counts$cluster_id)

clonevol.data <- clonevol.data %>%
select(-mutation_id) %>%
filter(cluster_id %in% cluster.counts$cluster_id) %>%
mutate(cluster = recode.values[ as.character(cluster_id) ] )
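
One possible way to get both additions, sketched in base R on a toy `loci` table. The toy values, the `median.ccf.<sample>` column names and the final `cluster` column are illustrative assumptions, not part of the script above; the ranking by decreasing median CCF mirrors the `arrange(-median.ccf)` and `recode.values` steps.

```r
# Toy stand-in for the PyClone loci table (values are made up for illustration).
loci <- data.frame(
  mutation_id = rep(c("m1", "m2", "m3", "m4"), each = 2),
  cluster_id = rep(c(0, 0, 3, 3), each = 2),
  sample_id = rep(c("S1", "S2"), times = 4),
  cellular_prevalence = c(0.9, 0.8, 1.0, 0.9, 0.2, 0.5, 0.3, 0.6),
  stringsAsFactors = FALSE
)

# Median CCF for every cluster (rows) in every sample (columns).
per.sample <- tapply(loci$cellular_prevalence,
                     list(loci$cluster_id, loci$sample_id), median)
colnames(per.sample) <- paste0("median.ccf.", colnames(per.sample))

# Overall median per cluster, in the same sorted cluster order as per.sample.
overall <- tapply(loci$cellular_prevalence, loci$cluster_id, median)

cluster.counts <- data.frame(
  cluster_id = as.numeric(rownames(per.sample)),
  median.ccf = as.vector(overall),
  per.sample,
  row.names = NULL
)

# Rank clusters by decreasing overall median CCF and keep that rank as the
# 1..n cluster number that is later passed to ClonEvol.
cluster.counts <- cluster.counts[order(-cluster.counts$median.ccf), ]
cluster.counts$cluster <- seq_len(nrow(cluster.counts))
print(cluster.counts)
```

In the real script the `min.mutation.count` filter would still need to be applied before the ranking, so that the numbering matches `recode.values`.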

Patients with only one sample

In the vignette, it says "ClonEvol requires an input data frame consisting of at least a cluster column and one or more variant cellular prevalence columns, each corresponds to a sample." However, when I try to run the functions for a patient with only one sample (so the data frame only has two columns, one for the cluster number and one for the VAF value), I run into errors that appear to be unresolvable.

too many models across samples

Hi
I am using clonevol to analyze the output from pyclone. It is WES data and the read depth is not high enough, so I filtered out all clusters with <10 mutations.
However, when I run clonevol, it infers too many models across samples and I do not know how to choose between them. It reports 3123 models. Are that many models always inferred for your data? And what do the 5 unique trees mean? I just want to plot the most likely phylogenetic tree.
Attached is my pairwise figure. Below is my output.
Thanks!

Sample 1: HCC772_1_1 <-- HCC772_1_1
Sample 2: HCC772_1_3 <-- HCC772_1_3
Sample 3: HCC772_2_1 <-- HCC772_2_1
Sample 4: HCC772_2_2 <-- HCC772_2_2
Sample 5: HCC772_2_3 <-- HCC772_2_3
Sample 6: HCC772_3_1 <-- HCC772_3_1
Sample 7: HCC772_3_2 <-- HCC772_3_2
Using monoclonal model
Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
HCC772_1_1 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.05
Non-positive VAF clusters: 9,10,5,4,6,7
HCC772_1_1 : 26 clonal architecture model(s) found

HCC772_1_3 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.05
Non-positive VAF clusters: 9,8,5,6,4,7
HCC772_1_3 : 27 clonal architecture model(s) found

HCC772_2_1 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.05
Non-positive VAF clusters: 7,6,4,5,10,8
HCC772_2_1 : 23 clonal architecture model(s) found

HCC772_2_2 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.05
Non-positive VAF clusters: 6,8,10,5,7,9
HCC772_2_2 : 31 clonal architecture model(s) found

HCC772_2_3 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.05
Non-positive VAF clusters: 8,6,10,4,5,9
HCC772_2_3 : 26 clonal architecture model(s) found

HCC772_3_1 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.05
Non-positive VAF clusters: 7,4,10,8,5,9
HCC772_3_1 : 28 clonal architecture model(s) found

HCC772_3_2 : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.05
Non-positive VAF clusters: 7,4,10,9,6,8
HCC772_3_2 : 17 clonal architecture model(s) found

Finding matched clonal architecture models across samples...
Found 3123 compatible model(s)
Merging clonal evolution trees across samples...
Found 3123 compatible evolution models
Pruning merged clonal evolution trees....
Number of unique pruned trees: 5
Scoring models...
2823 model(s) with p-value <= 0.01
variants.pairwise.plot.scatter.1-page.pdf
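
A rough way to see why the count can get large, using only the numbers in the log above. This is a sketch under the assumption that a cross-sample model is built by picking one per-sample model from every sample and keeping the combinations that are compatible; that reading comes from the log messages, not from the package documentation.

```r
# Per-sample model counts copied from the log above.
per.sample.models <- c(HCC772_1_1 = 26, HCC772_1_3 = 27, HCC772_2_1 = 23,
                       HCC772_2_2 = 31, HCC772_2_3 = 26, HCC772_3_1 = 28,
                       HCC772_3_2 = 17)

# Upper bound on combinations if one model is picked per sample.
candidate.combinations <- prod(per.sample.models)

# Number of compatible models reported by the log.
compatible <- 3123
fraction.kept <- compatible / candidate.combinations

print(candidate.combinations)
print(fraction.kept)
```

Under that assumption, 3123 is a small fraction of the possible combinations, and the "5 unique pruned trees" line suggests that many of those models collapse to the same tree once pruned.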

Error in merge.clone.trees

Hi,

First of all, thanks for distributing the software.

I generated the input for clonevol using pyclone-vi and ran the 'infer tree' process of the clonevol, but the following error message occurred.

# infer tree
> y = infer.clonal.models(variants = input,
                         cluster.col.name = 'cluster',
                         vaf.col.names = vaf.col.names,
                         sample.groups = NULL,
                         cancer.initiation.model='monoclonal',
                         subclonal.test = 'bootstrap',
                         subclonal.test.model = 'non-parametric',
                         num.boots = 1000,
                         founding.cluster = 1,
                         cluster.center = 'mean',
                         ignore.clusters = NULL,
                         clone.colors = NULL,
                         min.cluster.vaf = 0.01,
                         sum.p = 0.05,
                         alpha = 0.05)
Sample 1: ADK <-- ADK
Sample 2: NET <-- NET
Using monoclonal model
Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
ADK : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 1,4,5,3,2 
ADK : 1 clonal architecture model(s) found

NET : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters: 4,2,1,3,5 
NET : 1 clonal architecture model(s) found

Finding consensus models across samples...
Found  1 consensus model(s)
Generating consensus clonal evolution trees across samples...
Error in merge.clone.trees(m, samples = samples, sample.groups, merge.similar.samples = merge.similar.samples) : 
  ERROR: Something wrong. No clones left after filter. They might have been excluded.

I had just two samples (ADK and NET) and I attached input file for clonevol.
20220207_clonevol_input_github.txt

Please let me know why this error occurred.

Thanks!

tree plot error2

Hello:

I am using the plot.tree.clone.as.branch function to generate the tree, but I found that no tree is generated if the branches look like this:

tree_data %>% select(lab,parent,branches)
   lab parent branches
1    1     13      021
2    2     14     0221
3    3     14     0222
10  10     -1        Y
5    5     16    01221
6    6     16    01222
7    7     15     0121
8    8     11     0111
9    9     11     0112
11  11     12      011
12  12     17       01
13  13     17       02
14  14     13      022
15  15     12      012
16  16     15     0122
17  17     10        0

plot.tree.clone.as.branch(tree_data,angle = 30,node.size =0,branch.text.size = 0.5,text.angle = 0,node.text.size = 1,branch.width=0.5)
Error in x0[i] <- x1[which(x$branches == parent)] :
replacement has length zero

Could you let me know how to fix this?

Here is the data frame of tree_data.
tree_data.RData.zip
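
The error line looks up a parent branch by its code. A minimal base R check on the posted table, assuming (this is inferred from the error line, not confirmed from the package source) that the parent's code is expected to be the branch code minus its last character:

```r
# The lab / parent / branches columns as posted above.
tree_data <- data.frame(
  lab = c(1, 2, 3, 10, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17),
  parent = c(13, 14, 14, -1, 16, 16, 15, 11, 11, 12, 17, 17, 13, 12, 15, 10),
  branches = c("021", "0221", "0222", "Y", "01221", "01222", "0121", "0111",
               "0112", "011", "01", "02", "022", "012", "0122", "0"),
  stringsAsFactors = FALSE
)

# Assumed convention: parent code = own code without the last character.
prefix <- substr(tree_data$branches, 1, nchar(tree_data$branches) - 1)
is.root <- tree_data$parent == -1

# Non-root rows whose assumed parent code does not exist in the table.
orphans <- tree_data[!is.root & !(prefix %in% tree_data$branches), ]
print(orphans)
```

On the posted data this flags only node 17: its code `0` has no parent code in the table, while the root is labelled `Y`. Whether relabelling the branches so that they extend the root's code would satisfy the plotting function is untested here.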

How to use option ignore.clusters in infer.clonal.models

Hi,
I have a question on how to set the 'ignore.clusters' parameter of the infer.clonal.models function.
First, I set it to a single cluster number, e.g. 3, but that does not work.
Then I defined a vector, e.g. IGNORE=c(3,4), and passed it as ignore.clusters=IGNORE, but that reports an error too.

Thanks for any advice.

Best,
Xiong
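
As a possible workaround while `ignore.clusters` is not behaving, the unwanted clusters could be dropped from the input before calling infer.clonal.models. This is a sketch in base R on made-up data; the column names and values are hypothetical, and the renumbering step assumes cluster IDs need to be contiguous integers starting at 1.

```r
# Hypothetical input: one cluster column and one VAF column (in percent).
x <- data.frame(
  cluster = c(1, 1, 2, 3, 3, 4, 5),
  P.vaf = c(48, 50, 30, 2, 3, 1, 12)
)

# Clusters to leave out of the analysis.
drop <- c(3, 4)
kept <- x[!(x$cluster %in% drop), ]

# Renumber the remaining clusters as 1..n, preserving their original order.
kept$cluster <- match(kept$cluster, sort(unique(kept$cluster)))
print(kept)
```

The founding cluster should stay in the data and keep the number that is passed as `founding.cluster`.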

PyClone-vi output

Dear Dr Dang,

The new version of PyClone (VI) gives the same CCF value in all variants sharing the same cluster per sample. Quoting the author of PyClone: "This is expected. The CCF quoted is the mean value of the cluster the mutation is assigned to. This differs from PyClone where we compute the mean value of the CCF across the MCMC samples. The latter better represents uncertainty over clustering, but I suspect it makes little difference in practice." This makes the box plots flat looking, since there's no variation inside clusters per sample (PDF attached). I believe this is not an issue for ClonEvol since the "infer.clonal.models" function takes precisely the mean value. I just want to be sure of it.

Thank you very much for your time and for this amazing tool.
Box_4Mets.pdf

questions on the tree

Hi Ha,

I am running clonevol on 3 samples: primary, metastasis and relapse. To account for copy number aberrations, we used PyClone to do the clustering.

I have a few questions about the plotted tree.

1. Sample 4191 is the relapse. Why is this sample in the ancestor? Shouldn't the primary be in the ancestor?
2. How can I increase the distance between the tree branches? They are so close to each other that the labels are sometimes hard to tell apart.
3. What does the branch color represent?

I also have one question about clonal mutations. Based on the variant clusters by CCF, clusters 1, 2 and 3 have a CCF of almost 100% in the primary, but in 4191 (relapse) the mutations in clusters 2 and 3 have a CCF of almost 0. If the mutations in clusters 1, 2 and 3 are clonal, then the CCF of clusters 2 and 3 in 4191 should not be zero or very low. If the mutations in clusters 2 and 3 are not clonal, then it is unclear why their CCF is as high as 1 in the primary. I also noticed similar results in your paper.

Looking forward to hearing from you.

Thanks,
Emily

pyclone cellular_prevalence or just VAF for clonevol

Hi,
I have finished running pyclone and want to use clonevol for downstream analysis.
Should I use the cellular_prevalence or just the VAF for clonevol?

Thanks!
Tommy
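
For reference, other posts on this page convert PyClone's cellular_prevalence to a VAF-like value by halving it (`vaf = 100*cellular_prevalence/2` in one script, and the log line "Calculate VAF as CCF/2" when `ccf.col.names` is used). A minimal sketch of that conversion, with made-up CCF values:

```r
# Hypothetical cellular_prevalence (CCF) values as proportions in 0..1.
ccf <- c(1.00, 0.62, 0.15)

# VAF equivalent in percent: half the CCF, scaled to 0..100.
vaf.percent <- 100 * ccf / 2
print(vaf.percent)
```

The halving treats every variant as heterozygous in a diploid region, which is why the copy-number-adjusted cellular_prevalence rather than the raw VAF is the starting point in those scripts.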

Error : “attempt to select less than one element in get1index”

I'm a beginner in R and I'm trying to use the package ClonEvol, however the documentation on the github webpage is very limited. So for now I'm using their example code and trying to adapt it to my data called ce.

ce <- data.frame(
 cluster = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,6,6,6,6,7,7,7,7),
 gene = c("geneA","geneB","geneC","geneD","geneA","geneB","geneC","geneD","geneA","geneB","geneC","geneD","geneA","geneB","geneC",
"geneD","geneA","geneB","geneC","geneD","geneA","geneB","geneC","geneD","geneA","geneB","geneC","geneD"),
 prim.vaf = c(0.5,0,0,0,0.5,0.5,0,0,1,0.5,0,0,1,0.5,0,0.5,0.5,0.5,0,0.5,0.5,0.5,0,1,0.5,0.5,0.5,0)
       )

   cluster <- ce$cluster
   gene <- ce$gene
   prim.vaf <- ce$prim.vaf

   x <- ce

   vaf.col.names <- grep('prim.vaf', colnames(x), value=T)
   sample.names <- gsub('prim.vaf', '', vaf.col.names)
   x[, sample.names] <- x[, vaf.col.names]
   vaf.col.names <- sample.names
   sample.groups <- c('P', 'R');
   names(sample.groups) <- vaf.col.names
   x <- x[order(x$cluster),]

   pdf('box.pdf', width = 3, height = 5, useDingbats = FALSE, title='')
   pp <- variant.box.plot(x,
  cluster.col.name = ce$cluster,
  show.cluster.size = FALSE,
  cluster.size.text.color = 'blue',
  vaf.col.names = vaf.col.names,
  vaf.limits = 70,
  sample.title.size = 20,
  violin = FALSE,
  box = FALSE,
  jitter = TRUE,
  jitter.shape = 1,
  jitter.color = clone.colors,
  jitter.size = 3,
  jitter.alpha = 1,
  jitter.center.method = 'median',
  jitter.center.size = 1,
  jitter.center.color = 'darkgray',
  jitter.center.display.value = 'none',
  highlight = 'is.driver',
  highlight.note.col.name = 'gene',
  highlight.note.size = 2,
  highlight.shape =16,
  order.by.total.vaf = FALSE
   )
   dev.off()

However, I get the following error :
Error in .subset2(x, i, exact = exact) : recursive indexing failed at level 2

And if I delete cluster.col.name=ce$cluster and vaf.col.names=vaf.col.names, the error becomes the following :

    Error in .subset2(x, i, exact = exact) : attempt to select less than one     
    element in get1index

Does anyone have any idea what went wrong?
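
Two things in the posted code can be reproduced in base R without ClonEvol. The sketch below uses a shortened copy of `ce`. It assumes the function indexes the data frame with whatever is given as `cluster.col.name`, which appears to be what the first error message is reporting.

```r
# Shortened copy of the data frame from the post.
ce <- data.frame(cluster = c(1, 1, 2, 2), prim.vaf = c(0.5, 0, 0.5, 0.5))

# Passing the column itself: a length-4 vector used as a `[[` index fails.
by.vector <- tryCatch(ce[[ce$cluster]], error = function(e) e)
print(conditionMessage(by.vector))

# Passing the column name: a single string, which `[[` accepts.
by.name <- ce[["cluster"]]
print(by.name)

# Stripping the entire column name leaves an empty sample name.
sample.names <- gsub("prim.vaf", "", "prim.vaf")
print(nchar(sample.names))
```

So `cluster.col.name = 'cluster'` (the name as a string) is probably what the function expects, and the `gsub` pattern should remove only a suffix such as `.vaf` so that the sample name is not empty. The call also refers to `clone.colors` and an `is.driver` column, neither of which is defined in the posted snippet.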

Can I still use clonevol on my data

I have 50 patients who have responded to chemotherapy: 17 samples are pre-treatment biopsies and 33 are post-treatment resections. The pre- and post-treatment samples are not matched. I want to identify resistant clones. Can I still use your software?

Thank you

to repeat the example in the ClonEvol Tutorial by RGui4.0.2

First-time user.
I am running the code in Dr. Dang's tutorial step by step, but some errors always come up. I need help! Thanks in advance.

R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

library(clonevol)
library(devtools)
Loading required package: usethis

data(aml1)
x <- aml1$variants

#shorten vaf column names as they will be
vaf.col.names <- grep('.vaf', colnames(x), value=T)
sample.names <- gsub('.vaf', '', vaf.col.names)
x[, sample.names] <- x[, vaf.col.names]
vaf.col.names <- sample.names

# prepare sample grouping

sample.groups <- c('P', 'R');
names(sample.groups) <- vaf.col.names

# setup the order of clusters to display in various plots (later)

x <- x[order(x$cluster),]

clone.colors <- c('#999793', '#8d4891', '#f8e356', '#fe9536', '#d7352e')

pdf('box.pdf', width = 3, height = 3, useDingbats = FALSE, title='')
pp <- plot.variant.clusters(x,

cluster.col.name = 'cluster',
show.cluster.size = FALSE,
cluster.size.text.color = 'blue',
vaf.col.names = vaf.col.names,
vaf.limits = 70,
sample.title.size = 20,
violin = FALSE,
box = FALSE,
jitter = TRUE,
jitter.shape = 1,
jitter.color = clone.colors,
jitter.size = 3,
jitter.alpha = 1,
jitter.center.method = 'median',
jitter.center.size = 1,
jitter.center.color = 'darkgray',
jitter.center.display.value = 'none',
highlight = 'is.driver',
highlight.shape = 21,
highlight.color = 'blue',
highlight.fill.color = 'green',
highlight.note.col.name = 'gene',
highlight.note.size = 2,
order.by.total.vaf = FALSE)
Warning messages:
1: fun.y is deprecated. Use fun instead.
2: show_guide has been deprecated. Please use show.legend instead.
3: fun.y is deprecated. Use fun instead.
4: show_guide has been deprecated. Please use show.legend instead.

dev.off()
null device
1

build_tree/infer.clonal.models unused argument error

Hello,
It appears that build_tree is calling infer.clonal.models with an argument called post.sample.col. However, this argument does not exist in infer.clonal.models, so the call generates an error. When I run a copy of build_tree with this argument commented out, my model(s) are successfully generated.

Best,
Noushin

models = infer.clonal.models(
    variants = clonevol_in,
    cluster.col.name = "cluster",
    ccf.col.names = sample_ids,
    #         sample.groups = sample.groups,
    post.sample.col = post.sample.col, #### THIS ARG IS NOT DEFINED 
    cancer.initiation.model = clonal.model,
    subclonal.test = 'bootstrap',
    subclonal.test.model = 'non-parametric',
    num.boots = 1000,
    founding.cluster = 1,
    cluster.center = 'mean',
    ignore.clusters = NULL,
    clone.colors = clone.colors,
    min.cluster.vaf = min.cluster.vaf,
    # min probability that CCF(clone) is non-negative
    sum.p = sum.p,
    random.seed = rand.seed,
    # alpha level in confidence interval estimate for CCF(clone)
    alpha = 0.05)
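
A generic way to guard against this kind of mismatch, sketched in base R: drop any argument the target function does not declare before calling it. `call.with.known.args` and `toy.infer` are hypothetical names used only for illustration; `toy.infer` merely stands in for a function that, like infer.clonal.models here, has no `post.sample.col` parameter.

```r
# Call `fun` with only those entries of `args` that `fun` actually declares.
call.with.known.args <- function(fun, args) {
  known <- names(formals(fun))
  # A function with `...` accepts anything, so pass everything through.
  if ("..." %in% known) return(do.call(fun, args))
  do.call(fun, args[names(args) %in% known])
}

# Hypothetical stand-in that does not accept post.sample.col.
toy.infer <- function(variants, cluster.col.name) {
  paste(cluster.col.name, nrow(variants))
}

args <- list(variants = data.frame(cluster = 1:3),
             cluster.col.name = "cluster",
             post.sample.col = "post")

# Calling directly fails with an "unused argument" error.
direct <- tryCatch(do.call(toy.infer, args), error = function(e) e)

# Filtering the argument list first lets the call go through.
res <- call.with.known.args(toy.infer, args)
print(res)
```

Silently dropping arguments can hide real mistakes, so commenting the argument out as described above is the more transparent fix. The helper is mainly useful when the same wrapper has to run against several package versions.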

combined p calculation number of rows do not agree

Error in infer.clonal.models function

Hello,

I get the following error when running infer.clonal.models(). Can you please help me fix this?

Finding consensus models across samples...
Found 1 consensus model(s)
Generating consensus clonal evolution trees across samples...
Error in aggregate.data.frame(mf[1L], mf[-1L], FUN = FUN, ...) :
no rows to aggregate

Here a sample of the data:
cluster gene Ascite11.vaf Spheres.vaf Xenograft_p2.vaf Xenograft_p3.vaf
1 1 GPR4 19.80 40.00 43.93 44.74
2 1 KRTAP24-1 33.33 22.58 41.28 48.48
3 1 MLANA 23.26 87.50 81.48 100.00
4 1 PDP2 37.50 27.27 87.23 100.00
5 1 PGAP1 0.00 44.44 58.82 64.29
6 1 SPIRE2 26.15 48.68 76.14 100.00
Xenograft_p5.vaf Xenograft_p6.vaf Ascite11 Spheres Xenograft_p2 Xenograft_p3
1 49.53 42.86 19.80 40.00 43.93 44.74
2 41.61 64.18 33.33 22.58 41.28 48.48
3 100.00 100.00 23.26 87.50 81.48 100.00
4 98.86 100.00 37.50 27.27 87.23 100.00
5 24.00 62.50 0.00 44.44 58.82 64.29
6 97.20 97.06 26.15 48.68 76.14 100.00
Xenograft_p5 Xenograft_p6
1 49.53 42.86
2 41.61 64.18
3 100.00 100.00
4 98.86 100.00
5 24.00 62.50
6 97.20 97.06

Here the code of infer.clonal.models I used:
y = infer.clonal.models(variants = data_asc11,
cluster.col.name = 'cluster',
vaf.col.names = vaf.col.names,
sample.groups = sample.groups,
cancer.initiation.model='monoclonal',
subclonal.test = 'bootstrap',
subclonal.test.model = 'non-parametric',
num.boots = 1000,
founding.cluster = 1,
cluster.center = 'mean',
ignore.clusters = NULL,
clone.colors = clone.colors,
min.cluster.vaf = 0.01,
sum.p = 0.05,
alpha = 0.05)

Thank you for your help!
Valentina

Merging clonal evolution trees across samples...error

Hello,
I am using clonevol for visualization and am running into the following error.

Finding matched clonal architecture models across samples...
Found  1 compatible model(s)
Merging clonal evolution trees across samples...
Error in data.frame(lab = v$lab, sample.with.cell.frac.ci = paste0(ifelse(v$is.founder,  : 
  arguments imply differing number of rows: 0, 1

I have attached the file I am using to run clonevol and have read it in as 'G.data'

here is the command that is throwing the error:

vaf.col.names <- grep (".vaf",colnames(G.data),value=TRUE)

x <- infer.clonal.models(variants=G.data,
cluster.col.name="cluster",
vaf.col.names=vaf.col.names,
subclonal.test = "bootstrap",
subclonal.test.model = "non-parametric", 
cluster.center = "mean",
num.boots = 1000, 
founding.cluster = 1,
min.cluster.vaf = 0.01,
p.value.cutoff = 0.05,
alpha = 1,
random.seed = 60000)

Thanks

G.input_clusters.txt

How to set founding.cluster?

Hi,
I have several questions on how to set the parameters for the infer.clonal.models function.
First, the help page does not explain all the arguments.
For example, how do I set founding.cluster?
The example on the GitHub page uses a cluster.center argument, but I do not see it in the help.

How many bootstrap iterations are usually enough? The default is 1000, and for around 300 mutations it has been running for several hours and has not finished yet.

Thanks for giving more information.

Best,
Tommy
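
One heuristic for choosing `founding.cluster`, sketched in base R on made-up data: look for the cluster whose mean VAF is the highest in every sample, on the assumption that the founding clone's variants are present in all tumor cells of all samples. The data frame and column names are hypothetical.

```r
# Hypothetical input: cluster assignment plus VAF (percent) in two samples.
x <- data.frame(
  cluster = c(1, 1, 2, 2, 3, 3),
  P.vaf = c(45, 50, 20, 25, 0, 1),
  R.vaf = c(48, 46, 0, 2, 30, 28)
)
vaf.cols <- c("P.vaf", "R.vaf")

# Mean VAF of each cluster in each sample.
centers <- aggregate(x[, vaf.cols], by = list(cluster = x$cluster), FUN = mean)

# For each sample, mark the cluster(s) with the highest mean VAF.
is.top <- sapply(vaf.cols, function(s) centers[[s]] == max(centers[[s]]))

# Candidate founding clusters: highest in every sample.
candidates <- centers$cluster[apply(is.top, 1, all)]
print(centers)
print(candidates)
```

If no cluster comes out on top in every sample, that is worth checking in the clustering before running the bootstrap, since the run time reported above makes trial and error expensive.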

variants from primary tumors and metastasis tumors without normal controls

Hello,
I have 10 pairs of samples. They are paired primary and metastatic tumors, but none of them have normal controls.
Can variants called by mutect2 or other SNP callers be used by SciClone and ClonEvol? The variants are a mixture of germline and somatic mutations.
I just want to learn about the tumor clones and their evolution.

Thank you very much.

Question about node label

Thank you for sharing such an amazing tool.
I'm studying the relationship between cirrhosis (N) and cholangiocarcinoma (T), and I'm using clonevol for this.
However, many nodes are labelled both N and T at the same time.
I want nodes that are only N or only T.
Why are these nodes both N and T?

Error in infer.clonal.models: No clonal models for sample

Dear Ha X. Dang,

Hello, I am trying to analyze clonal evolution using PyClone and ClonEvol.

I have two WES samples from one patient.

When I followed the manual, I could not infer clonal models.

Here is my final input file for ClonEvol (it is stored in pyCloneResultMeltDcastDf below).

clonevol_input.txt

This is the original outcome from PyClone

KRCMC01270.PyClone.loci_results.txt

Below is the code for utilizing ClonEvol
#########################################################################

library(data.table)
library(clonevol)
library(reshape2)
library(tidyr)

pyCloneResult <- fread("/Absolute path/KRCMC01270.PyClone.loci_results.txt")

#To change the data frame structure - [mutation_id - sample_id - cluster_id - cellular_prevalence - cellular_prevalence_std - variant_allele_frequency] -> [mutation_id - cluster_id - sample1.vaf - sample2.vaf - sample1.cellular_prevalence - sample2.cellular_prevalence - sample1.cellular_prevalence_std - sample2.cellular_prevalence_std]
#https://stackoverflow.com/questions/11608167/reshape-multiple-value-columns-to-wide-format

pyCloneResultMeltDf <- melt(pyCloneResultDf, id.vars=c("mutation_id", "cluster_id", "sample_id"))

pyCloneResultMeltDcastDf <- dcast(pyCloneResultMeltDf, cluster_id + mutation_id ~ sample_id + variable)

#We have to start cluster id from 1, thus adding +1 to each cluster id (based on the clonevol manual)

    pyCloneResultMeltDcastDf$cluster_id <- pyCloneResultMeltDcastDf$cluster_id + 1

#To shorten vaf column names: "_variant_allele_frequency" -> "_vaf", "_cellular_prevalence" -> "_ccf", "---sampld-WBC" -> ""
#https://stackoverflow.com/questions/28700987/data-table-setnames-combined-with-regex

    setnames(pyCloneResultMeltDcastDf, names(pyCloneResultMeltDcastDf), gsub("_variant_allele_frequency", "_vaf", names(pyCloneResultMeltDcastDf)))
    setnames(pyCloneResultMeltDcastDf, names(pyCloneResultMeltDcastDf), gsub("_cellular_prevalence", "_ccf", names(pyCloneResultMeltDcastDf)))

#To remove the normal information ([Tumor---Normal_vaf] -> [Tumor_vaf]
setnames(pyCloneResultMeltDcastDf, names(pyCloneResultMeltDcastDf), gsub("---\\S+-\\S+", "", names(pyCloneResultMeltDcastDf)))

#To change the - (minus) into _ (underbar)
setnames(pyCloneResultMeltDcastDf, names(pyCloneResultMeltDcastDf), gsub("-", "_", names(pyCloneResultMeltDcastDf)))

    vaf.col.names <- grep('_vaf', colnames(pyCloneResultMeltDcastDf), value=T)
    ccf.col.names <- grep('_ccf$', colnames(pyCloneResultMeltDcastDf), value=T)
    sample.names <- gsub('_vaf', '', vaf.col.names)

#We utilize sample names as vaf columns (multiply 100 to utilize %)

    pyCloneResultMeltDcastDf[, sample.names] <- pyCloneResultMeltDcastDf[, vaf.col.names] * 100
    vaf.col.names <- sample.names

#We multiply 100 to ccf column (from proportion to percentage)
pyCloneResultMeltDcastDf[, ccf.col.names] <- pyCloneResultMeltDcastDf[, ccf.col.names] * 100

    # prepare sample grouping
    #sample.groups <-sample.names
    sample.groups <- c("C", "M")
    names(sample.groups) <- sample.names

    # setup the order of clusters to display in various plots (later)
    pyCloneResultMeltDcastDf <- pyCloneResultMeltDcastDf[order(pyCloneResultMeltDcastDf$cluster_id),]

   # To make a column which is corresponding to is.driver -> utilize CGC (cancer gene census genes) as a driver gene

# Load CGC genes

cgc.file <- file.path("/BiO/Share/Database/COSMIC/grch37/v90/cancer_gene_census.csv")
cgc.df = read.csv(cgc.file, as.is = T)
cgc.genes = unique(cgc.df$Gene.Symbol)

    pyCloneResultMeltDcastDf$CGC <- sapply(strsplit(pyCloneResultMeltDcastDf$mutation_id, "_"), function(x) x[1]) %in% cgc.genes

    #Choosing colors for the clones
    clone.colors <- NULL

#Visualizing the variant clusters
outputFile <- gsub(pattern="loci_results.txt", replacement="loci_results_jitter.pdf", x = pyCloneResult)

    pdf(outputFile, width = 3, height = 3, useDingbats = FALSE, title='')
    pp <- plot.variant.clusters(pyCloneResultMeltDcastDf,
                                cluster.col.name = 'cluster',
                                show.cluster.size = FALSE,
                                cluster.size.text.color = 'blue',
                                vaf.col.names = vaf.col.names,
                                vaf.limits = 70,
                                sample.title.size = 10,
                                violin = FALSE,
                                box = FALSE,
                                jitter = TRUE,
                                jitter.shape = 1,
                                jitter.color = clone.colors,
                                jitter.size = 2,
                                jitter.alpha = 1,
                                jitter.center.method = 'median',
                                jitter.center.size = 1,
                                jitter.center.color = 'darkgray',
                                jitter.center.display.value = 'none',
                                highlight = 'is.driver',
                                highlight.shape = 21,
                                highlight.color = 'blue',
                                highlight.fill.color = 'green',
                                highlight.note.col.name = 'mutation_id',
                                highlight.note.size = 2,
                                order.by.total.vaf = FALSE)
    dev.off()

#>> Here is the result
KRCMC01270.PyClone.loci_results_jitter.pdf

    #Plotting mean/median of clusters across samples (cluster flow)
    plot.cluster.flow(pyCloneResultMeltDcastDf, vaf.col.names = vaf.col.names,
                      sample.names = sample.names,
                      colors = clone.colors)

Here is the result.

########################################################################
#Inferring clonal evolution trees
y = infer.clonal.models(variants = pyCloneResultMeltDcastDf,
cluster.col.name = 'cluster',
#vaf.col.names = vaf.col.names,
ccf.col.names = ccf.col.names,
sample.groups = sample.groups,
cancer.initiation.model='monoclonal',
subclonal.test = 'bootstrap',
subclonal.test.model = 'non-parametric',
num.boots = 1000,
founding.cluster = 1,
cluster.center = 'mean',
ignore.clusters = NULL,
clone.colors = clone.colors,
min.cluster.vaf = 0.01,
# min probability that CCF(clone) is non-negative
sum.p = 0.05,
# alpha level in confidence interval estimate for CCF(clone)
alpha = 0.05)

########################################################################
###Following is the error messages

Calculate VAF as CCF/2
Sample 1: KRCMC01270_T1_D_ccf <-- KRCMC01270_T1_D_ccf
Sample 2: KRCMC01270_T2_D_ccf <-- KRCMC01270_T2_D_ccf
Using monoclonal model
Note: all VAFs were divided by 100 to convert from percentage to proportion.
Generating non-parametric boostrap samples...
KRCMC01270_T1_D_ccf : Enumerating clonal architectures...
Determining if cluster VAF is significantly positive...
Exluding clusters whose VAF < min.cluster.vaf=0.01
Non-positive VAF clusters:
KRCMC01270_T1_D_ccf : 0 clonal architecture model(s) found

lab vaf color parent ancestors occupied free free.mean
4 4 0.4168754 #cab2d6 NA - 0 0.4168754 NA
5 5 0.3003359 #ff99ff NA - 0 0.3003359 NA
3 3 0.2887949 #b2df8a NA - 0 0.2887949 NA
9 9 0.2780810 #cf8d30 NA - 0 0.2780810 NA
6 6 0.2759430 #fdbf6f NA - 0 0.2759430 NA
2 2 0.2343575 #a6cee3 NA - 0 0.2343575 NA
8 8 0.2068802 #bbbb77 NA - 0 0.2068802 NA
7 7 0.1714719 #fb9a99 NA - 0 0.1714719 NA
1 1 0.1211232 #cccccc NA - 0 0.1211232 NA
free.lower free.upper free.confident.level free.confident.level.non.negative
4 NA NA NA NA
5 NA NA NA NA
3 NA NA NA NA
9 NA NA NA NA
6 NA NA NA NA
2 NA NA NA NA
8 NA NA NA NA
7 NA NA NA NA
1 NA NA NA NA
p.value num.subclones excluded
4 NA 0 FALSE
5 NA 0 FALSE
3 NA 0 FALSE
9 NA 0 FALSE
6 NA 0 FALSE
2 NA 0 FALSE
8 NA 0 FALSE
7 NA 0 FALSE
1 NA 0 FALSE
ERROR: No clonal models for sample: KRCMC01270_T1_D_ccf
Check data or remove this sample, then re-run.

Also check if founding.cluster was set correctly!

Could you give me any idea how to solve this problem?

I think PyClone result is not very good because most variants are in cluster 1

Thank you in advance for your time

Sincerely,

Seung-hoon
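
The table printed before the error already hints at the cause. A minimal check on the VAF column copied from that table, assuming the usual nesting constraint that a subclone cannot have a higher cellular prevalence than the clone it descends from:

```r
# VAF per cluster for sample KRCMC01270_T1_D_ccf, copied from the log above.
vaf <- c(`4` = 0.4168754, `5` = 0.3003359, `3` = 0.2887949, `9` = 0.2780810,
         `6` = 0.2759430, `2` = 0.2343575, `8` = 0.2068802, `7` = 0.1714719,
         `1` = 0.1211232)

# The call above used founding.cluster = 1.
founding <- "1"

# Clusters that exceed the founding cluster and so cannot nest inside it.
exceeds <- names(vaf)[vaf > vaf[founding]]

# Cluster with the highest VAF in this sample.
highest <- names(vaf)[which.max(vaf)]

print(exceeds)
print(highest)
```

Every other cluster exceeds cluster 1 in this sample, which would be consistent with "0 clonal architecture model(s) found" and with the hint in the error message about `founding.cluster`. Cluster 4 has the highest VAF here, although the second sample would need the same check before renumbering anything.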
