Hi, I run BSmooth() separately, but when I combine the …

combine(sample1.smooth, sample2.smooth) gives unsmoothed results? (bsseq)

sahuno commented on July 18, 2024

combine(sample1.smooth, sample2.smooth) gives unsmoothed results?

Comments (4)

PeteHaitch commented on July 18, 2024

Hi @sahuno,

It's not well-documented, but you can only combine 2 BSseq objects that have been smoothed if they have the exact same set of loci (i.e. having the same set of chromosomes isn't enough).
That's what the warning message "Combining BSseq objects with different loci. You will need to re-smooth using 'BSmooth()' on the returned object." is telling you.

The reason for this is that downstream functionality assumes that all samples have been smoothed over the same set of loci.

So the solution is to first combine and then smooth.
The only way you can smooth and then combine is if all objects have the same loci.
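
The combine-then-smooth order described above can be sketched as follows. This is a minimal illustration on simulated data, not the poster's pipeline: the helper make_sample(), the sample names, and all numbers are made up for the example, and SerialParam() is used only to keep the sketch simple.

```r
library(bsseq)
library(BiocParallel)

set.seed(1)
# Hypothetical toy data: 2000 candidate loci on one chromosome;
# each simulated sample "observes" a random subset of 1500 of them.
all_pos <- seq(1000L, by = 50L, length.out = 2000)
make_sample <- function(name) {
  pos <- sort(sample(all_pos, 1500))
  cov <- matrix(rpois(length(pos), 10) + 1L, ncol = 1)
  m   <- matrix(rbinom(length(pos), cov, 0.7), ncol = 1)
  BSseq(chr = rep("chr1", length(pos)), pos = pos,
        M = m, Cov = cov, sampleNames = name)
}
tumor  <- make_sample("tumor")
normal <- make_sample("normal")

# Combine first (the result covers the union of loci), then smooth once.
both     <- combine(tumor, normal)
both_fit <- BSmooth(both, BPPARAM = SerialParam(), verbose = FALSE)
hasBeenSmoothed(both_fit)
```

Because smoothing happens after combining, every sample is smoothed over the same set of loci, which is what the downstream functions expect.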

kasperdanielhansen commented on July 18, 2024

So if you want to smooth in separate processes, you should combine first and then split. When you combine samples, you typically have a set of CpGs which are "observed" in sample A but not in sample B, and this combination step adds those CpGs with a coverage of 0 to sample B.
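
A small sketch of what that combination step does, using two made-up single-sample objects (bs_a and bs_b are hypothetical names, and the counts are invented for illustration). The intent is to show that the combined object holds the union of loci, and that splitting it again yields per-sample objects sharing the same loci:

```r
library(bsseq)

# Two toy samples with partially overlapping loci (invented data).
bs_a <- BSseq(chr = rep("chr1", 3), pos = c(100L, 200L, 300L),
              M = matrix(c(1L, 2L, 3L), ncol = 1),
              Cov = matrix(c(2L, 4L, 6L), ncol = 1),
              sampleNames = "A")
bs_b <- BSseq(chr = rep("chr1", 3), pos = c(200L, 300L, 400L),
              M = matrix(c(1L, 1L, 1L), ncol = 1),
              Cov = matrix(c(3L, 3L, 3L), ncol = 1),
              sampleNames = "B")

# Combine: loci seen in only one sample are added to the other with coverage 0.
bs_ab <- combine(bs_a, bs_b)
nrow(bs_ab)
getCoverage(bs_ab, type = "Cov")

# Split again: each piece keeps every locus of the combined object,
# so the pieces can be smoothed in separate processes and combined afterwards.
split_a <- bs_ab[, 1]
split_b <- bs_ab[, 2]
all(start(granges(split_a)) == start(granges(split_b)))
```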

-- Best, Kasper

sahuno commented on July 18, 2024

Hi @PeteHaitch @kasperdanielhansen, thanks for the feedback!
@kasperdanielhansen I think I already followed your suggestion, using another approach:
a. Read both files with read.bismark()
b. Split and save the individual samples as RDS files
c. Smooth each sample separately
d. Combine the smoothed files

# Read Bismark coverage files from disk and set annotations
library(bsseq)
bismark_bsseq <- read.bismark(files = sample_metadata$File,
                      colData = DataFrame(row.names = sample_metadata$Sample, Type = sample_metadata$Group, Pair = c("Pair1", "Pair1")),
                      rmZeroCov = FALSE,
                      strandCollapse = FALSE,
                      verbose = TRUE)

# Split by sample for parallelization
BS1 <- bismark_bsseq[, 1]
saveRDS(BS1, file = paste0(wrkDir, "/SA123T.rds"))

BS2 <- bismark_bsseq[, 2]
saveRDS(BS2, file = paste0(wrkDir, "/SA123N.rds"))

##run bsmooth separate on cluster using snakemake
library(bsseq)
library(BiocParallel)
bs_unsmoothed <- readRDS(opt$input_file)
bismark_bsseq.fit <- BSmooth(
    BSseq = bs_unsmoothed,
    BPPARAM = MulticoreParam(workers = 12),
    verbose = TRUE)

## load and combine; in a new R session
SA123N.fitted <- readRDS("/dir/scripts/bsmooth_snakemake/results/bsmooth_fit/SA123N/SA123N.fitted.rds")
SA123T.fitted <- readRDS("/dir/scripts/bsmooth_snakemake/results/bsmooth_fit/SA123T/SA123T.fitted.rds")
BS.cancer.ex.fit <- combine(SA123N.fitted, SA123T.fitted)

So my confusion is: does splitting a BSseq object automatically remove CpGs with zero coverage from the resulting file, or does BSmooth() do that?

kasperdanielhansen commented on July 18, 2024

Superficially glancing at your code, I believe it should work. You could try to look at the dimension (number of CpGs) in the different objects. The code following ##run bsmooth separate on cluster using snakemake makes it impossible to see what files are actually being loaded. But I would expect BS1 and BS2 to have the same number of CpGs.
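
One way to run that check is sketched below on a made-up two-sample object. The helper same_loci() is hypothetical (it is not part of bsseq); in practice it would be applied to the objects actually loaded with readRDS() before calling combine().

```r
library(bsseq)

# Hypothetical helper: TRUE if two BSseq objects cover exactly the same loci.
same_loci <- function(x, y) {
  gx <- granges(x)
  gy <- granges(y)
  length(gx) == length(gy) &&
    all(as.character(seqnames(gx)) == as.character(seqnames(gy))) &&
    all(start(gx) == start(gy))
}

# Toy two-sample object; sample s2 has zero coverage at the first locus.
bs <- BSseq(chr = rep("chr1", 3), pos = c(100L, 200L, 300L),
            M = matrix(c(1L, 2L, 3L, 0L, 1L, 2L), ncol = 2),
            Cov = matrix(c(2L, 4L, 6L, 0L, 3L, 3L), ncol = 2),
            sampleNames = c("s1", "s2"))

# Column subsetting selects samples and leaves the rows (loci) untouched.
s1 <- bs[, 1]
s2 <- bs[, 2]
dim(s1)
dim(s2)
same_loci(s1, s2)
```

If the same check returns FALSE for the smoothed objects, the files being loaded probably do not come from the same split, or loci were dropped somewhere between splitting and combining.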

-- Best, Kasper
