calebclass / nanotube

Easy NanoString data analysis

License: GNU General Public License v3.0

R 2.28% TeX 0.27% HTML 97.46%

nanotube's Issues

pc.scalefactors are not identical to results from positiveQC

I am a bit confused about the positive control steps.

With this code:

dta <- processNanostringData(nsFiles       = path_data,
                             sampleTab     = path_meta,
                             normalization = "nSolver",
                             idCol         = "RCC_Name",
                             output.format = "list")

I get this warning:

Identified positive scale factor outside recommended range
                    (0.3-3).
Check samples prior to conducting analysis.

And indeed, dta$pc.scalefactors contains some small values.

However, when I run positiveQC, I get different scale factors.

posQC <- positiveQC(dta)

I expected them to be identical. Any insights are appreciated. Thanks!

Update:
The R code suggests that the scale factors in positiveQC are calculated on the log scale, while those in processNanostringData are calculated on the raw scale. Is that right? Given the recommended range (0.3-3), which approach is better?
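
To illustrate why a raw-scale and a log-scale calculation can report different numbers for the same samples, here is a minimal base R sketch. It does not reproduce NanoTube's internals: the counts are made up, and the "mean of geometric means divided by each sample's geometric mean" formula is an assumption about the nSolver-style approach, not code taken from the package.

```r
# Toy positive-control counts (rows = probes, columns = samples); made-up numbers.
pos <- matrix(c(32000, 8000, 2000, 500, 125, 30,
                16000, 4100, 1000, 240,  60, 16,
                64500, 15900, 4050, 1010, 255, 61),
              nrow = 6,
              dimnames = list(paste0("POS_", LETTERS[1:6]), c("S1", "S2", "S3")))

geo_mean <- function(x) exp(mean(log(x)))

# Raw-scale factor (assumed nSolver-style): reference / per-sample geometric mean.
gm     <- apply(pos, 2, geo_mean)
sf_raw <- mean(gm) / gm

# The same comparison on the log2 scale is an additive offset, centred on 0
# rather than a ratio centred on 1.
sf_log2 <- log2(mean(gm)) - log2(gm)

print(round(sf_raw, 3))
print(round(sf_log2, 3))

# The recommended raw range 0.3-3 corresponds to these log2 offsets.
print(round(log2(c(0.3, 3)), 3))
```

Under these assumptions the two versions carry the same information (2^sf_log2 equals sf_raw), so a threshold such as 0.3-3 only makes sense when applied to the raw-scale ratio or after back-transforming.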

Request: Allow for RCC file names as input to processNanostringData

processNanostringData doesn't currently accept RCC file names as input (it accepts directories, .txt files, .csv files, etc.). A simple edit to the function adds this flexibility:

		colnames(dat$dict) <- gsub("\\.| ", "", colnames(dat$dict))
	}
	else if (file.extension %in% c(".RCC", ".rcc")) {
		fileNames <- nsFiles[ grep("\\.rcc$", tolower(nsFiles)) ]
		cat("\nReading in .RCC files......", file = logfile, 
			append = TRUE)
		dat <- read_merge_rcc(fileNames, includeQC, logfile)
	}
	else {
		if (file.extension %in% c(".zip", ".ZIP")) {
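
The file selection step in the proposed branch can be tried on its own. A minimal sketch with made-up file names (the paths are hypothetical; only the `grep` expression comes from the snippet above):

```r
# Hypothetical mixed input: two RCC files plus an unrelated file.
nsFiles <- c("run1/sample_01.RCC", "run1/sample_02.rcc", "run1/notes.txt")

# Case-insensitive match on the extension, as in the proposed branch.
fileNames <- nsFiles[grep("\\.rcc$", tolower(nsFiles))]
print(fileNames)
```

Lower-casing before matching means both `.RCC` and `.rcc` are kept, while anything else in the vector is silently dropped.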

Pass additional parameters to `limma::lmFit` in `runLimmaAnalysis`

I've been working with more complex experimental designs that use the "block" and "correlation" arguments to limma::lmFit. This is easily accommodated by passing optional arguments through with ...:


runLimmaAnalysis <- function (dat, groups = NULL, base.group = NULL, design = NULL, ...)
#...
 	fit <- limma::lmFit(dat.limma, design=design, ...)
#...
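
As a rough illustration of what the pass-through enables, the sketch below forwards `block` and `correlation` to `limma::lmFit` through `...`. The wrapper `fit_with_dots` is a hypothetical stand-in for the patched function, not NanoTube code, and the expression matrix is simulated. The example only runs if the limma and statmod packages are installed.

```r
# Hypothetical stand-in for the patched function: extra arguments go to lmFit.
fit_with_dots <- function(expr, design, ...) {
  limma::lmFit(expr, design = design, ...)
}

if (requireNamespace("limma", quietly = TRUE) &&
    requireNamespace("statmod", quietly = TRUE)) {
  set.seed(1)
  # Simulated log2 expression: 50 genes x 8 samples (made-up data).
  expr <- matrix(rnorm(50 * 8, mean = 8), nrow = 50,
                 dimnames = list(paste0("gene", 1:50), paste0("s", 1:8)))
  group   <- factor(rep(c("ctrl", "trt"), times = 4))
  subject <- rep(1:4, each = 2)   # two samples per subject
  design  <- model.matrix(~ group)

  # Estimate the within-subject correlation, then pass it through `...`.
  dc  <- limma::duplicateCorrelation(expr, design, block = subject)
  fit <- fit_with_dots(expr, design,
                       block = subject,
                       correlation = dc$consensus.correlation)
  fit <- limma::eBayes(fit)
  print(limma::topTable(fit, coef = 2, number = 3))
}
```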

Running limma analysis

Hi Caleb,

I'm trying to run a limma analysis on two conditions in a CSV file, each with 3 replicates, but I'm unsure how to do this. I'm not sure how the data is laid out in the examples and was wondering if I could get some help.

This is my current code:

dat <- processNanostringData(nsFiles = "/Users/georgesmith/Documents/INS Year 4/OND2.csv",
                             replicateCol = c('ON_D2_1', 'ON_D2_2', 'ON_D2_3'), 
                             groupCol = "CodeClass",
                             idCol = "Name",
                             normalization = "nSolver",
                             housekeeping = c("PGK1", "OAZ1", "TBP", "POLR2A", 
                                              "ABCF1", "SDHA", "NRDE2", "PPIA",
                                              "UBB", "STK11IP","G6PD","TBC1D10B"),
                             bgType = "t.test", bgPVal = 0.01,
                             output.format = "ExpressionSet"
                             )

My data is laid out like this as a csv:

Any help would be massively appreciated thank you!

Cheers
George
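
For a two-condition comparison, limma needs to know which sample belongs to which condition. In NanoString data, CodeClass normally describes probes (Endogenous, Positive, and so on) rather than samples, so it is unlikely to work as a grouping column. The base R sketch below shows the general shape of a per-sample table and the resulting design matrix. It is only an illustration: the column names `Sample` and `Group` and the `COND2_*` sample names are placeholders, and it does not call NanoTube.

```r
# Hypothetical sample table: one row per sample, one column naming the group.
# "ON_D2_*" comes from the post; the second condition's names are placeholders.
samples <- data.frame(
  Sample = c("ON_D2_1", "ON_D2_2", "ON_D2_3", "COND2_1", "COND2_2", "COND2_3"),
  Group  = rep(c("ON_D2", "COND2"), each = 3),
  stringsAsFactors = FALSE
)

# Put the reference condition first so the coefficient reads "ON_D2 vs COND2".
group  <- factor(samples$Group, levels = c("COND2", "ON_D2"))
design <- model.matrix(~ group)
print(design)
```

A table like this, saved as a CSV, is the kind of per-sample metadata that a grouping column would normally point at.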

Missing value where TRUE/FALSE needed

Hey there! I am trying to run a NanoTube analysis on some NanoString data in CSV format and am running into an issue. I'm pretty new to this, so apologies if I'm being naive, but when running this code:

dat <- processNanostringData(nsFiles = "/Users/georgesmith/Documents/INS Year 4/OND2.csv",
                             replicateCol = c('ON_D2_1', 'ON_D2_2', 'ON_D2_3'), 
                             normalization = "nSolver",
                             housekeeping = c("PGK1", "OAZ1", "TBP", "POLR2A", 
                                              "ABCF1", "SDHA", "NRDE2", "PPIA",
                                              "UBB", "STK11IP","G6PD","TBC1D10B"),
                             bgType = "t.test", bgPVal = 0.01,
                             output.format = "ExpressionSet"
                             )

I didn't use sample_data because when I tried to load my CSV file into it, I got an empty value. When I run this, I get the error:

Loading count data......
Averaging technical replicates.....
Calculating positive scale factors......Error in if (sum(dat$dict$CodeClass == "Positive") == 0) { :
missing value where TRUE/FALSE needed
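
The message itself is standard R behaviour: `if` fails when its condition is NA, and `sum()` over a comparison returns NA as soon as one compared value is NA. The sketch below reproduces this with a made-up vector. It only shows the mechanism and assumes, without having seen the CSV, that the CodeClass values read from the file contain missing entries.

```r
# Made-up CodeClass values with one missing entry.
code_class <- c("Endogenous", "Positive", NA, "Housekeeping")

# Same shape of test as in the error message; the NA propagates into `if`.
res <- tryCatch(
  if (sum(code_class == "Positive") == 0) "no positives" else "has positives",
  error = function(e) conditionMessage(e)
)
print(res)

# Quick checks one could run on the imported table's CodeClass column.
print(anyNA(code_class))                 # TRUE: some rows have no CodeClass
print(table(code_class, useNA = "ifany"))

# Ignoring NAs makes the test well defined again.
print(sum(code_class == "Positive", na.rm = TRUE) == 0)
```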

merging dat objects

Hi,

Is there a way to merge different dat objects that were read in with processNanostringData?

Cheers

Additional data format support

Hi @calebclass,

Thanks for the nice response! This issue can be closed.
Just one suggestion for an enhancement: would it be possible to extend the supported input formats to include xls files, in addition to the current support for .txt and .csv? This could streamline some workflows. Thank you!

Best,
T

Originally posted by @TdzBAS in #8 (comment)

reading in rcc files from multiple batches

Hi @calebclass,

I have three NanoString datasets that were generated with the same protocol, so I have different batches. Should I read in each dataset separately, or can I read all the RCC files together with processNanostringData? What effect does this have on the subsequent QC?
Reading them in separately would require merging them afterwards. IMHO, reading everything at once with a single meta file seems most convenient, but I don't know if this is the right way.

Best,
T

Request: option to keep control genes in 'runLimmaAnalysis'

First, thanks for making this package available!

For some QC applications (e.g. MA plots) and comparisons to other methods, it can be useful to see what happens to the control genes (positive, negative, housekeeping) during differential expression analysis. Currently these genes are removed in runLimmaAnalysis:

NanoTube/R/runLimmaAnalysis.R

Lines 51 to 54 in 026563d

	dat.limma <- dat[grep("endogenous", fData(dat)$CodeClass, ignore.case = TRUE),]
	rownames(dat.limma) <- fData(dat)$Name[grep("endogenous",
	                                            fData(dat)$CodeClass,
	                                            ignore.case = TRUE)]

Would it be possible to add an option to runLimmaAnalysis to keep the control genes (and process them like the rest)?
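
One possible shape for such an option is sketched below in base R. It uses a plain data frame in place of `fData(dat)`, and both the helper `select_rows` and the argument name `keep.controls` are hypothetical, not part of NanoTube.

```r
# Toy feature table standing in for fData(dat); values are made up.
feat <- data.frame(
  Name      = c("ACTB", "GAPDH", "POS_A", "NEG_A", "TP53"),
  CodeClass = c("Housekeeping", "Housekeeping", "Positive", "Negative",
                "Endogenous"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: choose which rows go into the limma fit.
select_rows <- function(feat, keep.controls = FALSE) {
  if (keep.controls) {
    seq_len(nrow(feat))   # keep every probe, controls included
  } else {
    grep("endogenous", feat$CodeClass, ignore.case = TRUE)   # current filter
  }
}

print(feat$Name[select_rows(feat)])                        # current behaviour
print(feat$Name[select_rows(feat, keep.controls = TRUE)])  # requested option
```

Defaulting the flag to FALSE would leave existing analyses unchanged.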
