
nanotube's People

Contributors

calebclass, jwokaty, nturaga

nanotube's Issues

pc.scalefactors are not identical to results from positiveQC

I am a bit confused about the positive control steps.

In this code:

dta <- processNanostringData(nsFiles       = path_data,
                             sampleTab     = path_meta,
                             normalization = "nSolver",
                             idCol         = "RCC_Name",
                             output.format = "list")

It gives the following warning:

Identified positive scale factor outside recommended range
                    (0.3-3).
Check samples prior to conducting analysis.

And indeed, dta$pc.scalefactors contains some small values.

However, when I run positiveQC, I get different scale factors.

posQC <- positiveQC(dta)

I expected them to be identical. Any insights are appreciated. Thanks!

Update:
From the R code, it looks like the scale factors in positiveQC are calculated on the log scale, while the scale factors in processNanostringData are calculated on the raw scale. Is that right? Regarding the recommended range (0.3-3), which approach is better?
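
The difference raised in the update can be illustrated with a toy example. The sketch below uses made-up positive-control counts and two hypothetical definitions of a per-sample scale factor: one from arithmetic means of raw counts, one from means of log2 counts (equivalent to a geometric mean). It is not taken from the NanoTube source, and which definition each function actually uses is an assumption based on the reading above.

```r
# Made-up positive-control counts: rows are controls, columns are samples.
pos <- matrix(c(32000, 8000, 2000, 500,
                16000, 4500,  900, 300,
                40000, 9000, 2600, 450),
              nrow = 4,
              dimnames = list(paste0("POS_", LETTERS[1:4]),
                              c("s1", "s2", "s3")))

# Raw scale: arithmetic mean per sample, relative to the mean across samples.
raw_means <- colMeans(pos)
sf_raw <- mean(raw_means) / raw_means

# Log scale: mean of log2 counts per sample (a geometric mean on the raw scale),
# relative to the mean across samples, then back-transformed.
log_means <- colMeans(log2(pos))
sf_log <- 2^(mean(log_means) - log_means)

# The two definitions agree in direction but not in value.
round(rbind(raw = sf_raw, log = sf_log), 3)
```

Because the arithmetic mean is dominated by the highest-count control, the two definitions will generally give different numbers for the same sample, so a sample can fall inside the 0.3-3 range under one definition and outside it under the other.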

Request: Allow for RCC file names as input to processNanostringData

processNanostringData doesn't currently allow RCC file names to be supplied as input (it accepts file directories, .txt files, .csv files, etc.). A simple edit to the function allows for this extra flexibility:

		colnames(dat$dict) <- gsub("\\.| ", "", colnames(dat$dict))
	}
	else if (file.extension %in% c(".RCC", ".rcc")) {
		fileNames <- nsFiles[ grep("\\.rcc$", tolower(nsFiles)) ]
		cat("\nReading in .RCC files......", file = logfile, 
			append = TRUE)
		dat <- read_merge_rcc(fileNames, includeQC, logfile)
	}
	else {
		if (file.extension %in% c(".zip", ".ZIP")) {
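
The file-selection step in this patch can be tried in isolation. The following is a minimal base-R sketch; `select_rcc_files` is a hypothetical helper, not part of NanoTube, and the paths are invented. It only shows how a character vector of paths can be narrowed to RCC files regardless of the case of the extension.

```r
# Hypothetical helper: keep only paths ending in .rcc (case-insensitive).
select_rcc_files <- function(paths) {
  paths[grepl("\\.rcc$", paths, ignore.case = TRUE)]
}

# Invented example paths; no files are read here.
files <- c("run1/sample_01.RCC", "run1/sample_02.rcc", "run1/notes.txt")
rcc_files <- select_rcc_files(files)
rcc_files
```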

Pass additional parameters to `limma::lmFit` in `runLimmaAnalysis`

I've been doing some more complex experiment designs, using "block" and "correlation" arguments to limma::lmFit. This is easily accommodated by allowing the pass-through of optional arguments using ...:


runLimmaAnalysis <- function (dat, groups = NULL, base.group = NULL, design = NULL, ...)
#...
 	fit <- limma::lmFit(dat.limma, design=design, ...)
#...
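
To show how the `...` pass-through behaves without needing limma installed, here is a small base-R sketch. `fit_inner` is a made-up stand-in for limma::lmFit and `run_analysis` is a made-up stand-in for runLimmaAnalysis. Only the argument names `block` and `correlation` come from the request above.

```r
# Stand-in for limma::lmFit, used only so the sketch runs without limma.
# It simply records the optional arguments it received.
fit_inner <- function(x, design, block = NULL, correlation = NULL) {
  list(n = nrow(x), block = block, correlation = correlation)
}

# Wrapper that forwards any extra arguments unchanged via `...`.
run_analysis <- function(dat, design = NULL, ...) {
  fit_inner(dat, design = design, ...)
}

x <- matrix(rnorm(12), nrow = 4)                    # 4 features, 3 samples
design <- cbind(Intercept = 1, Group = c(0, 1, 1))  # one row per sample
res <- run_analysis(x, design, block = c(1, 1, 2), correlation = 0.4)
str(res)
```

Calls that do not supply the extra arguments keep working as before, since the inner function's defaults apply when `...` is empty.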

Running limma analysis

Hi Caleb,

I'm trying to run a limma analysis on two conditions within a CSV file, each with 3 repeats, but I am unsure how to do this. I'm not sure how the data is laid out in the examples and was wondering if I could have some help.

This is my current code:

dat <- processNanostringData(nsFiles = "/Users/georgesmith/Documents/INS Year 4/OND2.csv",
                             replicateCol = c('ON_D2_1', 'ON_D2_2', 'ON_D2_3'), 
                             groupCol = "CodeClass",
                             idCol = "Name",
                             normalization = "nSolver",
                             housekeeping = c("PGK1", "OAZ1", "TBP", "POLR2A", 
                                              "ABCF1", "SDHA", "NRDE2", "PPIA",
                                              "UBB", "STK11IP","G6PD","TBC1D10B"),
                             bgType = "t.test", bgPVal = 0.01,
                             output.format = "ExpressionSet"
                             )

My data is laid out like this as a csv:

Codeclass | Accession | Name | Condition 1 Repeat 1 | Condition 1 Repeat 2 | Condition 1 Repeat 3 | Condition 2 Repeat 1 | Condition 2 Repeat 2 | Condition 2 Repeat 3

Any help would be massively appreciated, thank you!

Cheers
George
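
For a layout like the one described, one possible starting point is a separate sample table with one row per count column. The sketch below builds such a table in base R. The sample and group names are invented. The assumption here is that `idCol` and `groupCol` name columns of the sample table, with the `idCol` values matching the count column names, and this has not been checked against this particular file.

```r
# Invented names for the six count columns that follow the annotation columns.
sample_ids <- c("Cond1_Rep1", "Cond1_Rep2", "Cond1_Rep3",
                "Cond2_Rep1", "Cond2_Rep2", "Cond2_Rep3")

# One row per sample: an identifier and the condition it belongs to.
sample_tab <- data.frame(
  Sample = sample_ids,
  Group  = rep(c("Condition1", "Condition2"), each = 3)
)

# Write the table to a temporary CSV so it can be supplied as a file path.
tab_path <- file.path(tempdir(), "sample_table.csv")
write.csv(sample_tab, tab_path, row.names = FALSE)
sample_tab
```

Under that assumption, `tab_path` would be supplied as `sampleTab`, with `idCol = "Sample"` and `groupCol = "Group"`.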

Missing value where TRUE/FALSE needed

Hey there! I am trying to run a NanoTube analysis on some NanoString data in CSV format and am running into an issue. I'm pretty new to this, so apologies if I'm being naive, but when running this code:

dat <- processNanostringData(nsFiles = "/Users/georgesmith/Documents/INS Year 4/OND2.csv",
                             replicateCol = c('ON_D2_1', 'ON_D2_2', 'ON_D2_3'), 
                             normalization = "nSolver",
                             housekeeping = c("PGK1", "OAZ1", "TBP", "POLR2A", 
                                              "ABCF1", "SDHA", "NRDE2", "PPIA",
                                              "UBB", "STK11IP","G6PD","TBC1D10B"),
                             bgType = "t.test", bgPVal = 0.01,
                             output.format = "ExpressionSet"
                             )

I didn't use sample_data because when I tried to load my CSV file into it, I got an empty value. When I run this, I get the error:

Loading count data......
Averaging technical replicates.....
Calculating positive scale factors......Error in if (sum(dat$dict$CodeClass == "Positive") == 0) { :
missing value where TRUE/FALSE needed
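
The error message itself is ordinary R behaviour and can be reproduced without NanoTube. The sketch below assumes the CodeClass column contains at least one NA (for example from a blank cell). Whether that is what happened with this particular file is a guess.

```r
# A CodeClass column with one missing entry (assumed, e.g. from a blank cell).
code_class <- c("Positive", "Negative", NA, "Endogenous")

# Any NA in the comparison makes the sum NA, so `n_pos == 0` is NA as well.
n_pos <- sum(code_class == "Positive")
is.na(n_pos == 0)

# An NA condition in `if` is an error; capture its message instead of stopping.
msg <- tryCatch(if (n_pos == 0) "none found",
                error = function(e) conditionMessage(e))
msg

# Dropping NAs when counting gives a usable number.
n_pos_ok <- sum(code_class == "Positive", na.rm = TRUE)
n_pos_ok
```

Checking `table(code_class, useNA = "ifany")` on the real input is a quick way to see whether missing CodeClass values are present.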

merging dat objects

Hi,

Is there a way to merge different dat objects that were read in with processNanostringData?

Cheers

Additional data format support

Hi @calebclass,

thanks for this nice response! This issue can be closed.
Just one suggestion for enhancement: Would it be possible to extend the input data format to include xls files, in addition to the current support for .txt and .csv? This could potentially streamline certain processes. Thank you!

Best,
T

Originally posted by @TdzBAS in #8 (comment)

reading in rcc files from multiple batches

Hi @calebclass,

I have three NanoString datasets that were generated with the same protocol, so I have different batches. Should I read in each dataset separately, or can I read all the RCC files together with processNanostringData? What effect does this have on the subsequent QC?
Reading them in separately would require merging them afterwards. IMHO, reading everything at once with a single metadata file seems most convenient, but I don't know if this is the right way.

Best,
T

Request: option to keep control genes in 'runLimmaAnalysis'

First, thanks for making this package available!

For some QC applications (e.g. MA plots) and comparisons to other methods, it can be useful to see what happens to the control genes (positive, negative, housekeeping) during differential expression analysis. Currently these genes are removed in runLimmaAnalysis:

dat.limma <- dat[grep("endogenous", fData(dat)$CodeClass, ignore.case = TRUE), ]
rownames(dat.limma) <- fData(dat)$Name[grep("endogenous",
                                            fData(dat)$CodeClass,
                                            ignore.case = TRUE)]

Would it be possible to add an option to runLimmaAnalysis to keep the control genes (and process them like the rest)?
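
One way such an option could look is sketched below on a plain data frame rather than an ExpressionSet, so that it runs in base R. `select_features` and its `keep.controls` argument are hypothetical names, and the feature table is invented. Nothing here is taken from the package.

```r
# Hypothetical helper: return the row indices to carry into the analysis.
select_features <- function(fdata, keep.controls = FALSE) {
  if (keep.controls) {
    return(seq_len(nrow(fdata)))  # keep every row, controls included
  }
  # Otherwise keep only endogenous genes, as in the snippet above.
  grep("endogenous", fdata$CodeClass, ignore.case = TRUE)
}

# Invented feature table with one row per probe.
fdata <- data.frame(
  Name      = c("POS_A", "NEG_A", "GAPDH", "TP53", "MYC"),
  CodeClass = c("Positive", "Negative", "Housekeeping",
                "Endogenous", "Endogenous")
)

fdata$Name[select_features(fdata)]
fdata$Name[select_features(fdata, keep.controls = TRUE)]
```

Defaulting the flag to FALSE keeps the existing behaviour unless the caller opts in.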
