calebclass / nanotube
Easy NanoString data analysis
License: GNU General Public License v3.0
I am a bit confused about the positive control steps.
In this code,
dta <- processNanostringData(nsFiles = path_data,
                             sampleTab = path_meta,
                             normalization = "nSolver",
                             idCol = "RCC_Name",
                             output.format = "list")
I get the warning:
Identified positive scale factor outside recommended range
(0.3-3).
Check samples prior to conducting analysis.
And indeed, dta$pc.scalefactors contains small values.
However, when I run positiveQC, I get different scale factors:
posQC <- positiveQC(dta)
I expected them to be identical. Any insights are appreciated. Thanks!
Update:
The R code appears to show that the scale factors in positiveQC are calculated on the log scale, while those in processNanostringData are calculated on the raw scale. Regarding the recommended range (0.3-3), which approach is better?
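To illustrate why the two numbers can differ, here is a minimal sketch on made-up positive-control counts. It assumes the commonly described nSolver recipe (arithmetic mean of the per-sample geometric means, divided by each sample's geometric mean) and compares it with the same idea carried out on log2 values. It does not reproduce the internals of positiveQC or processNanostringData, which are not shown in this thread.

```r
# Toy positive-control counts: rows = POS_A..POS_E, columns = samples (made-up values)
pos <- matrix(c(32000, 8000, 2000, 500, 125,
                16000, 4000, 1000, 250,  62,
                 4000, 1000,  250,  60,  15),
              nrow = 5,
              dimnames = list(paste0("POS_", LETTERS[1:5]), c("S1", "S2", "S3")))

# Raw scale: geometric mean per sample, then
# scale factor = arithmetic mean of all geometric means / this sample's geometric mean
geo_mean <- exp(colMeans(log(pos)))
sf_raw <- mean(geo_mean) / geo_mean

# Log scale: the same comparison expressed as a difference of log2 values
log_geo <- colMeans(log2(pos))
sf_log2 <- mean(log_geo) - log_geo   # a log2 offset, not a ratio
sf_from_log <- 2^sf_log2             # back-transformed to a ratio

round(cbind(sf_raw, sf_log2, sf_from_log), 3)
```

The 0.3-3 range is stated for a ratio on the raw scale; in log2 units it corresponds to roughly -1.74 to 1.58. The back-transformed log-scale factor also uses a geometric rather than arithmetic reference mean, so it will not match the raw-scale factor exactly.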
processNanostringData doesn't currently allow RCC file names to be supplied as input (it accepts file directories, txt files, csv files, etc.). A simple edit to the function allows this extra flexibility:
colnames(dat$dict) <- gsub("\\.| ", "", colnames(dat$dict))
}
else if (file.extension %in% c(".RCC", ".rcc")) {
fileNames <- nsFiles[ grep("\\.rcc$", tolower(nsFiles)) ]
cat("\nReading in .RCC files......", file = logfile,
append = TRUE)
dat <- read_merge_rcc(fileNames, includeQC, logfile)
}
else {
if (file.extension %in% c(".zip", ".ZIP")) {
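The file-name filter in the proposed branch can be tried in isolation. The sketch below uses hypothetical file names and grepl with ignore.case = TRUE, which should select the same entries as the tolower() plus grep() line in the fragment above.

```r
# Hypothetical mixed input: two RCC files plus an unrelated file
nsFiles <- c("run1/sample_01.RCC", "run1/sample_02.rcc", "run1/notes.txt")

# Keep only names ending in .rcc, ignoring case
fileNames <- nsFiles[grepl("\\.rcc$", nsFiles, ignore.case = TRUE)]
fileNames
```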
I've been working with some more complex experimental designs, using the "block" and "correlation" arguments to limma::lmFit. These are easily accommodated by passing optional arguments through with ...:
runLimmaAnalysis <- function (dat, groups = NULL, base.group = NULL, design = NULL, ...)
#...
fit <- limma::lmFit(dat.limma, design=design, ...)
#...
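For context, this is roughly what the pass-through would enable, written directly against limma on simulated data. The expression matrix, the patient blocking factor, and the group labels are all made up for illustration. With the change above, the same block and correlation arguments would presumably be supplied to runLimmaAnalysis instead of lmFit.

```r
# Simulated log2 expression: 20 genes x 8 samples (made-up data)
set.seed(1)
expr <- matrix(rnorm(20 * 8, mean = 8), nrow = 20,
               dimnames = list(paste0("gene", 1:20), paste0("s", 1:8)))

group   <- factor(rep(c("ctrl", "trt"), each = 4))
patient <- factor(rep(1:4, times = 2))   # each patient measured under both conditions
design  <- model.matrix(~ group)

# duplicateCorrelation relies on the statmod package, so check for both
if (requireNamespace("limma", quietly = TRUE) &&
    requireNamespace("statmod", quietly = TRUE)) {
  # Estimate the within-patient correlation, then pass both arguments to lmFit
  dupcor <- limma::duplicateCorrelation(expr, design, block = patient)
  fit <- limma::lmFit(expr, design, block = patient,
                      correlation = dupcor$consensus.correlation)
  fit <- limma::eBayes(fit)
  print(limma::topTable(fit, coef = "grouptrt", number = 3))
}
```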
Hi Caleb,
I'm trying to run a limma analysis on two conditions within a CSV, each with 3 repeats, but I am unsure how to do this. I'm not sure how the data is laid out in the examples and was wondering if I could have some help.
This is my current code:
dat <- processNanostringData(nsFiles = "/Users/georgesmith/Documents/INS Year 4/OND2.csv",
                             replicateCol = c('ON_D2_1', 'ON_D2_2', 'ON_D2_3'),
                             groupCol = "CodeClass",
                             idCol = "Name",
                             normalization = "nSolver",
                             housekeeping = c("PGK1", "OAZ1", "TBP", "POLR2A",
                                              "ABCF1", "SDHA", "NRDE2", "PPIA",
                                              "UBB", "STK11IP", "G6PD", "TBC1D10B"),
                             bgType = "t.test", bgPVal = 0.01,
                             output.format = "ExpressionSet"
)
My data is laid out like this in the CSV:
Codeclass | Accession | Name | Condition 1 Repeat 1 | Condition 1 Repeat 2 | Condition 1 Repeat 3 | Condition 2 Repeat 1 | Condition 2 Repeat 2 | Condition 2 Repeat 3
Any help would be massively appreciated, thank you!
Cheers
George
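For a two-condition layout like the one described, the group labels and a design matrix can be derived from the sample column names with base R. This is a sketch that assumes the column headers are exactly as written in the question. How these labels should then be handed to processNanostringData or runLimmaAnalysis (for example through its groups or design argument) is not confirmed in this thread.

```r
# Column names taken from the CSV layout described above: three repeats per condition
sample_cols <- c("Condition 1 Repeat 1", "Condition 1 Repeat 2", "Condition 1 Repeat 3",
                 "Condition 2 Repeat 1", "Condition 2 Repeat 2", "Condition 2 Repeat 3")

# Derive the condition label by dropping the " Repeat N" suffix
groups <- factor(sub(" Repeat [0-9]+$", "", sample_cols))
table(groups)

# Two-group design: the intercept is Condition 1,
# the second column is the Condition 2 vs Condition 1 contrast
design <- model.matrix(~ groups)
design
```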
Hey there! I am trying to run a nanotube analysis on some NanoString data in CSV format and am running into an issue. I'm pretty new to this, so apologies if I'm being naive, but when running this code:
dat <- processNanostringData(nsFiles = "/Users/georgesmith/Documents/INS Year 4/OND2.csv",
                             replicateCol = c('ON_D2_1', 'ON_D2_2', 'ON_D2_3'),
                             normalization = "nSolver",
                             housekeeping = c("PGK1", "OAZ1", "TBP", "POLR2A",
                                              "ABCF1", "SDHA", "NRDE2", "PPIA",
                                              "UBB", "STK11IP", "G6PD", "TBC1D10B"),
                             bgType = "t.test", bgPVal = 0.01,
                             output.format = "ExpressionSet"
)
I didn't use sample_data because when I tried to load my CSV file into it, I got an empty value. When I run this, I get the error:
Loading count data......
Averaging technical replicates.....
Calculating positive scale factors......Error in if (sum(dat$dict$CodeClass == "Positive") == 0) { :
missing value where TRUE/FALSE needed
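This error message is what R raises when the condition inside if() evaluates to NA. In the expression shown in the error, that happens when the CodeClass vector contains missing values. The sketch below reproduces the mechanism on a made-up vector. Why CodeClass ends up with NA entries in this particular CSV is not established here.

```r
# A code-class vector with one missing entry (made-up values)
code_class <- c("Positive", "Negative", NA, "Endogenous")

# Comparing NA yields NA, and sum() propagates it
n_pos <- sum(code_class == "Positive")
is.na(n_pos)

# if (NA == 0) is the step that fails; capture the message instead of stopping
res <- tryCatch(if (n_pos == 0) "no positives" else "ok",
                error = function(e) conditionMessage(e))
res

# Checks worth running on the real table: which rows lack a code class,
# and how many positives there are once NAs are ignored
which(is.na(code_class))
sum(code_class == "Positive", na.rm = TRUE)
```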
Hi,
Is there a way to merge different dat objects that were read in with processNanostringData?
Cheers
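One generic way to combine count tables from separate runs is to align them on shared gene names and bind the sample columns. The sketch below works on two made-up matrices and makes no assumption about the internal structure of the objects returned by processNanostringData. Because positive-control scale factors depend on the full set of samples, merging raw counts before normalization is probably the safer order.

```r
# Two made-up count matrices (genes x samples) standing in for separately loaded runs
counts_a <- matrix(c(10, 20, 30, 40), nrow = 2,
                   dimnames = list(c("PGK1", "TBP"), c("a1", "a2")))
counts_b <- matrix(c(5, 15, 25, 35, 45, 55), nrow = 3,
                   dimnames = list(c("TBP", "PGK1", "SDHA"), c("b1", "b2")))

# Keep genes present in both, align row order, then bind samples side by side
shared <- intersect(rownames(counts_a), rownames(counts_b))
merged <- cbind(counts_a[shared, , drop = FALSE],
                counts_b[shared, , drop = FALSE])
merged
```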
Hi @calebclass,
Thanks for this nice response! This issue can be closed.
Just one suggestion for enhancement: Would it be possible to extend the input data format to include xls files, in addition to the current support for .txt and .csv? This could potentially streamline certain processes. Thank you!
Best,
T
Originally posted by @TdzBAS in #8 (comment)
Hi @calebclass,
I have three NanoString datasets that were generated with the same protocol, so I have different batches. Should I read in each dataset separately, or can I pass all the RCC files together to processNanostringData? What effect does this have on the subsequent QC?
Reading them in separately would require merging them afterwards. IMHO, reading everything at once with a single metadata file seems most convenient, but I don't know if this is the right way.
Best,
T
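If all batches are read in together, one common way to account for them in a linear-model analysis is to include batch as a covariate in the design matrix. The sketch below uses a made-up sample sheet. Whether this suits these particular datasets, and how nanotube's QC behaves across batches, is not addressed in this thread.

```r
# Made-up sample sheet: three batches, two groups in each
samples <- data.frame(
  batch = factor(rep(c("b1", "b2", "b3"), each = 4)),
  group = factor(rep(c("ctrl", "trt"), times = 6))
)

# Batch enters as an additive covariate; the last column is the treatment effect
design <- model.matrix(~ batch + group, data = samples)
colnames(design)
```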
First, thanks for making this package available!
For some QC applications (e.g. MA plots) and comparisons to other methods, it can be useful to see what happens to the control genes (positive, negative, housekeeping) during differential expression analysis. Currently these genes are removed in runLimmaAnalysis:
Lines 51 to 54 in 026563d
Would it be possible to add an option to runLimmaAnalysis to keep the control genes (and process them like the rest)?
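As a rough idea of what such an option might look like, here is a sketch with a hypothetical helper. Neither filter_controls nor the keep.controls argument is part of nanotube, and the code-class labels and values are made up.

```r
# Hypothetical helper: drop control probes unless the caller asks to keep them
filter_controls <- function(expr, code_class, keep.controls = FALSE) {
  stopifnot(nrow(expr) == length(code_class))
  if (keep.controls) return(expr)
  expr[code_class == "Endogenous", , drop = FALSE]
}

# Toy expression matrix: 4 probes x 2 samples
expr <- matrix(1:8, nrow = 4,
               dimnames = list(c("POS_A", "NEG_A", "PGK1", "GENE1"), c("s1", "s2")))
code_class <- c("Positive", "Negative", "Housekeeping", "Endogenous")

rownames(filter_controls(expr, code_class))                        # controls removed
rownames(filter_controls(expr, code_class, keep.controls = TRUE))  # all probes kept
```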