ethanbass / chromConverter
Parsers for chromatography data in R (HPLC-DAD/UV, GC-FID, MS)
Home Page: https://ethanbass.github.io/chromConverter/
License: GNU General Public License v3.0
I am trying to convert a .cdf file from a Shimadzu machine to .mzml to use in GNPS, and when I try, it says that exporting to .mzml requires the openchrom parser, but when you use openchrom it says it doesn't recognize .cdf. Is there a way around this? From the documentation it seems like standard read_chroms parsers should recognize both .cdf and .mzml
error when I don't specify a parser:
The selected export format is currently only supported by openchrom
parsers.
error when I specify openchrom:
The ‘cdf’ format can be converted using the following parsers: ‘chromconverter’.
The ‘openchrom’ parser can take the following formats as inputs:
‘msd’, ‘csd’, ‘wsd’.
Here's the code I'm trying to use (without specifying a parser), if it's simply just something wrong with how it's written:
data <- read_chroms(paths = "mypath/filename.CDF", format_in = "cdf", export = TRUE, path_out = "mypath/filename.mzml", export_format = "mzml")
Would appreciate any help!!
Hi Ethan,
This is a reply to the issue that I opened previously (ethanbass/chromatographR#29).
When exporting from the Waters instrument via their software (Empower) there are two options: 1) ascii (the resulting files have the .arw extension) and 2) raw (the resulting files have the .cdf extension).
Hello, extreme noob to R here... Am trying to convert a csv HPLC chromatogram (exported from agilent chemstation) to .cdf format.
I'm getting:
read_chroms("test2.CSV", format_in = "chemstation_csv", find_files = FALSE, export=TRUE, export_format = "cdf")
Export directory not specified! Export files to `temp` directory (y/n)?y
Error in if (is.null(lambda) && ncol(x) + as.numeric(attr(x, "data_format") == :
missing value where TRUE/FALSE needed
Whereas when export_format = "csv", the command succeeds.
It could be some extremely minor problem, but due to my inexperience I have no idea whether this is a real problem or not.
Any suggestions/pointers/advice would be greatly appreciated! thanks :)
OpenChrom includes a command line interface that can be used to call their file parsers: https://github.com/OpenChrom/openchrom/wiki/CLI.
R should be able to write an appropriate batch job file and feed it to OpenChrom with a system call to access their file converters. For example, they have parsers for several FID formats which don't seem to be available otherwise.
Alternatively, if someone who understands Java reads this and wants to figure out how to call the parsers directly from R (e.g. using rJava) I think that would be great as well.
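As a rough illustration of the batch-job idea, the sketch below only assembles the command string that R would hand to `system()`. The flags `-nosplash -cli -batchfile` are the ones that appear in the `system()` warning quoted in a later issue on this page; the helper name `build_openchrom_call` and both paths are hypothetical placeholders, and the batch file itself is assumed to have been written already.

```r
# Hypothetical helper (not part of chromConverter): assemble the shell
# command for running an OpenChrom batch job from R.
build_openchrom_call <- function(openchrom_path, batchfile) {
  # shQuote() protects paths that contain spaces
  paste(shQuote(openchrom_path), "-nosplash", "-cli", "-batchfile",
        shQuote(batchfile))
}

# Placeholder paths; replace with the real executable and batch file.
cmd <- build_openchrom_call("/path/to/openchrom", "/path/to/batchjob.xml")
print(cmd)

# system(cmd)  # only run this on a machine where OpenChrom is installed
```

Quoting both paths is a deliberate choice here, since installation directories often contain spaces.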
Hello! Thank you for this excellent repository. I'm trying to use read_chemstation_ch.R to read in a v179 .ch GC-FID file. The retention times seem right, but the intensities are very different from what I see when I open the file in MassHunter. Could there be an issue with the scaling factor that is applied?
I've attached the .ch (FID1A.txt) file as well as an excel where I compared the MassHunter and read_chemstation_ch intensities. Any help is appreciated!
Dear Ethan,
I tried to use your code for reading the raw data of our Shimadzu HPLC, thanks for that code!
I am not a programmer, and I work mainly in Python rather than R. Here are some results from the last few days that my colleague (who uses R) and I spent working on this. I wonder whether you would like to incorporate the issues we found into your R code.
I needed to change mainly two things:
your line 147 in read_shimadzu_lcd.R, mat <- matrix(NA, nrow = fsize/(n_lambdas*1.5), ncol = n_lambdas)
This is about the size of the data stream, which depends on the number of wavelengths from the PDA and the total time of the HPLC run. A simple factor of 1.5 does not work for my data. Instead, I first scanned the PDA raw data stream for the start bits of each data set header and counted them. Later, I found the entry in a stream that contains the number of data sets, which can simply be read out.
your line 249 in function decode_shimadzu_block: buffer[[2]] <- twos_complement(substr(bin, 5, nchar(bin))),
This line cuts off the first 4 bits of the bit string that ultimately contains the difference from the previous value. It worked this way for my PDA data, but it could not reproduce the fluorometer results at some positions and distorted the signal. It took me some time to understand this, but in the end the function simply failed when the difference is a large number and more bytes are needed to decode it. In the end I simply reduced the cut and now use the bits from position 3 onward. This seemed to work!
My question here is: did you find the number '5' simply by trial and error, or was there a reason?
If there is interest from your side, I can spend some time describing more details, e.g. where to find the fluorescence data and how to read it, or the file size in the .lcd file.
Best
Rüdiger
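To make the effect of the cut position concrete, here is a minimal sketch in R. The helper `twos_complement_sketch` is a hypothetical stand-in, not the package's own `twos_complement()`, and the 16-bit string is a made-up example rather than bytes from a real .lcd file. It only shows that dropping four leading bits instead of two can discard high-order bits of a large difference value.

```r
# Hypothetical helper: decode a string of "0"/"1" characters as a signed
# two's-complement integer. Not the implementation used in chromConverter.
twos_complement_sketch <- function(bin) {
  bits <- as.integer(strsplit(bin, "")[[1]])
  n <- length(bits)
  unsigned <- sum(bits * 2^((n - 1):0))
  # a leading 1 marks a negative number
  if (bits[1] == 1) unsigned - 2^n else unsigned
}

# Made-up 16-bit block in which bit 4 belongs to the value (assumption).
bin <- "0001010000000000"

from_5 <- twos_complement_sketch(substr(bin, 5, nchar(bin)))  # cut 4 bits
from_3 <- twos_complement_sketch(substr(bin, 3, nchar(bin)))  # cut 2 bits

print(c(from_5 = from_5, from_3 = from_3))
```

In this toy case the two cuts decode to 1024 and 5120, so the shorter cut keeps a high-order bit that the longer cut throws away.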
Hi,
I'm struggling to get read_chroms to accept a path_out, and it's unclear what formatting it is looking for. It seems to want to add a "/" to the front of whatever I specify as path_out. The default behaviour is supposed to save it in the current working directory, but it doesn't do that either: it prompts whether to save it in 'temp', but it can't find the 'temp' folder even when I manually create it as a subfolder of the working directory. I would prefer to give the full path from the drive name (e.g. "C:/ ... "), but this doesn't work because it's putting a "/" in front of the drive name. See below, where I've used getwd() to provide the path to the current directory.
read_chroms(paste0(archive_sample_dirs[1],'/FID1A.ch'),find_files=FALSE, path_out=getwd(), export=TRUE, parser="openchrom", format_in='csd',export_format = 'csv')
Error in read_chroms(paste0(archive_sample_dirs[1], "/FID1A.ch"), find_files = FALSE, :
The export directory '/W:/ARL/Analytical/OPERATOR METHOD TEMPLATES/chemstation data/' does not exist.
If I make path_out = "", it ignores this problem at least for long enough to encounter an additional problem, which is that it can't find the path to the OpenChrom command line. It seems to want the pathname with filename, sans extension, e.g. "C:/Users/ ... /Programs/OpenChrom/openchrom" for this. If I type this in, it moves on to another error (I suspect it's back to the first error). It seems to save this path in the path_to_openchrom_commandline.txt file, but it doesn't seem to be able to find OpenChrom unless I manually type it in each time. When I do so, I get:
Error in write_xml.xml_document(x, file = path_xml) : Error closing file
In addition: Warning message:
In write_xml.xml_document(x, file = path_xml) : Permission denie [1501]
Python environment sometimes fails to load (due to issues building C++ extensions?) as discussed in issue #7. I'm not sure what can be done about this directly, but at least providing better documentation about what the actual requirements are would probably be helpful.
I am seeking a chemstation version 181 file to use for unit testing. If anyone has a file they wouldn't mind contributing, I would be grateful.
I don't want to be so dependent on the python and rust-based dependencies for interpreting Chemstation UV files. I think it shouldn't be too difficult to write a parser directly in R, following the documentation kindly provided in the rainbow-api package (https://rainbow-api.readthedocs.io/en/latest/agilent/uv.html).
The new read_chemstation_fid parser can read some of the newer chemstation 181 .ch files supplied by Phenomniverse but the older files seem to be encoded differently and aren't being interpreted correctly. Also see issue #6. (These 181 files are apparently not able to be read by any of the external libraries currently included with chromConverter either).
Apologies in advance, this is my first time posting an issue on GitHub. I am trying to use read_chroms() to convert Agilent .ch WSD files to .csv, however I cannot get read_chroms() to find the openchrom executable. Also, the openchrom CLI is not behaving as expected.
OS: MacOS Ventura 13.1
Environment: R interpreter in zsh.
Executing:
dat <- read_chroms(paths = file_path, format_in = 'wsd', parser = 'chromconverter', format_out = "data.frame", export = FALSE)
Produces the following error dialog:
Export directory not specified! Export files to `temp` directory (y/n)?y
Warning in configure_call_openchrom() : OpenChrom not found!
Please provide path to `OpenChrom` command line):/Applications/Eclipse.app/contents/MacOS/openchrom
The OpenChrom command-line interface is turned off!
Update `openchrom.ini` to activate the command-line interface (y/n)?
(Warning: This will deactivate the GUI on your OpenChrom installation!)
y
sh: /Applications/OpenChrom_CL.app/Contents/MacOS/openchrom: No such file or directory
Error in file(file, "rt") : cannot open the connection
In addition: Warning messages:
1: In system(paste0(openchrom_path, " -nosplash -cli -batchfile ", :
error in running command
2: In file(file, "rt") :
cannot open file '/Users/jonathan/001_chromconverter_test_env/temp/DAD1D.csv': No such file or directory
Then running it again produces a different message:
Export directory not specified! Export files to `temp` directory (y/n)?y
sh: /Applications/OpenChrom_CL.app/Contents/MacOS/openchrom: No such file or directory
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
no lines available in input
In addition: Warning message:
In system(paste0(openchrom_path, " -nosplash -cli -batchfile ", :
error in running command
Manually setting up OpenChrom CLI and running ./openchrom -nosplash -cli --help results in an error message as below, instead of the help dialog:
WARNING: Using incubator modules: jdk.incubator.foreign, jdk.incubator.vector
<<<< EncryptedJarClassLoader created >>>>
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Which I don't think is a good thing.
Investigating the openchrom path, I found that ../chromConverter/shell/path_to_openchrom_commandline.txt contained /Applications/Eclipse.app/contents/MacOS/openchrom, the path I initially entered, so that's behaving as expected. However, I'm not sure why read_chroms() is looking in a different path. Any help would be appreciated.
Hi Ethan-
I am importing chemstation .ch files, and for a significant portion of my files I get NaN values for intensity. Typically I have four .ch files per sample run from different wavelengths, and sometimes it happens for all four and sometimes only some of them. I can open the files without issue in ChemStation, so I know the files contain data. Unfortunately we don't have .uv files for this particular dataset.
I am running chromConverter 0.4.2
GitHub won't let me attach .ch files, but there are some example offending files here if you would like to check them out:
https://drive.google.com/drive/folders/1dhJe-JdV3ilXz_a_NBGWJ6COX4ENocxz?usp=sharing
With these four example files (all different wavelengths from the same run), I can read in Signal C and D but not A and B. Here is some example code:
chroms <- read_chroms(paths = "C:/Chem32/1/DATA/Loren RV/PHENOLICS 2022 2022-02-03 14-31-44", format_in = "chemstation_ch")
head(chroms[[1]])
head(chroms[[3]])
Any tips appreciated!!
Susan
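When narrowing down which signals are affected, it can help to summarise the non-finite values per chromatogram rather than inspecting each one with head(). The sketch below assumes read_chroms() returned a named list of numeric matrices; the list here is made-up stand-in data with hypothetical names, not values from the linked files.

```r
# Made-up stand-in for the list returned by read_chroms(); not real data.
chroms <- list(
  signal_A = matrix(c(NaN, NaN, NaN), ncol = 1),
  signal_B = matrix(c(NaN, 2.1, NaN), ncol = 1),
  signal_C = matrix(c(0.5, 1.2, 0.9), ncol = 1),
  signal_D = matrix(c(0.1, 0.4, 0.3), ncol = 1)
)

# Fraction of non-finite (NaN, NA, Inf) intensity values per chromatogram.
frac_bad <- vapply(chroms,
                   function(x) mean(!is.finite(as.matrix(x))),
                   numeric(1))

# Names of the chromatograms that contain at least one bad value.
affected <- names(frac_bad)[frac_bad > 0]

print(frac_bad)
print(affected)
```

A fraction of 1 suggests the whole file failed to decode, while a partial fraction points to a problem at specific positions in the data.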
Hi, is it possible to add a parser for Shimadzu .lcd files? I have attached a file example for reference.
Best, Silas
Anthocyanin_2_MeOH001.zip
Hi Ethan,
Not sure what has happened because this was working previously, but now I'm not able to load the chromConverter package.
From the error message I gather that there are some issues with the installation of the python package 'python-lzf'. It looks like it needs an install of Microsoft Visual C++, and I also see this warning:
SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
I'll try manually installing the build tools for Microsoft Visual C++ and see if that resolves the issue, but in the meantime, here's the full output of library(chromConverter):
> library(chromConverter)
Configuring package 'chromConverter': please wait ...
C:\users>CALL "C:\Users\regan\AppData\Local\r-miniconda\condabin\activate.bat" "C:\Users\regan\AppData\Local\r-miniconda\envs\r-reticulate"
C:\users>conda.bat activate "C:\Users\regan\AppData\Local\r-miniconda\envs\r-reticulate"
(r-reticulate) C:\users>"C:/Users/regan/AppData/Local/r-miniconda/envs/r-reticulate/python.exe" -m pip install --upgrade --no-user "aston" "numpy" "pandas" "rainbow-api" "scipy"
Collecting aston
Using cached Aston-0.7.1-py3-none-any.whl (74 kB)
Requirement already satisfied: numpy in c:\users\regan\appdata\local\r-miniconda\envs\r-reticulate\lib\site-packages (1.23.5)
Collecting pandas
Using cached pandas-1.5.2-cp38-cp38-win_amd64.whl (11.0 MB)
Collecting rainbow-api
Using cached rainbow_api-1.0.1-py3-none-any.whl (21 kB)
Collecting scipy
Using cached scipy-1.9.3-cp38-cp38-win_amd64.whl (39.8 MB)
Requirement already satisfied: pytz>=2020.1 in c:\users\regan\appdata\local\r-miniconda\envs\r-reticulate\lib\site-packages (from pandas) (2022.6)
Collecting python-dateutil>=2.8.1
Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting python-lzf
Using cached python-lzf-0.2.4.tar.gz (9.3 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting matplotlib
Using cached matplotlib-3.6.2-cp38-cp38-win_amd64.whl (7.2 MB)
Collecting lxml
Using cached lxml-4.9.1-cp38-cp38-win_amd64.whl (3.6 MB)
Collecting six>=1.5
Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting cycler>=0.10
Using cached cycler-0.11.0-py3-none-any.whl (6.4 kB)
Collecting kiwisolver>=1.0.1
Using cached kiwisolver-1.4.4-cp38-cp38-win_amd64.whl (55 kB)
Collecting pillow>=6.2.0
Using cached Pillow-9.3.0-cp38-cp38-win_amd64.whl (2.5 MB)
Collecting fonttools>=4.22.0
Using cached fonttools-4.38.0-py3-none-any.whl (965 kB)
Collecting pyparsing>=2.2.1
Using cached pyparsing-3.0.9-py3-none-any.whl (98 kB)
Collecting contourpy>=1.0.1
Using cached contourpy-1.0.6-cp38-cp38-win_amd64.whl (163 kB)
Collecting packaging>=20.0
Using cached packaging-21.3-py3-none-any.whl (40 kB)
Building wheels for collected packages: python-lzf
Building wheel for python-lzf (setup.py): started
Building wheel for python-lzf (setup.py): finished with status 'error'
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [5 lines of output]
running bdist_wheel
running build
running build_ext
building 'lzf' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for python-lzf
Running setup.py clean for python-lzf
Failed to build python-lzf
Installing collected packages: python-lzf, six, scipy, pyparsing, pillow, lxml, kiwisolver, fonttools, cycler, contourpy, python-dateutil, packaging, aston, pandas, matplotlib, rainbow-api
Running setup.py install for python-lzf: started
Running setup.py install for python-lzf: finished with status 'error'
error: subprocess-exited-with-error
× Running setup.py install for python-lzf did not run successfully.
│ exit code: 1
╰─> [7 lines of output]
running install
C:\Users\regan\AppData\Local\r-miniconda\envs\r-reticulate\lib\site-packages\setuptools\command\install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running build
running build_ext
building 'lzf' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> python-lzf
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
Error: package or namespace load failed for ‘chromConverter’:
.onLoad failed in loadNamespace() for 'chromConverter', details:
call: NULL
error: Error installing package(s): "\"aston\"", "\"numpy\"", "\"pandas\"", "\"rainbow-api\"", "\"scipy\""
In addition: Warning message:
In shell(fi, intern = intern) :
'C:\Users\regan\AppData\Local\Temp\RtmpINwiN5\file2c3c7541260c.bat' execution failed with error code 1
Recent versions of RStudio made some strange changes to the way reticulate functions, as discussed in this thread, which interfere with chromConverter's python bindings. chromConverter will still load but python-based parsers will likely not be available if a project is loaded. When trying to access python parsers, a module not found error will be generated. As far as I can tell, this is a bug with RStudio rather than chromConverter (though RStudio developers seem to think this is the expected behavior).
This issue can apparently be resolved by unchecking a box in the RStudio settings. To do this, open RStudio settings and navigate to the Python pane (Tools:Global Options:Python). Then uncheck the box that says "Automatically activate project-local Python environments" and click Apply. RStudio must then be restarted for the settings to take effect.
Hi,
There is a problem with the read_chroms function: a "/" is added at the beginning and end of path_out, which prevents the function from working properly.
If I try this code (on Windows):
library(chromConverter)
if (dir.exists("DATAs/OUTPUT/neg")){
print("THE DIRECTORY EXIST")
}
dat <- read_chroms(paths = "DATAs/INPUT/neg", format_in = "thermoraw", path_out = "DATAs/OUTPUT/neg")
I get this output:
library(chromConverter)
if (dir.exists("DATAs/OUTPUT/neg")){
print("THE DIRECTORY EXIST")
}
[1] "THE DIRECTORY EXIST"
dat <- read_chroms(paths = "DATAs/INPUT/neg", format_in = "thermoraw", path_out = "DATAs/OUTPUT/neg")
Error in read_chroms(paths = "DATAs/INPUT/neg", format_in = "thermoraw", :
The export directory '/DATAs/OUTPUT/neg/' does not exist.
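The error message suggests that a leading "/" is being prepended unconditionally, which turns both a relative path and a Windows drive path into something that does not exist. A minimal sketch of one possible way to handle this is shown below. Both helpers (`is_absolute_path`, `resolve_export_dir`) are hypothetical and are not the package's actual code; the check also deliberately ignores UNC paths to keep the example short.

```r
# Hypothetical helper: TRUE for Unix-style absolute paths ("/...") and
# Windows drive paths ("C:/...", "W:\\...").
is_absolute_path <- function(p) {
  grepl("^/", p) || grepl("^[A-Za-z]:", p)
}

# Hypothetical helper: leave absolute paths alone and resolve relative
# ones against the working directory, then drop trailing slashes.
resolve_export_dir <- function(path_out, wd = getwd()) {
  full <- if (is_absolute_path(path_out)) path_out else file.path(wd, path_out)
  sub("/+$", "", full)
}

# The working directory below is a made-up example.
print(resolve_export_dir("DATAs/OUTPUT/neg", wd = "/home/user"))
print(resolve_export_dir("W:/ARL/data/"))
```

With a resolved path in hand, a caller could then check `dir.exists()` once and report the full path in the error message.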
Hi dear developers,
We want to use your tools to generate chromatograms for our article.
We have data in mzXML format for which we do not have the constructor files.
It seems that your tool does not support this format. Would it be possible to add it?
Thank you.