opengene / repaq
A fast lossless FASTQ compressor with ultra-high compression ratio
License: MIT License
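What are the two example files used by repaq? DNA or RNA? Human or another species? NA12878? I would like to try some fun things with this data and would appreciate more information about it.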
% repaq --version
0.2.0
and return 0.
Hello,
Recently, I used repaq to compress a FASTQ file and then tried to check the result. When I ran the command "repaq -p -i xxxxxxx.fq.gz -r test.rfq -j test.txt", it returned "Segmentation fault (core dumped)" and stopped. Could anyone help me fix it? Thanks.
Big endian and little endian
I asked for --help, so it should return 0 (success), not 1.
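The exit status reported above can be checked from a script. The following is a minimal sketch, not part of repaq: the helper name exit_status is hypothetical, and the stand-in command only exists so the example runs without repaq installed.

```python
import subprocess
import sys


def exit_status(argv):
    """Run a command, discard its output, and return its exit status."""
    completed = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode


# Stand-in command (an assumption for illustration): a Python child process
# that exits with status 0, which is what a successful --help conventionally returns.
stand_in = [sys.executable, "-c", "raise SystemExit(0)"]
print(exit_status(stand_in))
```

With repaq on the PATH, calling exit_status(["repaq", "--help"]) would show which status the installed version actually returns.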
Does it need install -d -t PREFIX/bin src/repaq?
I get the following error message immediately:
xz: (stdin): Cannot allocate memory
ERROR: failed to call xz, please confirm that xz is installed in your system
when I run:
sbatch -c 16 --mem 10000 --wrap "repaq -c -i file_1_trimmed.fastq.gz -I file_2_trimmed.fastq.gz -o file.fpq.xz --compression 9 --thread 16"
I installed repaq and xz with the following commands in a fresh conda environment:
conda install -c bioconda repaq
conda install conda-forge::xz
Help would be much appreciated.
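One possible reading of this error, offered as an assumption rather than a diagnosis: if repaq forwards --compression 9 and --thread 16 to xz (not confirmed here), memory may be the limit rather than a missing xz. The xz man page lists roughly 674 MiB of compressor memory for preset -9 in single-threaded mode, and threaded compression needs at least that much per thread. The sketch below only does that arithmetic; the function names are hypothetical.

```python
# Compressor memory for xz preset -9, per the xz man page (single-threaded).
XZ_PRESET9_COMP_MEM_MIB = 674


def xz_memory_lower_bound_mib(threads, per_thread_mib=XZ_PRESET9_COMP_MEM_MIB):
    """Rough lower bound on xz memory use, ignoring per-thread block buffers."""
    return threads * per_thread_mib


def fits(threads, limit_mb):
    """True if the lower bound stays within the job's memory limit."""
    return xz_memory_lower_bound_mib(threads) <= limit_mb


# Values taken from the sbatch command above: 16 threads, --mem 10000.
print(xz_memory_lower_bound_mib(16), fits(16, 10000))
```

If the assumption holds, requesting more memory or lowering --thread or --compression might avoid the allocation failure.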
Maybe change to install: $(TARGET) so that make install does build and install?
Hi! We're considering using repaq due to the significant space savings, but have some concerns about lossless-ness. Some of the other issues (#11, #12, #13) seem to suggest that repaq isn't lossless, but there isn't enough detail on some (#12, #13) to determine the severity.
How production ready is repaq?
Hello, I recently found some problems using repaq-0.3.0.
After decompressing, I found that the md5 check failed. Comparing the decompressed FASTQ file with the original, I found that in some reads an N had become a G after compression. Is this a machine problem or an algorithm problem? I compressed 160 files and 20 of them were problematic. Recompressing these 20 gives the same result.
Here is the result of the --compare parameter:
"result":"failed", "msg":"The RFQ file and FASTQ file have different sequence in the 7815 pair. GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGATGAAGAACCCAAGGAATGGCAGCAGACTGAGAGCTTCTGGA | GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGATGAAGAACCCAAGNAATGGCAGCAGACTGAGAGCTTCTGGA"
brew install brewsci/bio/repaq
On some fq files, the quality string is not preserved in a round-trip conversion. In particular, ';' or ':' gets replaced by 'I'.
One example line (the first line is the original):
-BBDADFHHHHHHII?HHHIIII?EHEHCDHDHGFHHHEFHHHIECD/E?@FH1<@GEGH?GHH?@CC@?E1<D@FGH111G?CHF0CCEG?0<FFE0:F0=
+BBDADFHHHHHHII?HHHIIII?EHEHCDHDHGFHHHEFHHHIECD/E?@FH1<@GEGH?GHH?@CC@?E1<D@FGH111G?CHF0CCEG?0<FFE0IF0=
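To locate such substitutions, two quality strings can be compared position by position. This is a minimal sketch with hypothetical function names; the Phred+33 offset is an assumption about the encoding of these files.

```python
def quality_diffs(original, roundtrip):
    """List (index, original_char, roundtrip_char) where the strings differ."""
    if len(original) != len(roundtrip):
        raise ValueError("quality strings differ in length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(original, roundtrip))
        if a != b
    ]


def phred(ch, offset=33):
    """Convert a quality character to its score, assuming Phred+33."""
    return ord(ch) - offset


# Short made-up strings for illustration, not taken from the report above.
for index, before, after in quality_diffs("HH:F0=", "HHIF0="):
    print(index, before, phred(before), after, phred(after))
```

Applied to the two lines above (without the leading - and + markers, assuming those are diff markers), this would list every position where the round trip changed the quality. Under Phred+33, ':' is 25, ';' is 26 and 'I' is 40, so the substitution raises the reported quality.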
repaq-command
repaq -c -i original.fq -o original.fq.rfq
The new version of bcl2fastq has many problems in practical applications. Besides, fastp is an amazing utility and has come into use in my pipelines.
I think it would be great work to develop an alternative tool for the very beginning of the analysis workflow.
Once it works, maybe a new standard workflow for raw data processing will be born.
Att.
Each chunk should have an individual flag to identify whether quality is encoded.
Hi, thanks for the tool! The compression ratio is IMPRESSIVE, down to 3.1%! But I got the following error during decompression...
"
~/bin/repaq -d -i haha.rfq -o ha_1.fq -O ha_2.fq
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::substr: __pos (which is 404978952) > this->size() (which is 404978876)
Aborted (core dumped)
"