seq_io's People

Contributors

aseyboldt, markschl

seq_io's Issues

Future incompatibility with buf_redux

Hi there,

When building seq_io (0.3.1) with Rust 1.71, I get the following warning:

warning: the following packages contain code that will be rejected by a future version of Rust: buf_redux v0.8.4

For context, it appears that later versions of Rust will include a breaking change to how semicolons are handled in macros.

This issue also appeared in needletail, and a solution was proposed in onecodex/needletail#64. Basically, there is an updated version of buf_redux, published as buffer-redux: https://crates.io/crates/buffer-redux. It should be a drop-in replacement and fix this issue.

There isn't necessarily a rush for this, more of an FYI for now. But if it is possible to get an updated seq_io so this doesn't appear during builds, that would be very nice.

Thanks!

An error in record.rs in core

How should I solve this? I would be very grateful for any help.
Here are the details:
Compiling seq_io v0.4.0-alpha.0 (https://github.com/markschl/seq_io#3d461a36)
error: unexpected token
--> .cargo/git/checkouts/seq_io-5c12caa4159d1858/3d461a3/src/core/record.rs:21:19
|
21 | concat!("use seq_io::", $module_path, "::{Reader, RecordSet};"),
| ^
|
::: /home/xl/.cargo/git/checkouts/seq_io-5c12caa4159d1858/3d461a3/src/fastx/record.rs:454:1
|
454 | impl_recordset!(RefRecord, QualRecordPosition, LineStore, "fastx", "fastq");
| --------------------------------------------------------------------------- in this macro invocation
|
= note: this error originates in the macro impl_recordset (in Nightly builds, run with -Z macro-backtrace for more info)

error: unexpected token
--> /home/xl/.cargo/git/checkouts/seq_io-5c12caa4159d1858/3d461a3/src/core/record.rs:21:19
|
21 | concat!("use seq_io::", $module_path, "::{Reader, RecordSet};"),
| ^
|
::: /home/xl/.cargo/git/checkouts/seq_io-5c12caa4159d1858/3d461a3/src/fasta/record.rs:269:1
|
269 | impl_recordset!(RefRecord, SeqRecordPosition, LineStore, "fasta", "fasta");
| -------------------------------------------------------------------------- in this macro invocation
|
= note: this error originates in the macro impl_recordset (in Nightly builds, run with -Z macro-backtrace for more info)

error: unexpected token
--> /home/xl/.cargo/git/checkouts/seq_io-5c12caa4159d1858/3d461a3/src/core/record.rs:21:19
|
21 | concat!("use seq_io::", $module_path, "::{Reader, RecordSet};"),
| ^
|
::: /home/xl/.cargo/git/checkouts/seq_io-5c12caa4159d1858/3d461a3/src/fastq/record.rs:361:1
|
361 | impl_recordset!(RefRecord, QualRecordPosition, RangeStore, "fastq", "fastq");
| ---------------------------------------------------------------------------- in this macro invocation
|
= note: this error originates in the macro impl_recordset (in Nightly builds, run with -Z macro-backtrace for more info)

Release of 0.4.0

Hello! Is there a timeline for cutting the 0.4.0 release? Specifically, I'm finding the read_record_set_exact functionality extremely useful.

Depends on old crossbeam

While working on making Rust detect more misuses of uninitialized memory (rust-lang/rust#71274), this crate was flagged as a possible regression via its dependency on crossbeam. That dependency is outdated; it would be great if you could update to the latest crossbeam 0.7 which fixed many critical soundness issues. Thanks. :)

Reading gz files?

Hi Mark, and thanks for the great package.
I chose seq_io for its speed and was trying to read a gzipped FASTQ when I found the pull request and the notes. Originally, I had:

    let fq = File::open(fastq_path).expect("Could not open Fastq");
    let fq = flate2::read::GzDecoder::new(fq).into_inner();
    seq_io::fastq::Reader::new(fq)

to read a fastq.gz. This compiled, and I thought, great! But when I tested the application, it kept giving me:

    thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidStart { found: 203, pos: ErrorPosition { line: 1, id: None } }', src/main.rs:60:2

When I unzip the same test file, the same code runs fine. The only addition is a simple check for the gz extension: if matches!(ext.unwrap(), "gz") {}. I'm not sure what's going on; I even made sure I added lto=true. It seems to be reading the file, but incorrectly, so I tried using buf_redux as in the pull request. My head hurts, so I'll come back to it later, but this is a really exciting prospect! This is the fastest reader, so reading from gz would be phenomenal. Any ideas on what the problem could be? Cheers
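
One possible cause, offered as a guess from the snippet alone: in flate2, GzDecoder::into_inner() consumes the decoder and hands back the wrapped reader, so the parser may be receiving the still-compressed bytes. The std-only sketch below illustrates the effect with a hypothetical XorDecoder standing in for a real decompressor (it is not part of flate2 or seq_io): reading through the wrapper yields decoded bytes, while reading from into_inner() yields the raw ones.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical stand-in for a decompressing reader such as a gzip decoder.
/// It "decodes" by XOR-ing every byte with a fixed key.
struct XorDecoder<R> {
    inner: R,
    key: u8,
}

impl<R> XorDecoder<R> {
    fn new(inner: R, key: u8) -> Self {
        XorDecoder { inner, key }
    }

    /// Gives back the wrapped reader; nothing read from it is decoded.
    fn into_inner(self) -> R {
        self.inner
    }
}

impl<R: Read> Read for XorDecoder<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        for b in &mut buf[..n] {
            *b ^= self.key;
        }
        Ok(n)
    }
}

/// Reads everything through the decoder (the intended usage).
fn read_decoded(encoded: &[u8], key: u8) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    XorDecoder::new(Cursor::new(encoded), key).read_to_end(&mut out)?;
    Ok(out)
}

/// Reads from the reader returned by `into_inner()`, bypassing the decoder.
fn read_via_into_inner(encoded: &[u8], key: u8) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    XorDecoder::new(Cursor::new(encoded), key)
        .into_inner()
        .read_to_end(&mut out)?;
    Ok(out)
}
```

If this is indeed the cause, passing the GzDecoder itself to seq_io::fastq::Reader::new, rather than the result of into_inner(), would be the thing to try first.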

curious error when reading fastq.gz files (maybe not an issue)

Hi,

Many thanks for developing and sharing seq_io!

I recently ran into a curious case when trying to parse gzipped FASTQ files sent by one of my collaborators. When using seq_io directly on these files, I get the following error message for all of them:

Err value: UnexpectedEnd { pos: ErrorPosition { line: 713, id: None } }

However, if I gunzip these fastq files and re-gzip them, they get parsed as expected.

Would you know what could cause this error? Could it be a Windows/Unix incompatibility? (I'm using a Unix system, but I don't know how these files were generated.)

My apologies if this is a trivial question/issue.

parallel reading 10G fasta/fastq file

Dear seq_io team,

I have many (10,000) fasta/fastq files of 10G each, and I want to read each file in parallel (the order of the records within a file does not matter) to speed up reading all of them. What approach would you recommend?

Thanks,

Jianshu
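
One way to approach this, sketched with the standard library only: since the files are independent, a fixed pool of threads can each pull the next unprocessed file from a shared counter. The names process_in_parallel and count_fasta_records are hypothetical; the record counter is a simple stand-in for whatever per-file parsing would be done with seq_io, and the inputs are in-memory strings rather than paths so that the sketch stays self-contained. std::thread::scope requires Rust 1.63 or later.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Stand-in for per-file work: counts lines that start with '>'.
fn count_fasta_records(data: &[u8]) -> usize {
    data.split(|&b| b == b'\n')
        .filter(|line| line.first() == Some(&b'>'))
        .count()
}

/// Applies `f` to every item using `n_threads` worker threads.
/// Results are returned in the same order as `items`.
fn process_in_parallel<T, R, F>(items: &[T], n_threads: usize, f: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    // Index of the next item that no worker has claimed yet.
    let next = AtomicUsize::new(0);
    let results: Mutex<Vec<(usize, R)>> = Mutex::new(Vec::with_capacity(items.len()));

    thread::scope(|scope| {
        for _ in 0..n_threads.max(1) {
            scope.spawn(|| loop {
                // Claim one item; stop when all items are taken.
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= items.len() {
                    break;
                }
                let r = f(&items[i]);
                results.lock().unwrap().push((i, r));
            });
        }
    });

    // Workers finish in arbitrary order, so restore the input order.
    let mut results = results.into_inner().unwrap();
    results.sort_by_key(|(i, _)| *i);
    results.into_iter().map(|(_, r)| r).collect()
}
```

With real data, the items would presumably be file paths and the closure would open one reader per file. Whether this helps depends on whether the workload is bound by disk throughput or by parsing; for parallelism within a single file, seq_io's parallel module may also be worth a look.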

Whole Genome Alignments

Hello,

I'm using seq_io to do some sliding window analysis of a 4Mb genome, but the size of the sequence I can read remains limited. How should I configure my Rust program so that it can handle a sequence about 4Mb long?
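
For reference, the kind of sliding-window pass described here can be sketched as follows, assuming the whole sequence is already in memory as one contiguous byte slice with line breaks removed. The function name gc_fraction_windows and the choice of GC fraction as the per-window statistic are illustrative assumptions, not taken from the report.

```rust
/// Returns the GC fraction of each window of length `window`,
/// moving the window start by `step` bases each time.
/// Returns an empty vector if the sequence is shorter than one window.
fn gc_fraction_windows(seq: &[u8], window: usize, step: usize) -> Vec<f64> {
    if window == 0 || step == 0 || seq.len() < window {
        return Vec::new();
    }
    (0..=seq.len() - window)
        .step_by(step)
        .map(|start| {
            let gc = seq[start..start + window]
                .iter()
                .filter(|&&b| matches!(b, b'G' | b'C' | b'g' | b'c'))
                .count();
            gc as f64 / window as f64
        })
        .collect()
}
```

A 4Mb sequence is only a few megabytes as bytes, so holding it in memory for a pass like this should not be a problem in itself.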

write_* takes ownership of Writer

Maybe I'm confused about the API but why do seq_io::fast*::write_* take ownership of the writer? Doesn't this mean you can only ever write one record? This seems problematic.

The simplest reproduction example is:

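let mut reader = Reader::from_path("in.fastq")?;
// File::create (not File::open) so that the output file is opened for writing
let mut writer = BufWriter::new(File::create(Path::new("out.fastq"))?);
for r in reader.records() {
    let r = r?;
    // writer is moved into write_to on the first iteration
    seq_io::fastq::write_to(writer, &r.head, &r.seq, &r.qual);
}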

which doesn't compile since writer was moved.
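
A possible workaround, assuming write_to is generic over W: std::io::Write and takes W by value (as the move error suggests): the standard library implements Write for &mut W, so passing &mut writer lends the writer out for one call instead of giving it away. The sketch below uses a hypothetical write_fastq with the same shape rather than the seq_io function itself.

```rust
use std::io::{self, Write};

/// Hypothetical writer function with the same shape as the `write_*`
/// helpers discussed above: it takes the writer by value.
fn write_fastq<W: Write>(mut writer: W, head: &[u8], seq: &[u8], qual: &[u8]) -> io::Result<()> {
    writer.write_all(b"@")?;
    writer.write_all(head)?;
    writer.write_all(b"\n")?;
    writer.write_all(seq)?;
    writer.write_all(b"\n+\n")?;
    writer.write_all(qual)?;
    writer.write_all(b"\n")
}

/// Writes several records to one writer by lending it out as `&mut W`.
fn write_all_records<W: Write>(
    mut writer: W,
    records: &[(&[u8], &[u8], &[u8])],
) -> io::Result<W> {
    for &(head, seq, qual) in records {
        // `&mut writer` is itself a `Write`, so only the borrow is moved.
        write_fastq(&mut writer, head, seq, qual)?;
    }
    Ok(writer)
}
```

Applied to the reproduction above, this would mean calling seq_io::fastq::write_to(&mut writer, ...) inside the loop.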

Issue when get the size of sequence length in reference genome file

Hi developer,
When I used seq_io::fasta::Reader to load a reference genome (such as GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna), the reported length of each chromosome sequence was not correct (it was larger than the true sequence length). This is because, in the reference FASTA file, the sequence of each chromosome is split across multiple lines, and I think the sequence returned by seq_io::fasta::Reader includes all the LFs, so they are counted when calculating the sequence length.

Best,
Neng
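
If the raw sequence slice does contain the line terminators, as described above, the true length can be computed by skipping them. The helper below is a minimal std-only sketch; the name seq_len_without_newlines is hypothetical. The seq_io documentation also describes line-aware accessors on FASTA records (seq_lines() and full_seq()), which may be the more direct route depending on the version in use.

```rust
/// Counts sequence bytes, skipping LF and CR line terminators.
fn seq_len_without_newlines(raw_seq: &[u8]) -> usize {
    raw_seq
        .iter()
        .filter(|&&b| b != b'\n' && b != b'\r')
        .count()
}
```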

Open to a PR for impl Display (or Debug) etc.?

Hi - I'm wondering if you'd be open to a PR that added a few functions and impls that I'd find really useful when writing tests using seq_io.

In short I'd like to add a few functions to the Record traits (and/or impls) to return String or str versions of the fields, and then implement either Debug or Display to show the String versions.

It's not a ton of work, but I want to make sure you'd be open to a PR before creating one. I can see how it might confuse anyone using the library for the first time, but it would make writing tests so much more pleasant. Right now my tests are littered with calls to String::from_utf8(...) and similar, and I currently have custom assert_eq() functions for the types so that when they don't match, the String forms are printed instead of Vecs of u8s.

Happy to do the work if you'd review and ultimately accept a PR.
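
To make the proposal concrete, here is a minimal sketch of what such a helper could look like outside the library. DisplayRecord is a hypothetical wrapper, not an existing seq_io type; it borrows the raw byte fields and formats them as text.

```rust
use std::fmt;

/// Hypothetical test helper: borrows the raw byte fields of a FASTA record
/// and prints them as text.
struct DisplayRecord<'a> {
    head: &'a [u8],
    seq: &'a [u8],
}

impl fmt::Display for DisplayRecord<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // from_utf8_lossy replaces invalid UTF-8 with U+FFFD instead of failing.
        write!(
            f,
            ">{}\n{}",
            String::from_utf8_lossy(self.head),
            String::from_utf8_lossy(self.seq)
        )
    }
}
```

In a test, comparing rec.to_string() values would then print readable text on failure instead of byte vectors.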
