
Comments (3)

peterk87 commented on August 28, 2024

Hi @PierreLyons Thanks for taking the time to report your issue!

Would you happen to be able to attach/copy-paste the contents of parse_influenza_blast_results.log for this analysis?

The parse_influenza_blast_results.py Python script uses Polars, Pandas and NumPy, any of which may call functions that rely on CPU instructions your machine lacks. Alternatively, there could simply be a bug in the script, in which case the full stack trace would help.

Have you tried Conda/Mamba instead of Podman/Docker?

from nf-flu.

PierreLyons commented on August 28, 2024

Hi @peterk87

Thanks for the quick response.

I haven't tried Conda yet; it's next on my list. I'll update once I do.

I'm also struggling to find the parse_influenza_blast_results.log file. Any idea where it is stored? Thanks.


PierreLyons commented on August 28, 2024

Update regarding this issue:

Note: the same issue happens using either Docker or Conda/Mamba.

I've isolated the issue to polars, which appears to have been added to parse_influenza_blast_results.py in revision 3.2.0, the revision at which the pipeline starts breaking on my system. polars uses AVX instructions, which my CPU doesn't support. The polars team publishes a legacy build (polars-lts-cpu) that was compiled without AVX; it can be installed via pip but is not available on conda-forge.
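
For anyone wanting to confirm the same diagnosis on their own machine, here is a minimal sketch that looks for the avx flag. It assumes an x86 Linux host where /proc/cpuinfo has "flags" lines; the function names cpu_flags and supports_avx are made up for this example and are not part of nf-flu or polars.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect CPU feature flags from /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # On x86 Linux, features are listed on lines whose key is "flags".
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports_avx(cpuinfo_text: str) -> bool:
    """True if the exact flag 'avx' is present (not just 'avx2')."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("AVX supported:", supports_avx(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found; this check only applies to Linux")
```

If this reports no AVX support, the polars-lts-cpu build is the one worth trying.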

As a quick patch, I've manually created the conda env used by the subtyping_report.nf module with all dependencies except for polars, and then pip installed polars-lts-cpu within that environment. I then simply point the subtyping_report.nf module to the custom environment.

This is working well (almost, see below) now as a patch. I've also requested newer hardware, which I think is the most sensible solution to this issue.

The SUBTYPING_REPORT module completed successfully, but then the SOFTWARE_VERSIONS module threw an error.

I will describe it quickly here as well as my fix, but let me know if you'd like me to open a new issue. I'm not sure if my above fix caused this new issue.

dumpsoftwareversions.py caused this error (truncated for readability):
ERROR ~ Error executing process > 'NF_FLU:ILLUMINA:SOFTWARE_VERSIONS (1)'

Caused by:
Process NF_FLU:ILLUMINA:SOFTWARE_VERSIONS (1) terminated with an error exit status (1)

Command executed [/home/vitalite/.nextflow/assets/CFIA-NCFAD/nf-flu/./workflows/../modules/nf-core/modules/custom/dumpsoftwareversions/templates/dumpsoftwareversions.py]:

[ ... error truncated for readability ... ]

yaml.scanner.ScannerError: mapping values are not allowed here
in "collated_versions.yml", line 2, column 13

For reference, here are the first two lines of collated_versions.yml (second line truncated for readability):
"NF_FLU:ILLUMINA:CAT_ILLUMINA_FASTQ":
cat: Usage: cat [OPTION]... [FILE]... Concatenate FILE(s) to standard output. W [...]

The issue seems to be the ":" after "Usage".

My fix was to pipe the output of the echo commands through an extra sed that removes the colon, at lines 103 and 104 of the nf-flu/modules/local/cat_illumina_fastq.nf file:
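
For context: in YAML, a colon followed by a space inside an unquoted value is read as another key/value separator, which is why the scanner reports "mapping values are not allowed here". Besides stripping the colon, another option is to quote the value. The sketch below is only an illustration using the standard library; yaml_quoted and version_line are hypothetical helpers, not code from nf-flu or dumpsoftwareversions.py. It relies on the fact that a JSON string literal of ordinary text is also a valid YAML double-quoted scalar.

```python
import json


def yaml_quoted(value: str) -> str:
    """Render text as a double-quoted scalar.

    json.dumps produces a JSON string literal, which YAML also accepts
    as a double-quoted scalar, so no YAML library is needed here.
    """
    return json.dumps(value, ensure_ascii=False)


def version_line(tool: str, version_text: str, indent: int = 4) -> str:
    """Build one 'tool: "version text"' line for a versions file."""
    return f"{' ' * indent}{tool}: {yaml_quoted(version_text)}"


if __name__ == "__main__":
    help_text = "Usage: cat [OPTION]... [FILE]..."
    # Unquoted, the ": " after "Usage" would be parsed as a mapping separator.
    print(version_line("cat", help_text))
```

Quoting keeps the original text intact, whereas removing colons changes what is recorded.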

original:
cat: $(echo $(cat --help 2>&1) | sed 's/ (.*//')
gzip: $(echo $(gzip --help 2>&1) | sed 's/ (.*//')

modified:
cat: $(echo $(cat --help 2>&1) | sed 's/ (.*//' | sed 's/://')
gzip: $(echo $(gzip --help 2>&1) | sed 's/ (.*//' | sed 's/://')

This has solved the issue and now the pipeline runs (note: I can only run the pipeline with Conda).
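
One caveat for anyone reusing this patch: without the g flag, sed 's/://' removes only the first colon on the line, so help text containing a second colon could still trip the YAML parser. The sketch below mirrors the sed steps in Python to show the difference. It assumes the first sed pattern is ' (.*' (everything from the first " (" onward), and the sample string is made up, since real cat --help output varies by version.

```python
import re


def truncate_at_paren(text: str) -> str:
    """Mirror sed 's/ (.*//': drop everything from the first ' (' onward."""
    return re.sub(r" \(.*", "", text, count=1)


def drop_first_colon(text: str) -> str:
    """Mirror sed 's/://': remove only the first colon."""
    return text.replace(":", "", 1)


def drop_all_colons(text: str) -> str:
    """Mirror sed 's/://g': remove every colon."""
    return text.replace(":", "")


if __name__ == "__main__":
    # Hypothetical help text with two colons before the first " (".
    sample = "Usage: cat [OPTION]... Examples: cat f (copy f)"
    truncated = truncate_at_paren(sample)
    print(drop_first_colon(truncated))  # one colon still remains
    print(drop_all_colons(truncated))   # no colons remain
```

If the truncated help text on a given system has only one colon, the patch as written is enough; 's/://g' is the more defensive variant.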

Once again, let me know if I should open a new issue.
Thanks,
Pierre

