babinyurii / recan

genetic distance plotting for recombination events analysis

License: MIT License

Python 82.99% TeX 17.01%

virology bioinformatics python genetic-distances dna-recombination alignment recombination-events distance-plots

recan's Issues

JoSS Review: Installation

In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.

As the community migrates/upgrades to using Jupyter Lab, you may want to consider including instructions on how to install the Jupyter Lab Plotly renderer: https://plot.ly/python/getting-started/#jupyterlab-support-python-35 to meet your package's visualization needs.

save_data: TypeError: 'NoneType' object is not subscriptable

Describe the bug
When using the save_data method I'm getting an error saying TypeError: 'NoneType' object is not subscriptable

Full error:

Traceback (most recent call last):
  File "test.py", line 7, in <module>
    sim_obj.save_data(out="csv", out_name="test_data")
  File "/home/jeremy/.local/lib/python3.8/site-packages/recan/simgen.py", line 293, in save_data
    df = pd.DataFrame(data=self._distance, index=self._ticks[1:]).T
TypeError: 'NoneType' object is not subscriptable

To Reproduce

from recan.simgen import Simgen
sim_obj = Simgen("test.fasta")
sim_obj.save_data(out="csv", out_name="test_data")

Expected behavior
CSV file containing plot data is written

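
The traceback suggests that `self._ticks` was still `None` when `save_data` ran. That would be the case if the distance data is only computed by `simgen()` (the method that appears in the k2p traceback elsewhere in this tracker) and the reproduction script never calls it. This is an assumption, since the source of `simgen.py` is not shown here. The sketch below uses a hypothetical stand-in class, `SimgenSketch`, not recan's actual code, to show how such a state could be reported with a clearer message.

```python
class SimgenSketch:
    """Hypothetical stand-in for recan's Simgen; not the library's real code."""

    def __init__(self):
        # Assumption: the real object also starts with no computed data.
        self._distance = None
        self._ticks = None

    def simgen(self, shift=250):
        # Placeholder computation: window ticks plus made-up distance values.
        self._ticks = list(range(0, 4 * shift + 1, shift))
        self._distance = {"seq_b": [0.9, 0.8, 0.95, 0.7]}

    def save_data(self):
        # Fail with a readable message instead of
        # "'NoneType' object is not subscriptable".
        if self._distance is None or self._ticks is None:
            raise RuntimeError("no distance data yet: call simgen() before save_data()")
        # Pair each sequence's values with the ticks (first tick dropped,
        # mirroring the self._ticks[1:] expression in the traceback).
        return {name: dict(zip(self._ticks[1:], values))
                for name, values in self._distance.items()}
```

If this reading is right, calling `sim_obj.simgen(...)` before `sim_obj.save_data(...)` in the reproduction script would be the first thing to try.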

recan R package

Hello,
Your tool seems very interesting, and I would like to evaluate it in R. I saw your protocol:
https://www.protocols.io/view/recan-r-based-tool-for-detection-of-recombination-dm6gpwd1plzp/v2
but I can't find out how to install it.
(I am using a 2022 MacBook Pro.)

thank you

ValueError: Sequences must all be the same length

Describe the bug
When we added two distinct virus sequences, we got an error saying that the sequences must all be the same length.

Here is our code

from recan.simgen import Simgen
sim_obj = Simgen("./virus_genome/all_virus_genome.fasta")

Here is the error we have received.

~/anaconda3/lib/python3.7/site-packages/Bio/Align/__init__.py in _append(self, record, expected_length)
589 # raise ValueError("New sequence is not of length %i"
590 # % self.get_alignment_length())
--> 591 raise ValueError("Sequences must all be the same length")
592
593 # Using not self.alphabet.contains(record.seq.alphabet) needs fixing

ValueError: Sequences must all be the same length
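
Biopython raises this error when the records it is asked to put into one alignment differ in length, which usually means the FASTA file holds unaligned sequences rather than a multiple sequence alignment. A minimal sketch for checking this before calling `Simgen` follows. `fasta_lengths` and `is_aligned` are hypothetical helpers written with the standard library only; they are not part of recan or Biopython.

```python
def fasta_lengths(fasta_text):
    """Return {record_id: sequence_length} for FASTA-formatted text."""
    lengths = {}
    current = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record id is the first word of the header line.
            fields = line[1:].split()
            current = fields[0] if fields else ""
            lengths[current] = 0
        elif current is not None:
            # Sequence lines may be wrapped, so lengths accumulate.
            lengths[current] += len(line)
    return lengths


def is_aligned(fasta_text):
    """True if every record has the same length (gap characters included)."""
    return len(set(fasta_lengths(fasta_text).values())) <= 1
```

If `is_aligned` returns `False` for the contents of the input file, the sequences most likely need to be aligned with an external aligner before they are passed to recan.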

JoSS Review: Performance

In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.

In your paper, you have stated that one of the advantages of recan over other methods of recombination detection (e.g. RAT and RDP4) is speed. However, I do not think the speed of recan should be compared to that of the aforementioned programs as the recombination auto-search function (which consumes a lot of run-time and is the main focus of RAT and RDP4) has not been implemented in recan.

Distance Calculation

As it stands, simgen only allows pdist and k2p as its distance formula, and the program doesn't allow the user to define their own, such as a Jukes-Cantor or Tajima-Nei distance.

Would it be possible to allow simgen to take a function as an argument for its dist= kwarg?
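
To make the request concrete, here is a minimal sketch of how a `dist=` argument could accept either a name or a function. Everything in it is illustrative: `resolve_distance`, the `BUILTIN` registry, and the standalone `pdist` and `jc69` functions are hypothetical and say nothing about how recan is actually structured. The Jukes-Cantor correction used is the standard d = -3/4 ln(1 - 4p/3).

```python
import math


def pdist(seq1, seq2):
    """Proportion of differing sites between two equal-length sequences."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = sum(a != b for a, b in zip(seq1, seq2))
    return diffs / len(seq1)


def jc69(seq1, seq2):
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4/3 * p)."""
    p = pdist(seq1, seq2)
    if p >= 0.75:
        return float("nan")  # the correction is undefined at saturation
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical registry of distances that can be selected by name.
BUILTIN = {"pdist": pdist}


def resolve_distance(dist):
    """Accept either a registered name or a user-supplied callable."""
    if callable(dist):
        return dist
    try:
        return BUILTIN[dist]
    except KeyError:
        raise ValueError(f"unknown distance: {dist!r}") from None
```

With a dispatch like this, `dist="pdist"` would keep working as before, while `dist=jc69` (or any function taking two sequences and returning a float) would be used directly.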

JoSS Review: Community Guidelines

In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.

In the spirit of Open Source, programs submitted to JoSS should iterate on the concept of developing a Community of Practice. Therefore, most submissions should contain a set of Community Guidelines. These guidelines should detail how to contribute, how to report issues, and how to seek support.

These can easily be detailed in a CONTRIBUTING.md file at the top level of the package. Also consider setting up Issue/PR templates for your repository.

k2p: ZeroDivisionError: float division by zero

Hi developer,

I'm trying to run recan to detect a potential recombination event. However, after I set dist='k2p', the program crashed, whereas the default pdist runs normally. Could you please help me figure it out?

Here is the running output:

Traceback (most recent call last):
  File "test_recan.py", line 6, in <module>
    sim_obj.simgen(window=200, shift=20, pot_rec=1,  dist='k2p')
  File "/home/zjl/anaconda3/lib/python3.7/site-packages/recan/simgen.py", line 241, in simgen
    self._distance = self._move_window(window, pot_rec, shift, dist)
  File "/home/zjl/anaconda3/lib/python3.7/site-packages/recan/simgen.py", line 117, in _move_window
    distance = self._K2Pdistance(seq1, seq2)
  File "/home/zjl/anaconda3/lib/python3.7/site-packages/recan/simgen.py", line 153, in _K2Pdistance
    p = float(ts_count) / length
ZeroDivisionError: float division by zero

My MSA sequences contain some N bases and degenerate bases. Is this the reason? How should I prepare the input alignment? Thanks.
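
The traceback shows that `length` was zero inside `_K2Pdistance`. One plausible explanation, offered as an assumption because the source of `simgen.py` is not shown here, is that sites with N or degenerate bases are excluded from the comparison, so a window made up entirely of such sites leaves nothing to divide by. The sketch below is a standalone, hypothetical `k2p_distance` (not recan's implementation) that counts only unambiguous sites and returns NaN for a window with no comparable positions. It uses the standard Kimura two-parameter formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance over sites where both bases are A, C, G or T."""
    transitions = transversions = valid = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps, N and degenerate bases
        valid += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if valid == 0:
        return float("nan")  # no comparable sites in this window
    p = transitions / valid
    q = transversions / valid
    term1, term2 = 1 - 2 * p - q, 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return float("nan")  # too divergent for the correction to be defined
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

Returning NaN rather than raising lets a sliding-window loop continue past uninformative windows; whether a gap in the plot is the desired behaviour is a design choice for the maintainer.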

JoSS Review: Example Usage

In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.

While I appreciate the wealth of examples that were added to the documentation, none of these examples explain what is being visualized thoroughly. A lay-person, or early scientist in the field, may look at these plots and not understand where the events are occurring and what they look like. Consider adding brief annotations to the screen shots explaining what/where these events are.

JoSS Review: Statement of need

In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.

My only comment here is that a clear statement of the problem that recan solves is missing. I do see that it is meant for researchers who use Python and that it has some advantages over RDP4 and RAT. However, a concise statement of need seems absent.

JoSS Review: Quality of writing

In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.

This is potentially the most difficult issue to bring up. While the implementation of recan is tremendous work, the readability of this submission could use significant revisions.

I planned on making a series of revision notes to pass on to the author but, after looking at the list, I will refrain as I know what it feels like to have a giant list of revisions to come from a reviewer.

Generally speaking, this manuscript would benefit from improvements in grammar and sentence structure.

JoSS Review: Automated tests

In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.

Regarding automated testing, I appreciate that you already have some in place. However, there are some points to be made:

- You do not detail how to run the automated tests in the instructions/README.
- One of the industry leaders in running automated tests is pytest, and if a naive user were to attempt to run your test suite using it, the naming conventions you have in place for the test folder and files would inhibit test discovery:
  - Consider renaming test/ to tests/ and test.py to recan_test.py or basic_test.py
- Your tests require the user to run tests from the test/ folder itself. Running the automated tests from the top-level directory causes tests to fail. If this behavior is desired, please detail it in the instructions for running the tests.

Lastly, with regard to test coverage:

$ pytest --cov=recan .
================================================= test session starts ==================================================
platform linux -- Python 3.6.7, pytest-5.3.2, py-1.8.1, pluggy-0.12.0
rootdir: /.../recan/recan
plugins: cov-2.8.1
collected 13 items

recan_test.py .............                                                                                      [100%]
=================================================== warnings summary ===================================================
/.../miniconda3/envs/recan/lib/python3.6/site-packages/Bio/Alphabet/__init__.py:26
  /.../miniconda3/envs/recan/lib/python3.6/site-packages/Bio/Alphabet/__init__.py:26: PendingDeprecationWarning:

  We intend to remove or replace Bio.Alphabet in 2020, ideally avoid using it explicitly in your code. Please get in touch if you will be adversely affected by this. https://github.com/biopython/biopython/issues/2046

-- Docs: https://docs.pytest.org/en/latest/warnings.html

----------- coverage: platform linux, python 3.6.7-final-0 -----------
Name                                  Stmts   Miss  Cover
---------------------------------------------------------
/.../recan/recan/recan/__init__.py       0      0   100%
/.../recan/recan/recan/simgen.py       147     48    67%
---------------------------------------------------------
TOTAL                                   147     48    67%

============================================ 13 passed, 1 warning in 3.23s =============================================
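
On the point about tests failing when run from the top-level directory: a common cause is test data being opened with paths relative to the current working directory. Assuming that is what happens here (the test file itself is not shown), a small helper like the hypothetical `data_path` below would make the suite independent of where pytest is invoked.

```python
from pathlib import Path


def data_path(test_file, name):
    """Resolve a data file relative to the test module, not the current directory."""
    return Path(test_file).resolve().parent / name


# Inside a test module this would be called as, for example:
#     fasta = data_path(__file__, "example.fasta")
# where "example.fasta" is a placeholder file name.
```

Because the path is anchored to the test module's own location, the same tests would pass whether they are started from test/ or from the repository root.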

JoSS Review: State of the field

In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.

While recan allows the plotting of genetic distances (much like RDP4 and RAT), and it does so interactively and ad hoc, it forgoes the ability to detect potential recombination events. This is one of the primary reasons others use those programs.

Are there any future plans to implement potential recombination event detection into recan?

JoSS Review: README Documentation

In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.

While your GitHub repo contains a README file, this file is not included in your package on PyPI. As it stands, any person looking for your package through that venue will not be able to view your package details unless they were to navigate to the repo itself. Consider packaging your README with your PyPI package: https://packaging.python.org/guides/making-a-pypi-friendly-readme/
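
As a rough sketch of what the linked guide describes, the README can be read in `setup.py` and passed to `setup()` through the `long_description` and `long_description_content_type` arguments. The helper name `readme_metadata` is hypothetical, and the file name `README.md` and the Markdown content type are assumptions about this repository.

```python
from pathlib import Path


def readme_metadata(readme_path="README.md"):
    """Build the setup() keyword arguments that carry the README to PyPI."""
    text = Path(readme_path).read_text(encoding="utf-8")
    return {
        "long_description": text,
        # Assumption: the README is written in Markdown.
        "long_description_content_type": "text/markdown",
    }


# In setup.py this would be merged into the existing call, for example:
#     setup(name="recan", **readme_metadata())
# with the package's other arguments left as they are.
```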

Copyright statement

Hi,

I'm trying to package this for Debian, and I'm now writing the copyright information.

recan/LICENCE.txt

Line 1 in 45c5a61

I found the copyright statement in LICENCE.txt, which seems incorrect. (It says "The Python Packaging Authority (PyPA)"; I suppose this is a template or placeholder?)

Could you please let me know whether this statement is correct? If not, could you please fix it?

Thank you~
