babinyurii / recan
Genetic distance plotting for recombination events analysis
License: MIT License
In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.
As the community migrates/upgrades to using Jupyter Lab, you may want to consider including instructions on how to install the Jupyter Lab Plotly renderer: https://plot.ly/python/getting-started/#jupyterlab-support-python-35 to meet your package's visualization needs.
Describe the bug
When using the save_data method I'm getting an error saying TypeError: 'NoneType' object is not subscriptable
Full error:
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    sim_obj.save_data(out="csv", out_name="test_data")
  File "/home/jeremy/.local/lib/python3.8/site-packages/recan/simgen.py", line 293, in save_data
    df = pd.DataFrame(data=self._distance, index=self._ticks[1:]).T
TypeError: 'NoneType' object is not subscriptable
To Reproduce
from recan.simgen import Simgen
sim_obj = Simgen("test.fasta")
sim_obj.save_data(out="csv", out_name="test_data")
Expected behavior
CSV file containing plot data is written
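One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis, is that save_data was called before any distances had been computed, so _ticks was still None. The sketch below uses a hypothetical stand-in class (PlotData, with a hypothetical compute method; neither is part of recan) to show how a guard could replace the opaque TypeError with an explicit message:

```python
import csv
import io


class PlotData:
    """Hypothetical stand-in for an object that computes, then saves, distances."""

    def __init__(self):
        # Nothing has been computed yet, mirroring the None seen in the traceback.
        self._distance = None
        self._ticks = None

    def compute(self):
        # Placeholder values; a real run would derive these from an alignment.
        self._ticks = [0, 100, 200]
        self._distance = {"seq_2": [0.01, 0.02]}

    def save_data(self, handle):
        # Fail early with a readable message instead of a TypeError.
        if self._distance is None or self._ticks is None:
            raise RuntimeError("no distance data yet: run the analysis before saving")
        writer = csv.writer(handle)
        writer.writerow(["sequence"] + self._ticks[1:])
        for name, values in self._distance.items():
            writer.writerow([name] + values)
```

If the assumption holds, running the analysis step first (the k2p report further down this page calls sim_obj.simgen(window=200, shift=20, pot_rec=1, dist='k2p')) before save_data may be enough to avoid the error.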
Hello,
Your tool seems very interesting, and I would like to evaluate it in R. I saw your protocol:
https://www.protocols.io/view/recan-r-based-tool-for-detection-of-recombination-dm6gpwd1plzp/v2
but I can't find how to install it (I'm using a 2022 MacBook Pro).
Thank you
Describe the bug
When we added two distinct virus sequences, we got an error saying that the sequences must all be the same length.
Here is our code:
from recan.simgen import Simgen
sim_obj = Simgen("./virus_genome/all_virus_genome.fasta")
Here is the error we received:
~/anaconda3/lib/python3.7/site-packages/Bio/Align/__init__.py in _append(self, record, expected_length)
589 # raise ValueError("New sequence is not of length %i"
590 # % self.get_alignment_length())
--> 591 raise ValueError("Sequences must all be the same length")
592
593 # Using not self.alphabet.contains(record.seq.alphabet) needs fixing
ValueError: Sequences must all be the same length
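Biopython raises this error when the records in the file differ in length, which usually means the sequences have not been aligned yet. As a minimal sketch (the function names read_fasta_lengths and odd_length_records are hypothetical helpers, not part of recan or Biopython), the lengths can be checked with plain Python before the file is passed to Simgen:

```python
from collections import Counter


def read_fasta_lengths(path):
    """Return {record_id: sequence_length} for a FASTA file."""
    lengths = {}
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # The record id is the first word of the header ("" if the header is empty).
                current = line[1:].split()[0] if len(line) > 1 else ""
                lengths[current] = 0
            elif current is not None:
                # Sequence lines may be wrapped, so add up every line of the record.
                lengths[current] += len(line)
    return lengths


def odd_length_records(lengths):
    """Return the records whose length differs from the most common length."""
    if not lengths:
        return {}
    expected, _ = Counter(lengths.values()).most_common(1)[0]
    return {name: n for name, n in lengths.items() if n != expected}
```

Any record reported by odd_length_records would need to be aligned with the others (gaps included) so that every row of the alignment has the same number of columns.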
In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.
In your paper, you state that one of the advantages of recan over other methods of recombination detection (e.g. RAT and RDP4) is speed. However, I do not think the speed of recan should be compared to that of those programs, because the recombination auto-search function (which consumes much of the run time and is the main focus of RAT and RDP4) has not been implemented in recan.
As it stands, simgen only allows pdist and k2p as its distance formula, and the program doesn't allow the user to define their own, such as a Jukes-Cantor or Tajima-Nei distance.
Would it be possible to allow simgen to take a function as an argument for its dist= kwarg?
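To make the request concrete, here is a rough sketch of what a user-supplied distance could look like. It assumes a callable that takes two equal-length sequence strings and returns a float; that signature, and the helper windowed_distances, are illustrative assumptions and not recan's current API. The Jukes-Cantor correction used is d = -3/4 ln(1 - 4p/3), where p is the proportion of differing sites.

```python
import math


def p_distance(seq1, seq2):
    """Proportion of differing sites among positions where both bases are A, C, G or T."""
    valid = "ACGT"
    compared = 0
    diffs = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                diffs += 1
    return diffs / compared if compared else float("nan")


def jukes_cantor(seq1, seq2):
    """Jukes-Cantor distance; NaN when p is undefined or saturated (p >= 0.75)."""
    p = p_distance(seq1, seq2)
    if math.isnan(p) or p >= 0.75:
        return float("nan")
    return -0.75 * math.log(1 - 4 * p / 3)


def windowed_distances(seq1, seq2, window, shift, dist=p_distance):
    """Apply any (seq1, seq2) -> float callable to successive sliding windows."""
    return [
        dist(seq1[i:i + window], seq2[i:i + window])
        for i in range(0, len(seq1) - window + 1, shift)
    ]
```

With this shape, switching formulas is a matter of passing dist=jukes_cantor instead of the default, and a Tajima-Nei function could be dropped in the same way.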
In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.
In the spirit of Open Source, programs submitted to JoSS should iterate on the concept of developing a Community of Practice. Therefore, most submissions should contain a set of Community Guidelines. These guidelines should detail how to contribute, how to report issues, and how to seek support.
These can easily be detailed in a CONTRIBUTING.md file at the top level of the package. Also consider setting up Issue/PR templates for your repository.
Hi developer,
I'm trying to run recan to detect a potential recombination event. However, after I changed the distance to dist='k2p', the program crashed, although the default pdist runs normally. Could you please help me figure it out?
Here is the running output:
Traceback (most recent call last):
  File "test_recan.py", line 6, in <module>
    sim_obj.simgen(window=200, shift=20, pot_rec=1, dist='k2p')
  File "/home/zjl/anaconda3/lib/python3.7/site-packages/recan/simgen.py", line 241, in simgen
    self._distance = self._move_window(window, pot_rec, shift, dist)
  File "/home/zjl/anaconda3/lib/python3.7/site-packages/recan/simgen.py", line 117, in _move_window
    distance = self._K2Pdistance(seq1, seq2)
  File "/home/zjl/anaconda3/lib/python3.7/site-packages/recan/simgen.py", line 153, in _K2Pdistance
    p = float(ts_count) / length
ZeroDivisionError: float division by zero
My MSA sequences contain some N bases and degenerate bases; is this the reason? How should I prepare the input alignment? Thanks.
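The failing line, p = float(ts_count) / length, suggests that length was zero for at least one window, which could happen if a window contains no sites that count as comparable. How recan counts those sites is not shown here, so the following is only a standalone sketch of a Kimura 2-parameter distance, not recan's implementation. It skips any site where either base is not A, C, G or T and returns NaN instead of dividing by zero. The formula is d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), with P the proportion of transitions and Q the proportion of transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance over sites where both bases are A, C, G or T.

    Returns NaN when no site is comparable or the formula is undefined.
    """
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        a_known = a in PURINES or a in PYRIMIDINES
        b_known = b in PURINES or b in PYRIMIDINES
        if not (a_known and b_known):
            # Skip N, gaps and degenerate codes rather than counting them.
            continue
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        return float("nan")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return float("nan")
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

Returning NaN keeps a window with no usable sites visible as a gap in the output rather than aborting the whole run; trimming or masking heavily ambiguous regions of the alignment beforehand is another option.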
In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.
While I appreciate the wealth of examples that were added to the documentation, none of them thoroughly explains what is being visualized. A layperson, or an early-career scientist in the field, may look at these plots and not understand where the events are occurring and what they look like. Consider adding brief annotations to the screenshots explaining what and where these events are.
In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.
My only comment here is that the paper does not state a clear problem that recan solves. I do see that it is meant for researchers who use Python and that it has some advantages over RDP4 and RAT. However, a concise statement of need seems absent.
In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.
This is potentially the most difficult issue to bring up. While the implementation of recan is tremendous work, the readability of this submission could use significant revisions.
I planned on making a series of revision notes to pass on to the author but, after looking at the list, I will refrain as I know what it feels like to have a giant list of revisions to come from a reviewer.
Generally speaking, this manuscript would benefit from improvements in grammar and sentence structure.
In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.
Regarding automated testing, I appreciate that you already have some in place. However, there are some points to be made:
Consider renaming test/ to tests/ and test.py to recan_test.py or basic_test.py.
The tests currently pass only when run from inside the test/ folder itself. Running the automated tests from the top-level directory causes tests to fail. If this behavior is desired, please detail it in the instructions for running the tests.
$ pytest --cov=recan .
================================================= test session starts ==================================================
platform linux -- Python 3.6.7, pytest-5.3.2, py-1.8.1, pluggy-0.12.0
rootdir: /.../recan/recan
plugins: cov-2.8.1
collected 13 items

recan_test.py .............                                                                                      [100%]

=================================================== warnings summary ===================================================
/.../miniconda3/envs/recan/lib/python3.6/site-packages/Bio/Alphabet/__init__.py:26
  /.../miniconda3/envs/recan/lib/python3.6/site-packages/Bio/Alphabet/__init__.py:26: PendingDeprecationWarning:
  We intend to remove or replace Bio.Alphabet in 2020, ideally avoid using it explicitly in your code. Please get in touch if you will be adversely affected by this. https://github.com/biopython/biopython/issues/2046

-- Docs: https://docs.pytest.org/en/latest/warnings.html

----------- coverage: platform linux, python 3.6.7-final-0 -----------
Name                                 Stmts   Miss  Cover
---------------------------------------------------------
/.../recan/recan/recan/__init__.py       0      0   100%
/.../recan/recan/recan/simgen.py       147     48    67%
---------------------------------------------------------
TOTAL                                  147     48    67%

============================================ 13 passed, 1 warning in 3.23s =============================================
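The report does not say why the tests fail from the top-level directory. One common cause is test data being opened through paths relative to the current working directory. Assuming that is the case here (an unverified guess), a small helper like the hypothetical data_path below resolves data files relative to the test file itself, so the tests behave the same from any directory:

```python
from pathlib import Path


def data_path(test_file, name):
    """Return the path of a data file that sits next to the given test file."""
    # Resolve against the test file's own folder, not the current working directory.
    return Path(test_file).resolve().parent / name


# Inside a test module this could be used as (the file name is illustrative only):
# alignment = data_path(__file__, "example_alignment.fasta")
```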
In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.
While recan allows the plotting of genetic distances (much like RDP4 and RAT), and does so interactively and ad hoc, it forgoes the ability to detect potential recombination events. This is one of the primary reasons others use those programs.
Are there any future plans to implement potential recombination event detection in recan?
In conjunction with the review of your package to JoSS (available at openjournals/joss-reviews#2014), here is an issue for you to address for your submission.
While your GitHub repo contains a README file, this file is not included in your package on PyPI. As it stands, any person looking for your package through that venue will not be able to view your package details unless they were to navigate to the repo itself. Consider packaging your README with your PyPI package: https://packaging.python.org/guides/making-a-pypi-friendly-readme/
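As a minimal sketch of what the linked guide describes, the README text can be passed to setuptools through the long_description and long_description_content_type arguments of setup(). The helper name readme_metadata is hypothetical, and the sketch assumes the README is a Markdown file named README.md next to setup.py:

```python
from pathlib import Path


def readme_metadata(readme_path="README.md"):
    """Build the setup() keyword arguments that carry the README to PyPI."""
    text = Path(readme_path).read_text(encoding="utf-8")
    return {
        "long_description": text,
        # Assumes a Markdown README; a reStructuredText file would use "text/x-rst".
        "long_description_content_type": "text/markdown",
    }


# In setup.py this could be used as (all other arguments omitted):
# from setuptools import setup
# setup(name="recan", **readme_metadata())
```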
Hi,
I'm trying to package this for Debian, and I'm now writing the copyright information.
Line 1 in 45c5a61
I found the copyright statement in LICENCE.txt, which seems incorrect. (It names The Python Packaging Authority (PyPA); I suppose that is a template or placeholder?)
Could you please let me know whether this statement is correct? If not, could you please fix it?
Thank you~