
gengraph's People

Contributors

grncam007, jambler24, phillipswanepoel

gengraph's Issues

Work on the ancestral genome function

This function was created to test two things: building a similarity matrix based on shared nodes, and creating a consensus genome from the nodes most often traversed through the graph. Node usage is weighted by the similarity matrix so that multiple closely related species are weighted down.

The similarity matrix is not being created properly, and the resultant trees are not a true representation of the phylogeny.
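As a point of comparison when debugging, a similarity matrix over shared nodes can be sketched as a Jaccard score between the sets of nodes each isolate passes through. This is a minimal sketch and not the project's implementation: the function name `shared_node_similarity`, the plain-dict graph representation, and the assumption that each node's `ids` value is a comma-separated string of isolate names are all hypothetical.

```python
from itertools import combinations

def shared_node_similarity(node_ids, isolates):
    """Jaccard similarity between isolates based on the nodes they share.

    node_ids maps a node name to a comma-separated string of isolate names
    (an assumed layout for the 'ids' node attribute).
    """
    # Collect the set of nodes each isolate passes through.
    nodes_per_isolate = {iso: set() for iso in isolates}
    for node, ids in node_ids.items():
        for iso in ids.split(','):
            if iso in nodes_per_isolate:
                nodes_per_isolate[iso].add(node)

    # Start with 1.0 on the diagonal: every isolate is identical to itself.
    matrix = {a: {b: 1.0 if a == b else 0.0 for b in isolates} for a in isolates}
    for a, b in combinations(isolates, 2):
        union = nodes_per_isolate[a] | nodes_per_isolate[b]
        shared = nodes_per_isolate[a] & nodes_per_isolate[b]
        score = len(shared) / len(union) if union else 0.0
        matrix[a][b] = matrix[b][a] = score
    return matrix

# Toy graph: node name -> isolates that pass through it (illustrative data).
example_nodes = {
    'n1': 'H37Rv,H37Rv1,H37Rv2',
    'n2': 'H37Rv,H37Rv1',
    'n3': 'H37Rv2',
}
similarity = shared_node_similarity(example_nodes, ['H37Rv', 'H37Rv1', 'H37Rv2'])
```

A matrix built this way is symmetric with ones on the diagonal, which gives two quick sanity checks to run against the output of the real function.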

Novice to python

I am having this problem. Can you help, please? Thank you.
devina@devina-HP-ProBook-450-G3:~/GenGraph$ python3 gengraphTool.py make_genome_graph --seq_file analysis.txt --out_file_name test --recreate_check
Traceback (most recent call last):
  File "gengraphTool.py", line 83, in <module>
    parsed_input_dict = parse_seq_file(args.seq_file)
  File "/home/devina/GenGraph/gengraph.py", line 548, in parse_seq_file
    A_seq_label_dict[a_seq_file['aln_name']] = a_seq_file['seq_name']
KeyError: 'seq_name'
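One plausible cause of this KeyError is a sequence file without the tab-separated header line (seq_name, aln_name, seq_path, annotation_path) that other reports on this page show. The check below is a sketch under that assumption; `missing_columns` is a hypothetical helper and not part of GenGraph, and the file path in the sample data is a placeholder.

```python
import csv
import io

REQUIRED_COLUMNS = ('seq_name', 'aln_name', 'seq_path', 'annotation_path')

def missing_columns(handle):
    """Return the required column names absent from a tab-separated header."""
    reader = csv.DictReader(handle, delimiter='\t')
    found = reader.fieldnames or []
    return [col for col in REQUIRED_COLUMNS if col not in found]

# A file whose first line is data rather than the header row.
no_header = io.StringIO('H37Rv\tseq0\t/path/to/H37Rv.fa\tNA\n')
# A file that starts with the expected header row.
with_header = io.StringIO(
    'seq_name\taln_name\tseq_path\tannotation_path\n'
    'H37Rv\tseq0\t/path/to/H37Rv.fa\tNA\n'
)
print(missing_columns(no_header))
print(missing_columns(with_header))
```

The same check also flags a header separated by spaces instead of tabs, because the whole line is then read as a single column name.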

Substrings of other isolate names as isolate names

If the name of one isolate is a substring of another isolate's name, an error results. For example, if one sequence is isolate "CDC1551" and another is "C", an error occurs during refine_initGraph. This is most likely due to the line
if isolate in data['ids']:
which should be replaced with a stricter check.
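A stricter check could compare whole names instead of substrings. The sketch below assumes that `data['ids']` is a comma-separated string of isolate names, which the substring behaviour suggests but the report does not confirm; `isolate_in_node` is a hypothetical helper.

```python
def isolate_in_node(isolate, ids):
    """Exact-name membership test against a comma-separated ids string."""
    return isolate in ids.split(',')

ids = 'CDC1551,H37Rv'
print('C' in ids)                  # substring test: True, a false positive
print(isolate_in_node('C', ids))   # exact-name test: False
```

Splitting once and storing the names as a set on each node would avoid repeating the split on every lookup.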

[Question] vg versus GenGraph

Hi!

I am quite new to this field and I struggle to understand the difference between vg and this library. Can you help me?

Thank you in advance and thank you for this project!

[BUG?] Error: progressiveMauve_call error: output of progressiveMauve empty

When I run this script:
python ./gengraphTool.py make_genome_graph --seq_file TestGraphs/sequences.txt --out_file_name test
sequences.txt:

seq_name	aln_name	seq_path	annotation_path
H37Rv	seq0	/Users/filippo/Desktop/workspace/GenGraph/TestGraphs/H37Rv.fa	NA
H37Rv1	seq1	/Users/filippo/Desktop/workspace/GenGraph/TestGraphs/H37Rv1.fa	Na
H37Rv2	seq2	/Users/filippo/Desktop/workspace/GenGraph/TestGraphs/H37Rv2.fa	N

I got the error:
progressiveMauve_call error: output of progressiveMauve empty

I fixed the error by changing line 2775 in gengraph.py:
old line: number_of_lines = 3
new line: number_of_lines = 2

But I'm not sure about the fix.

Thank you!

Gengraph running problem

Hi,
One question: why does this happen when I start running the program?

(base) devina@Devinas-MacBook-Pro ~ % python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/anagengraph.txt --out_file_name Documents/output
Conducting progressiveMauve
progressiveMauve

It got stuck.

I am using a Mac:
Processor: 2.7 GHz Intel Core i7
Memory: 16 GB
Input: two sequences, 4.5 MB each
Thank you for your precious help.
Devina

No such file or directory: 'globalAlignment_khush.backbone'

Hey, I am trying to run the example code in your repo (sequences.txt) with some modifications on my local system, but I am having this problem.

$ python3 ./gengraphTool.py make_genome_graph --seq_file sequences.txt --out_file_name khush --recreate_check
Running GenGraph Toolkit
Creating genome graph
[OrderedDict([('seq_name', 'H37Rv'), ('aln_name', 'seq0'), ('seq_path', '/home/noob/Documents/IIITD/tavlab/strainflow/GenGraph-master/TestGraphs/H37Rv.fa'), ('annotation_path', 'NA')]), OrderedDict([('seq_name', 'H37Rv1'), ('aln_name', 'seq1'), ('seq_path', '/home/noob/Documents/IIITD/tavlab/strainflow/GenGraph-master/TestGraphs/H37Rv1.fa'), ('annotation_path', 'Na')]), OrderedDict([('seq_name', 'H37Rv2'), ('aln_name', 'seq2'), ('seq_path', '/home/noob/Documents/IIITD/tavlab/strainflow/GenGraph-master/TestGraphs/H37Rv2.fa'), ('annotation_path', 'Na')])]
Conducting progressiveMauve
progressiveMauve Complete
Traceback (most recent call last):
  File "./gengraphTool.py", line 136, in <module>
    genome_aln_graph = bbone_to_initGraph(bbone_file, parsed_input_dict)
  File "/home/noob/Documents/IIITD/tavlab/strainflow/GenGraph-master/gengraph.py", line 1616, in bbone_to_initGraph
    backbone_lol = input_parser(bbone_file)
  File "/home/noob/Documents/IIITD/tavlab/strainflow/GenGraph-master/gengraph.py", line 1189, in input_parser
    in_file = open(file_path, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'globalAlignment_khush.backbone'

Can you please help me out?

Also, I have one more question: can I make a directed de Bruijn graph using this library?

Need help running the code.

Hello, I'm a student trying to create a graph similar to Figure 3 of the GenGraph paper. I've been trying to get the code to run for more than a week, and it is always error after error. The latest one is this:

FileNotFoundError: [WinError 2] The system cannot find the file specified

Full output:
C:\Users\eros1\anaconda3\Lib\site-packages\GenGraph>python ./gengraphTool.py make_genome_graph --seq_file C:\Users\eros1\OneDrive\documents\Summer2022_Genome\E.coli_tab.txt --out_file_name test

Conducting progressiveMauve
({'seq0': 'K-12', 'seq1': 'Nissle-1917', 'seq2': 'O157:H7'}, {'K-12': 'C:/Users/eros1/OneDrive/Documents/Summer2022_Genome/E. Coli k-12.fasta', 'Nissle-1917': 'C:/Users/eros1/OneDrive/Documents/Summer2022_Genome/E. Coli Nissle 1917.fasta', 'O157:H7': 'C:/Users/eros1/OneDrive/Documents/Summer2022_Genome/E. Coli O157H7.fasta'}, ['C:/Users/eros1/OneDrive/Documents/Summer2022_Genome/E. Coli k-12.fasta', 'C:/Users/eros1/OneDrive/Documents/Summer2022_Genome/E. Coli Nissle 1917.fasta', 'C:/Users/eros1/OneDrive/Documents/Summer2022_Genome/E. Coli O157H7.fasta'], {'K-12': 'NA', 'Nissle-1917': 'NA', 'O157:H7': 'NA'})

Traceback (most recent call last):
  File "./gengraphTool.py", line 87, in <module>
    progressiveMauve_alignment(parsed_input_dict[2], args.out_file_name)
  File "C:\Users\eros1\anaconda3\Lib\site-packages\GenGraph\gengraph.py", line 1949, in progressiveMauve_alignment
    return call(progressiveMauve_call, stdout=open(os.devnull, 'wb'))
  File "C:\Users\eros1\anaconda3\lib\subprocess.py", line 340, in call
    with Popen(*popenargs, **kwargs) as p:
  File "C:\Users\eros1\anaconda3\lib\subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\eros1\anaconda3\lib\subprocess.py", line 1311, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
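When subprocess raises this error from CreateProcess, it usually means the program being launched could not be located, not that an input file is missing. A quick way to test this is to ask Python whether the executable is on PATH. This is a sketch: it assumes the aligner is invoked under the name `progressiveMauve`, as the console output suggests, and `find_executable` is a hypothetical helper.

```python
import shutil

def find_executable(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)

# Assumed executable name; adjust if the aligner is installed under another name.
mauve_path = find_executable('progressiveMauve')
if mauve_path is None:
    print('progressiveMauve was not found on PATH')
else:
    print('progressiveMauve found at', mauve_path)
```

If the lookup returns None, adding the directory that contains the progressiveMauve executable to PATH is the first thing to try.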

Running problem

Hi,
Issue: Exception FileNotOpened thrown from
Unknown() in gnFileSource.cpp 67
Called by Unknown()
Traceback (most recent call last):
  File "./GenGraph/gengraphTool.py", line 102, in <module>
    genome_aln_graph = bbone_to_initGraph(bbone_file, parsed_input_dict)
  File "/GenGraph/gengraph.py", line 830, in bbone_to_initGraph
    iso_length = len(input_parser(input_dict[1][iso])[0]['DNA_seq'])
TypeError: 'NoneType' object is not subscriptable

Specification: Docker Toolbox on Windows 10

deepcopy

In the fasta_alignment_to_subnet() function, there is a
copy.deepcopy(true_start)
call that, according to profiling, takes far too long. A suggested solution is to use
g = cPickle.loads(cPickle.dumps(a, -1))
as suggested here:
https://stackoverflow.com/questions/24756712/deepcopy-is-extremely-slow
I will try this first, but otherwise the whole fasta_alignment_to_subnet() function could do with improvement.
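In Python 3 the cPickle module no longer exists as a separate import; the standard pickle module uses the C implementation when it is available. A minimal sketch of the same round-trip copy follows. `fast_deepcopy` is a hypothetical helper and the sample dictionary is illustrative only. The approach works only for picklable objects, and whether it beats copy.deepcopy for the real `true_start` object would need to be confirmed by profiling.

```python
import pickle

def fast_deepcopy(obj):
    """Copy an object by serialising it with pickle and loading it back."""
    return pickle.loads(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))

# Illustrative nested structure, not the real graph object.
original = {'node': {'ids': 'H37Rv,H37Rv1', 'sequence': 'ACGT'}, 'order': [1, 2, 3]}
duplicate = fast_deepcopy(original)

# Changing the copy leaves the original untouched.
duplicate['order'].append(4)
```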
