Short project - genome assembly
Author: Etienne JEAN
Date: 18 September 2018

This program is a de novo genome assembler. It uses Euler cycles in balanced De Bruijn graphs to reassemble a genome from a file of reads obtained by sequencing. See the /doc directory for more information.

It runs on Python 3. Only the basic libraries are needed.

The source code and Python executables are in the /src directory.
Some test data to run the program are in the /data directory.
Some results already generated are in the /results directory.

All scripts implement a help display. Use the -h option with any of them to show the help.

---------------------------------------------------------------------

Examples of commands to run the program. All of the following should be executed in the base directory.

I - Generation of a random genome

Random genome of size 10kb
    src/genome_generation.py 10000 > data/random_10kb.fasta

Random genome of size 1Mb, with 60% GC content
    src/genome_generation.py 1000000 --gc-content 0.6 > data/random_1Mb.fasta

II - Sequencing simulation of a genome

Sequencing with default parameters (read length 100bp, coverage 50)
    src/sequencing.py data/random_10kb.fasta > data/random_10kb_reads.fasta

Sequencing with read length 120bp and coverage 60
    src/sequencing.py data/random_1Mb.fasta -l 120 -c 60 > data/random_1Mb_reads.fasta

Sequencing the genome of Mycoplasma genitalium, read length 400bp, coverage 100
    src/sequencing.py -l 400 -c 100 data/NC_000908.2.fasta > data/NC_000908.2_reads.fasta

III - De novo assembly from a file of reads

De novo assembly of the genome of size 10kb, with k-mer length 30bp
    src/denovo_assembly.py 30 data/random_10kb_reads.fasta results/random_10kb_assembly.fasta > random_10kb_assembly.log

De novo assembly of the genome of size 1Mb, with k-mer length 60bp
    src/denovo_assembly.py 60 data/random_1Mb_reads.fasta results/random_1Mb_assembly.fasta > results/random_1Mb_assembly.log

De novo assembly of the genome of Mycoplasma genitalium, with k-mer length 250bp
    src/denovo_assembly.py 250 data/NC_000908.2_reads.fasta results/NC_000908.2_assembly.fasta > results/NC_000908.2_assembly.log

IV - Comparison of the assembly with the reference genome

Test between reference and assembly for the genome of size 10kb
    src/test_assembly.py data/random_10kb.fasta results/random_10kb_assembly.fasta

Test between reference and assembly for the genome of size 1Mb
    src/test_assembly.py data/random_1Mb.fasta results/random_1Mb_assembly.fasta

Test between reference and assembly for the genome of Mycoplasma genitalium
    src/test_assembly.py data/NC_000908.2.fasta results/NC_000908.2_assembly.fasta
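
---------------------------------------------------------------------

Illustration - Euler cycle in a De Bruijn graph

The description above says the assembler relies on Euler cycles in balanced De Bruijn graphs. The sketch below shows the general idea on a toy example. It is not the code in src/denovo_assembly.py: the function names (build_de_bruijn, eulerian_cycle, assemble) are hypothetical, and it assumes a circular genome, error-free reads, and a k large enough that every k-mer occurs only once in the genome. The Euler cycle is found with Hierholzer's algorithm, which is one common choice and may differ from what the project uses.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Return a graph mapping each (k-1)-mer to the (k-1)-mers it points to.

    Every distinct k-mer seen in the reads contributes one edge
    prefix -> suffix. Duplicate k-mers are collapsed, which is only valid
    under the assumption that each k-mer occurs once in the genome.
    """
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):  # sorted only to make runs deterministic
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_cycle(graph):
    """Hierholzer's algorithm; assumes the graph is balanced and connected."""
    remaining = {node: list(targets) for node, targets in graph.items()}
    stack = [min(remaining)]  # arbitrary but deterministic start node
    cycle = []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            # Follow an unused edge out of the current node
            stack.append(remaining[node].pop())
        else:
            # No unused edge left: the node is final in the cycle
            cycle.append(stack.pop())
    cycle.reverse()
    return cycle  # the first and last nodes are the same


def assemble(reads, k):
    """Spell the circular sequence described by the Euler cycle."""
    cycle = eulerian_cycle(build_de_bruijn(reads, k))
    # Drop the repeated closing node, then take one base per node
    return "".join(node[-1] for node in cycle[:-1])


# Toy circular genome and overlapping error-free reads (illustrative data)
genome = "ATGGCGTGCA"
read_length = 6
wrapped = genome + genome[:read_length - 1]
reads = [wrapped[i:i + read_length] for i in range(len(genome))]

assembly = assemble(reads, k=4)
print(assembly)
```

Because the genome is treated as circular, the assembly is a rotation of the original sequence rather than an exact copy: it has the same length and appears as a substring of the genome written twice in a row.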