
opengene.jl's Introduction

This project is no longer maintained

OpenGene

The OpenGene.jl project aims to provide basic functions and rich utilities for analyzing sequencing data, with the beautiful language Julia.

If you want to be an author of OpenGene, please open an issue, or make a pull request.

If you are looking for BAM/SAM read/write, see OpenGene/HTSLIB.
For bug reports and feature requests, please file an issue.

Julia

Julia is a fresh programming language with C/C++-like performance and Python-like ease of use.
On Ubuntu, you can install Julia with sudo apt-get install julia, then type julia to open the Julia interactive prompt. Details on installing Julia are in the platform-specific instructions.

Add OpenGene

# run on Julia REPL
Pkg.add("OpenGene")

If you want the latest development version of OpenGene (not for beginners):

Pkg.checkout("OpenGene")

This project is under active development; remember to update it to get the newest features:

Pkg.update()

Examples

sequence operation

julia> using OpenGene

julia> seq = dna("AAATTTCCCGGGATCGATCGATCG")
dna:AAATTTCCCGGGATCGATCGATCG
# reverse complement operator
julia> ~seq
dna:CGATCGATCGATCCCGGGAAATTT
# transcription: note that seq is treated as the coding sequence, not the template sequence,
# so this operation only changes T to U
julia> transcribe(seq)
rna:AAAUUUCCCGGGAUCGAUCGAUCG
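
The two operations above can be sketched in plain Julia without OpenGene. This is only an illustration, assuming a recent Julia (1.x) and sequences stored as ordinary strings. The names COMPLEMENT, revcomp and coding_to_rna are hypothetical and are not part of the package.

```julia
# Hypothetical helpers for illustration; these are not OpenGene functions.
const COMPLEMENT = Dict('A' => 'T', 'T' => 'A', 'C' => 'G', 'G' => 'C', 'N' => 'N')

# Reverse the sequence, then complement each base.
revcomp(seq::AbstractString) = join(COMPLEMENT[b] for b in reverse(uppercase(seq)))

# Treat the input as the coding strand: only T is replaced by U.
coding_to_rna(seq::AbstractString) = replace(uppercase(seq), 'T' => 'U')

seq = "AAATTTCCCGGGATCGATCGATCG"
println(revcomp(seq))        # CGATCGATCGATCCCGGGAAATTT
println(coding_to_rna(seq))  # AAAUUUCCCGGGAUCGAUCGAUCG
```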

read/write a single fastq/fasta file

using OpenGene

istream = fastq_open("input.fastq.gz")
ostream = fastq_open("output.fastq.gz","w")

# fastq_read returns a FastqRead object {name, sequence, strand, quality}
# fastq_write writes a FastqRead to an output stream
while (fq = fastq_read(istream))!=false
    fastq_write(ostream, fq)
end

close(ostream)

FASTA is supported similarly via fasta_open, fasta_read and fasta_write.
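
For readers unfamiliar with the format, the read/write loop above can be mimicked in plain Julia on uncompressed text. This is a minimal sketch assuming four-line FASTQ records and Julia 1.x. FastqRecord, read_record, write_record and copy_records are hypothetical names, not the OpenGene API, and gzip handling is not shown.

```julia
# Hypothetical record type mirroring the four fields named above.
struct FastqRecord
    name::String
    sequence::String
    strand::String
    quality::String
end

# Read one four-line record; return nothing at end of input.
function read_record(io::IO)
    eof(io) && return nothing
    name = readline(io)
    sequence = readline(io)
    strand = readline(io)
    quality = readline(io)
    startswith(name, "@") || error("FASTQ name line must start with '@'")
    length(sequence) == length(quality) || error("sequence/quality length mismatch")
    return FastqRecord(name, sequence, strand, quality)
end

# Write one record back out as four lines.
write_record(io::IO, r::FastqRecord) =
    print(io, r.name, '\n', r.sequence, '\n', r.strand, '\n', r.quality, '\n')

# Copy every record from input to output and return the record count.
function copy_records(input::IO, output::IO)
    n = 0
    while (rec = read_record(input)) !== nothing
        write_record(output, rec)
        n += 1
    end
    return n
end

text = "@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nHHHH\n"
output = IOBuffer()
n = copy_records(IOBuffer(text), output)
copied = String(take!(output))
println(n, " records copied; identical: ", copied == text)
```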

read/write a pair of fastq files

using OpenGene

istream = fastq_open_pair("R1.fastq.gz", "R2.fastq.gz")
ostream = fastq_open_pair("Out.R1.fastq.gz","Out.R2.fastq.gz","w")

# fastq_read_pair will return a pair of FastqRead {read1, read2}
# fastq_write_pair can write this pair to two files
while (pair = fastq_read_pair(istream))!=false
    fastq_write_pair(ostream, pair)
end

close(ostream)

read/write a bed file

using OpenGene

# read all records, return an array of Intervals(chrom, chromstart, chromend)
intervals = bed_read_intervals("in.bed")
# write all records
bed_write_intervals("out.bed",intervals)
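
As a rough picture of what such a reader has to do, the sketch below parses the first three tab-separated BED columns into a record and formats it back. It assumes Julia 1.x. Interval, parse_bed_line and format_bed_line are hypothetical names that only echo the fields mentioned above and are not the package's own types.

```julia
# Hypothetical record with the three fields named in the comment above.
struct Interval
    chrom::String
    chromstart::Int
    chromend::Int
end

# Parse one BED line; extra columns beyond the first three are ignored here.
function parse_bed_line(line::AbstractString)
    fields = split(line, '\t')
    length(fields) >= 3 || error("BED line needs at least 3 tab-separated fields")
    return Interval(String(fields[1]), parse(Int, fields[2]), parse(Int, fields[3]))
end

# Format an interval back into a three-column BED line.
format_bed_line(iv::Interval) = join((iv.chrom, iv.chromstart, iv.chromend), '\t')

iv = parse_bed_line("chr1\t100\t200\tsome_name")
println(format_bed_line(iv))
```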

read/write a VCF

using OpenGene

# load the entire VCF data into a vcf object, which has a .header field and a .data field
vcfobj = vcf_read("in.vcf")
# write the vcf object into a file
vcf_write("out.vcf", vcfobj)

VCF Operations

using OpenGene

v1 = vcf_read("v1.vcf")
v2 = vcf_read("v2.vcf")

# merge by positions
v_merge = v1 + v2

# intersect by positions
v_intersect = v1 * v2

# remove v2 records from v1, by positions
v_minus = v1 - v2
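
The "by positions" semantics can be illustrated with ordinary dictionaries keyed by (chromosome, position). This is a sketch under stated assumptions, not the package's implementation: Records and the three functions are hypothetical, and keeping the first argument's record when a position is shared is an arbitrary choice made for this example.

```julia
# Assumption: each record is keyed by (chromosome, position); the value stands
# in for the rest of the VCF line. `Records` is a hypothetical alias.
const Records = Dict{Tuple{String,Int},String}

# Union of positions. On a shared position the record from `a` is kept
# (an arbitrary choice for this sketch).
merge_by_pos(a::Records, b::Records) = merge(b, a)

# Records of `a` whose position also occurs in `b`.
intersect_by_pos(a::Records, b::Records) = filter(p -> haskey(b, p.first), a)

# Records of `a` whose position does not occur in `b`.
minus_by_pos(a::Records, b::Records) = filter(p -> !haskey(b, p.first), a)

v1 = Records(("chr1", 100) => "A>T", ("chr1", 200) => "G>C")
v2 = Records(("chr1", 200) => "G>A", ("chr2", 50) => "C>T")

println(sort(collect(keys(merge_by_pos(v1, v2)))))
println(collect(keys(intersect_by_pos(v1, v2))))
println(collect(keys(minus_by_pos(v1, v2))))
```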

read/write a GTF

using OpenGene

# load the gtf header and data
gtfobj = gtf_read("in.gtf")

# write the gtf object into a file
gtf_write("out.gtf", gtfobj)

# if the file is too big, use the following to load the header only
gtfobj, stream = gtf_read("in.gtf", loaddata = false)
while (row = gtf_read_row(stream)) != false
    # do something with row ...
end

locate the gene/exon/intron

using OpenGene, OpenGene.Reference

# load the gencode dataset, it will download a file from gencode website if it's not downloaded before
# once it's loaded, it will be cached so future loads will be fast
index = gencode_load("GRCh37")

# locate which gene chr:pos is in
gencode_locate(index, "chr5", 149526621)
# it will return
# 1-element Array{Any,1}:
#  Dict{ASCIIString,Any}("gene"=>"PDGFRB","number"=>1,"transcript"=>"ENST00000261799.4","type"=>"intron")
genes = gencode_genes(index, "TP53")
# return an array with only one record
genes[1].name, genes[1].chr, genes[1].start_pos, genes[1].end_pos
# ("TP53","chr17",7565097,7590856)
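
Conceptually, locating a position means finding the annotated features whose interval contains it. The sketch below shows that idea with a linear scan over made-up toy features. Feature and locate are hypothetical names, the coordinates are invented for illustration, and a real index would likely use a faster lookup structure.

```julia
# Hypothetical annotation record; the coordinates below are toy values.
struct Feature
    gene::String
    kind::String      # e.g. "exon" or "intron"
    chrom::String
    start_pos::Int
    end_pos::Int
end

# Return every feature on `chrom` whose closed interval contains `pos`.
locate(features, chrom, pos) =
    [f for f in features if f.chrom == chrom && f.start_pos <= pos <= f.end_pos]

features = [
    Feature("GENE_A", "exon",   "chr1", 100, 200),
    Feature("GENE_A", "intron", "chr1", 201, 400),
    Feature("GENE_B", "exon",   "chr2", 100, 300),
]

hits = locate(features, "chr1", 250)
println([(f.gene, f.kind) for f in hits])
```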

access assembly (hg19/hg38)

julia> using OpenGene

julia> using OpenGene.Reference

julia> hg19 = load_assembly("hg19")
# Dict{ASCIIString,OpenGene.FastaRead} with 93 entries:

julia> hg19["chr17"]
# >chr17
# dna:AAGCTTCTCACCCTGTTCCTGCATAGATAATTGCATGACA......agggtgtgggtgtgggtgtgggtgtgggtgtggtgtgtgggtgtgggtgtgGT

julia> hg19["chr17"].sequence[1:100]
# dna:AAGCTTCTCACCCTGTTCCTGCATAGATAATTGCATGACAATTGCCTTGTCCCTGCTGAATGTGCTCTGGGGTCTCTGGGGTCTCACCCACGACCAACTC

merge a pair of reads from paired-end sequencing

julia> using OpenGene, OpenGene.Algorithm

julia> r1=dna("TTTAGGCCTGTCACTGTGAACGCTATCAGCAAGCCTTTGCATGATTTTTCTCTTTCCCACTCCTACATTCTCGGTGATGACAACAACTGTAGCCTGATCCAGATATTTCGAAGTGCAACAAATCGTATTCAATATAGAGTAAGG")
dna:TTTAGGCCTGTCACTGTGAACGCTATCAGCAAGCCTTTGCATGATTTTTCTCTTTCCCACTCCTACATTCTCGGTGATGACAACAACTGTAGCCTGATCCAGATATTTCGAAGTGCAACAAATCGTATTCAATATAGAGTAAGG

julia> r2=dna("GTTAGCTATTACTGTAATCACCGCGAGACAAGTTAATGAGAGAGTTATTCATAAAACTTACTCTATATTGAATACGATTTGTAGCACATCGAAATATCTGGATCAGGCTACAGTTGTAGTCATCACCGAGAATGTAGGAGTGG")
dna:GTTAGCTATTACTGTAATCACCGCGAGACAAGTTAATGAGAGAGTTATTCATAAAACTTACTCTATATTGAATACGATTTGTAGCACATCGAAATATCTGGATCAGGCTACAGTTGTAGTCATCACCGAGAATGTAGGAGTGG

julia> offset, overlap_len, distance = overlap(r1, r2)
(56,88,4)

julia> merged = simple_merge(r1, r2, overlap_len)
dna:TTTAGGCCTGTCACTGTGAACGCTATCAGCAAGCCTTTGCATGATTTTTCTCTTTCCCACTCCTACATTCTCGGTGATGACAACAACTGTAGCCTGATCCAGATATTTCGAAGTGCAACAAATCGTATTCAATATAGAGTAAGGTTTATGAATAACTCTCTCATTAACTTGTCTCGCGGTGATTACAGTAATAGCTAAC
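
One simple way to picture the overlap step: reverse-complement read 2, slide it along read 1, and accept the first offset where the overlapping region has few enough mismatches. The sketch below implements that naive scan in plain Julia 1.x. It is not the algorithm used by OpenGene.Algorithm; find_overlap, merge_pair, min_overlap and max_mismatch are hypothetical names and defaults chosen for this example.

```julia
# Hypothetical helpers; a naive scan, not OpenGene's algorithm.
const COMP = Dict('A' => 'T', 'T' => 'A', 'C' => 'G', 'G' => 'C', 'N' => 'N')
revcomp(seq::AbstractString) = join(COMP[b] for b in reverse(seq))

# Number of mismatching positions between two equal-length strings.
hamming(a, b) = count(x != y for (x, y) in zip(a, b))

# Return (offset, overlap_len, distance) for the first acceptable offset,
# or nothing if no offset satisfies the thresholds.
function find_overlap(r1::String, r2::String; min_overlap = 4, max_mismatch = 1)
    rc2 = revcomp(r2)
    for offset in 0:(length(r1) - min_overlap)
        overlap_len = min(length(r1) - offset, length(rc2))
        overlap_len >= min_overlap || break
        d = hamming(r1[offset+1:offset+overlap_len], rc2[1:overlap_len])
        d <= max_mismatch && return (offset, overlap_len, d)
    end
    return nothing
end

# Append the non-overlapping tail of the reverse-complemented read 2 to read 1.
function merge_pair(r1::String, r2::String; kwargs...)
    hit = find_overlap(r1, r2; kwargs...)
    hit === nothing && return nothing
    _, overlap_len, _ = hit
    return r1 * revcomp(r2)[overlap_len+1:end]
end

# Toy fragment ACGTTGCAAGGCTTAC: read 1 covers bases 1-12, and read 2 is the
# reverse complement of bases 5-16.
r1 = "ACGTTGCAAGGC"
r2 = "GTAAGCCTTGCA"
println(find_overlap(r1, r2))  # (4, 8, 0)
println(merge_pair(r1, r2))    # ACGTTGCAAGGCTTAC
```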

opengene.jl's People

Contributors

sfchen, staticfloat, tkelman, zhmz90

opengene.jl's Issues

Throwing multiple errors while testing OpenGene package in Julia

I loaded the package in Julia version 1.0.4:
(v1.0) pkg> add https://github.com/OpenGene/OpenGene.jl
and it throws multiple errors when running the tests:
(v1.0) pkg> test OpenGene
ERROR: LoadError: LoadError: LoadError: LoadError: syntax: extra token "Sequence" after end of expression
Stacktrace:
[1] include at ./boot.jl:317 [inlined]
[2] include_relative(::Module, ::String) at ./loading.jl:1044
[3] include at ./sysimg.jl:29 [inlined]
[4] include(::String) at /home/jagadeesh/.julia/packages/OpenGene/J7ubl/src/OpenGene.jl:1
[5] top-level scope at none:0
[6] include at ./boot.jl:317 [inlined]
[7] include_relative(::Module, ::String) at ./loading.jl:1044
[8] include at ./sysimg.jl:29 [inlined]
[9] include(::String) at /home/jagadeesh/.julia/packages/OpenGene/J7ubl/src/OpenGene.jl:1
[10] top-level scope at none:0
[11] include at ./boot.jl:317 [inlined]
[12] include_relative(::Module, ::String) at ./loading.jl:1044
[13] include at ./sysimg.jl:29 [inlined]
[14] include(::String) at /home/jagadeesh/.julia/packages/OpenGene/J7ubl/src/OpenGene.jl:1
[15] top-level scope at none:0
[16] include at ./boot.jl:317 [inlined]
[17] include_relative(::Module, ::String) at ./loading.jl:1044
[18] include(::Module, ::String) at ./sysimg.jl:29
[19] top-level scope at none:2
[20] eval at ./boot.jl:319 [inlined]
[21] eval(::Expr) at ./client.jl:393
[22] top-level scope at ./none:3
in expression starting at /home/jagadeesh/.julia/packages/OpenGene/J7ubl/src/Base/Types/sequence.jl:10
in expression starting at /home/jagadeesh/.julia/packages/OpenGene/J7ubl/src/Base/Types/Types.jl:74
in expression starting at /home/jagadeesh/.julia/packages/OpenGene/J7ubl/src/Base/Base.jl:5
in expression starting at /home/jagadeesh/.julia/packages/OpenGene/J7ubl/src/OpenGene.jl:15
ERROR: LoadError: Failed to precompile OpenGene [647f90c1-ef53-5e39-9970-8147cb8e69f9] to /home/jagadeesh/.julia/compiled/v1.0/OpenGene/qsi3c.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1203
[3] _require(::Base.PkgId) at ./loading.jl:960
[4] require(::Base.PkgId) at ./loading.jl:858
[5] require(::Module, ::Symbol) at ./loading.jl:853
[6] include at ./boot.jl:317 [inlined]
[7] include_relative(::Module, ::String) at ./loading.jl:1044
[8] include(::Module, ::String) at ./sysimg.jl:29
[9] include(::String) at ./client.jl:392
[10] top-level scope at none:0
in expression starting at /home/jagadeesh/.julia/packages/OpenGene/J7ubl/test/runtests.jl:1
ERROR: Package OpenGene errored during testing

Please help me solve this error. Thanks in advance.

Consider contributing to BioJulia

Hi, OpenGene!

OpenGene is heading towards covering much of the same territory as BioJulia. Rather than create a new package, would you consider contributing to BioJulia? We'd be very glad to have you, and it would avoid duplication of effort.

Install "OpenGene"

Hello,

I'm a beginner, and I'm starting to analyse NGS data (RNA-Seq, exomes, ChIP-Seq, ...).
Unfortunately, I can't install Julia's "OpenGene" package on my MacBook Pro.

I would like to know if there is another package or a particular procedure.
I thank you, and congratulate the whole community for the development of Julia.

Kind regards

homeo

SHA.sha1 Error

julia> using OpenGene

julia> using OpenGene.Reference

julia> hg19 = load_assembly("hg19")
INFO: checking SHA1...
ERROR: MethodError: lowercase has no method matching lowercase(::Array{UInt8,1})
in check_sha1 at /Users/fb/.julia/v0.4/OpenGene/src/Reference/Genome/downloader.jl:7
in download_assembly at /Users/fb/.julia/v0.4/OpenGene/src/Reference/Genome/downloader.jl:51
in load_assembly at /Users/fb/.julia/v0.4/OpenGene/src/Reference/Genome/reader.jl:11

Info about upcoming removal of packages in the General registry

As described in https://discourse.julialang.org/t/ann-plans-for-removing-packages-that-do-not-yet-support-1-0-from-the-general-registry/ we are planning on removing packages that do not support 1.0 from the General registry. This package has been detected to not support 1.0 and is thus slated to be removed. The removal of packages from the registry will happen approximately a month after this issue is open.

To transition to the new Pkg system using Project.toml, see https://github.com/JuliaRegistries/Registrator.jl#transitioning-from-require-to-projecttoml.
To then tag a new version of the package, see https://github.com/JuliaRegistries/Registrator.jl#via-the-github-app.

If you believe this package has erroneously been detected as not supporting 1.0 or have any other questions, don't hesitate to discuss it here or in the thread linked at the top of this post.

stuck on julia v0.6

ERROR: LoadError: LoadError: UndefVarError: readall not defined
Stacktrace:
[1] include_from_node1(::String) at .\loading.jl:569
[2] include(::String) at .\sysimg.jl:14
[3] include_from_node1(::String) at .\loading.jl:569
[4] eval(::Module, ::Any) at .\boot.jl:235
[5] _require(::Symbol) at .\loading.jl:483
[6] require(::Symbol) at .\loading.jl:398
while loading C:\Users\x'x'x.julia\v0.6\OpenGene\src\compat.jl, in expression starting on line 2
while loading C:\Users\x'x'x.julia\v0.6\OpenGene\src\OpenGene.jl, in expression starting on line 11

Unable to add OpenGene in julia 1.1.1

Dear all,
Thanks for your useful tools. I tried to Pkg.add OpenGene in Julia 1.1.1, but the terminal reported the following errors:
Updating registry at ~/.julia/registries/General
Updating git-repo https://github.com/JuliaRegistries/General.git
Resolving package versions...
ERROR: Unsatisfiable requirements detected for package OpenGene [63bcc5cb]:
OpenGene [63bcc5cb] log:
├─possible versions are: 0.1.0-0.1.11 or uninstalled
├─restricted to versions * by an explicit requirement, leaving only versions 0.1.0-0.1.11
└─restricted by julia compatibility requirements to versions: uninstalled — no versions left
Stacktrace:
[1] #propagate_constraints!#61(::Bool, ::Function, ::Pkg.GraphType.Graph, ::Set{Int64}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/GraphType.jl:1007
[2] propagate_constraints! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/GraphType.jl:948 [inlined]
[3] #simplify_graph!#121(::Bool, ::Function, ::Pkg.GraphType.Graph, ::Set{Int64}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/GraphType.jl:1462
[4] simplify_graph! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/GraphType.jl:1462 [inlined]
[5] resolve_versions!(::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}, ::Nothing) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:371
[6] resolve_versions! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:315 [inlined]
[7] #add_or_develop#63(::Array{Base.UUID,1}, ::Symbol, ::Function, ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:1172
[8] #add_or_develop at ./none:0 [inlined]
[9] #add_or_develop#17(::Symbol, ::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:59
[10] #add_or_develop at ./none:0 [inlined]
[11] #add_or_develop#16 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:36 [inlined]
[12] #add_or_develop at ./none:0 [inlined]
[13] #add_or_develop#13 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:34 [inlined]
[14] #add_or_develop at ./none:0 [inlined]
[15] #add_or_develop#12(::Base.Iterators.Pairs{Symbol,Symbol,Tuple{Symbol},NamedTuple{(:mode,),Tuple{Symbol}}}, ::Function, ::String) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:33
[16] #add_or_develop at ./none:0 [inlined]
[17] #add#22 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:64 [inlined]
[18] add(::String) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/API.jl:64
[19] top-level scope at none:0
So my question is whether OpenGene cannot be used with Julia versions higher than 0.6.
Best regards

LoadError: UndefVarError: readall not defined

My Julia version is 0.6.0.
After typing 'using OpenGene':
Stacktrace:
[1] include_from_node1(::String) at ./loading.jl:569
[2] include(::String) at ./sysimg.jl:14
[3] include_from_node1(::String) at ./loading.jl:569
[4] eval(::Module, ::Any) at ./boot.jl:235
[5] _require(::Symbol) at ./loading.jl:483
[6] require(::Symbol) at ./loading.jl:398
while loading /home/roy/juliapro/JuliaPro-0.6.0.1/JuliaPro/pkgs-0.6.0.1/v0.6/OpenGene/src/compat.jl, in expression starting on line 2
while loading /home/roy/juliapro/JuliaPro-0.6.0.1/JuliaPro/pkgs-0.6.0.1/v0.6/OpenGene/src/OpenGene.jl, in expression starting on line 11

Is it possible that readall has been replaced with readstring?
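
For context on the question above: as far as I recall, readall was deprecated in favour of readstring in Julia 0.5, and readstring was in turn replaced by read(io, String) in Julia 0.7 and later. The sketch below shows the current spelling only. read_text is a hypothetical helper name, not something defined in OpenGene's compat.jl.

```julia
# Hypothetical helper, not part of OpenGene: read a whole file as a String.
# Older spellings (from memory): readall(path), later readstring(path).
# Julia 0.7 and later:
read_text(path::AbstractString) = read(path, String)

# Write a small temporary file and read it back.
path, io = mktemp()
write(io, "ACGT\n")
close(io)
println(read_text(path) == "ACGT\n")
```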

Sam/bam IO

I will try to implement SAM and BAM file I/O. It may take a week or two.

don't override display

You are overloading display in order to customize output. This is the wrong way to do it — see the manual on custom pretty printing.

To customize REPL output, you should override Base.show(io::IO, x::MyType) and/or Base.show(io::IO, ::MIME"text/plain", x::MyType), as explained in the manual.

Overriding display directly will break IJulia, Juno, and other environments that have a custom display mechanism.
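
A minimal sketch of the recommended approach, assuming Julia 1.x. Dna is a hypothetical stand-in type, not OpenGene's own sequence type, and the output formats are chosen only for illustration.

```julia
# Hypothetical stand-in for a sequence type.
struct Dna
    seq::String
end

# Compact form: used by print, string interpolation, and inside containers.
Base.show(io::IO, d::Dna) = print(io, "dna:", d.seq)

# Optional richer form: used when the REPL (or IJulia, Juno, ...) displays
# a value of this type on its own.
function Base.show(io::IO, ::MIME"text/plain", d::Dna)
    print(io, "dna (", length(d.seq), " bp):\n  ", d.seq)
end

d = Dna("ACGT")
println(d)                  # dna:ACGT
println(repr("text/plain", d))
```

Because only show methods are defined, every front end keeps its own display mechanism and simply asks the type how to render itself.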
