msa2snp's Introduction

msa2snp

SNP and indel calls from multiple sequence alignment (MSA)

I needed to call SNPs from a multiple sequence alignment of several Sanger sequencing reads, but could not find a simple tool for the job, so I wrote one.

Once you have the SNPs and their positions, you can use another script, mysnpeffv2.py, to check the SNP effects.

Input

The only input is an MSA file in FASTA format. You can prepare the alignment in many programs, such as MEGA and AliView, and then save it in FASTA format. Format requirements:

  1. The first sequence is the reference.

  2. Make sure the alignment has flat ends, that is, equal length for all sequences.

  3. Make sure there is no "-" at the beginning of the alignment.
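
The three requirements above can be checked before running the scripts. The following is a minimal sketch, not part of msa2snp itself: the helper names `read_fasta` and `check_msa` are hypothetical, and it assumes the strictest reading of requirement 3, flagging a leading "-" in any sequence rather than only in the reference.

```python
def read_fasta(path):
    """Return a list of (name, sequence) tuples in file order."""
    records = []
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:].strip(), []
            elif name is not None:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_msa(records):
    """Return a list of problems found; an empty list means none."""
    if not records:
        return ["no sequences found"]
    problems = []
    # Requirement 1: the first record is treated as the reference.
    ref_name, ref_seq = records[0]
    for name, seq in records:
        # Requirement 2: flat ends, i.e. equal length for all sequences.
        if len(seq) != len(ref_seq):
            problems.append(
                f"{name}: length {len(seq)} differs from "
                f"reference {ref_name} ({len(ref_seq)})"
            )
        # Requirement 3 (assumed to apply to every sequence).
        if seq.startswith("-"):
            problems.append(f"{name}: starts with a gap '-'")
    return problems
```

For example, `check_msa(read_fasta("msa-example.fa"))` returns an empty list when the file passes these checks, and otherwise one message per problem.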

Usage

# to create only a snp table with A T G C alleles
python msa2snp.py msa-example.fa > SNPs.txt

# to create vcf format output (this will also create a file "snptable.txt" for a snp table as above)
python msa2vcf.py msa-example.fa > snp.vcf

Update

  • 2022-11-11: added msa2vcf.py to create VCF format output

msa2snp's People

Contributors

pinbo

msa2snp's Issues

Can I convert msa2snp output to vcf file?

Hi, I'm Jung. I used msa2snp to call SNP and indel variants from the CDS sequences of two orthologs across many comparisons.
The msa2snp output is in text format, but I need a VCF file of the variants for further evolutionary study.
Is there any way to convert the msa2snp output to a VCF file?

Thank you!

Best,
Jung

msa2vcf.py question

Hi,
Thanks for your code. It helped me a lot!
But I have a question about msa2vcf.py. My file has "-" at the beginning of the ALT sequences, which results in the following format:

CN 1 . ctcg ,cacg . . . GT 0/0 0/0 0/0

I used MAFFT to make the MSA file.
Will the "-" at the beginning of the ALT sequences affect how the Python script decides which positions are SNPs or indels?

I have attached the MSA file, final.cutmafft.txt.

Looking forward to your reply!
