
wgsc's Introduction

WGS: Whole genome sequencing data analysis

Installation:

git clone https://github.com/YaoZhou89/WGSc.git
cd WGSc
make

Quality control of SNPs

Multiple ploidy levels and low sequencing depth can cause false-positive SNP calls through misalignment and sequencing errors, so we applied a pipeline of six filters to eliminate low-confidence SNPs. Low-confidence SNPs were identified and removed by either technical or genetic filters. Technically, SNPs with low mapping quality, a biased depth distribution, or potential sequencing errors were discarded. Genetically, SNPs that violated genetic expectations, such as linkage disequilibrium or identity by descent (IBD), were discarded.

Quality filter

Usage:

java -jar WGS.jar --model vcf --type quality --MQ 30 --FS 60 --MQRankSum -12.5 --ReadPosRankSum --BSQRankSum 0 --SOR 3 --file in.vcf --out out.vcf

Parameters:

MQ: MQ tag in the VCF file; SNPs with a value below the threshold are eliminated.
FS: FS tag in the VCF file; SNPs with a value above the threshold are eliminated.
MQRankSum: MQRankSum tag in the VCF file; SNPs with a value below the threshold are eliminated.
ReadPosRankSum: ReadPosRankSum tag in the VCF file; SNPs with a value below the threshold are eliminated.
BSQRankSum: BSQRankSum tag in the VCF file; SNPs with a value below the threshold are eliminated.
SOR: SOR tag in the VCF file; SNPs with a value above the threshold are eliminated.
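
The threshold rules above can be illustrated with a short Python sketch. This is not the WGS.jar implementation; the function names are hypothetical, the thresholds are copied from the usage line, ReadPosRankSum is left out because the usage line gives no value for it, and the choice to keep a SNP whose record lacks a tag is an assumption.

```python
# Illustrative only: hypothetical helpers, not part of WGS.jar.
# Tags whose SNPs are eliminated when the value is BELOW the threshold.
LOWER_BOUNDS = {"MQ": 30.0, "MQRankSum": -12.5, "BSQRankSum": 0.0}
# Tags whose SNPs are eliminated when the value is ABOVE the threshold.
UPPER_BOUNDS = {"FS": 60.0, "SOR": 3.0}


def parse_info(info):
    """Turn a VCF INFO string such as 'MQ=40;FS=1.2;DB' into a dict of floats."""
    values = {}
    for field in info.split(";"):
        key, sep, raw = field.partition("=")
        if not sep:
            continue  # flag without a value, e.g. 'DB'
        try:
            values[key] = float(raw)
        except ValueError:
            pass  # non-numeric or multi-valued tag; ignored in this sketch
    return values


def passes_quality(info):
    """Return True if no tag present in INFO violates its threshold.

    Assumption: a tag that is missing from the record does not fail the SNP.
    """
    values = parse_info(info)
    for tag, threshold in LOWER_BOUNDS.items():
        if tag in values and values[tag] < threshold:
            return False
    for tag, threshold in UPPER_BOUNDS.items():
        if tag in values and values[tag] > threshold:
            return False
    return True
```

For example, `passes_quality("MQ=20;FS=1.2")` is rejected because MQ is below 30, while a record with `MQ=40;FS=1.2;SOR=0.5` is kept.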

Depth filter

Usage:

java -jar WGS.jar --model vcf --type DepthFilter --maxSD 3 --min-depth 10 --max-depth 50 --file in.vcf --out out.vcf

Parameters:

maxSD: maximum standard deviation (SD) of depth across all individuals; SNPs with an SD above the threshold are eliminated.
min-depth: minimum total depth allowed.
max-depth: maximum total depth allowed.

Tips: Manually determine the appropriate value for these parameters based on the distribution of total depth.
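
A minimal sketch of the depth rule, assuming per-individual depths have already been read from the VCF (for example from the DP field of each sample). The function name is hypothetical, and both the use of the population SD and the inclusive bounds are assumptions; the tool itself may differ.

```python
import statistics


def passes_depth(sample_depths, max_sd=3.0, min_depth=10, max_depth=50):
    """Return True if a SNP passes the depth filter in this sketch.

    sample_depths: one read depth per individual at this SNP.
    The total depth must lie within [min_depth, max_depth] and the
    standard deviation across individuals must not exceed max_sd.
    """
    total = sum(sample_depths)
    if total < min_depth or total > max_depth:
        return False
    # Assumption: population SD; a sample SD would be slightly larger.
    return statistics.pstdev(sample_depths) <= max_sd
```

A SNP with depths `[3, 4, 5, 4]` passes (total 16, SD about 0.7), whereas `[0, 0, 20, 0]` fails because a single individual carries almost all of the reads.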

Segregation test filter

Usage:

java -jar WGS.jar --model vcf --type ST --file in.vcf.gz --out out.vcf

Tips: the input file should be compressed (bgzip) and indexed with tabix.

Linkage disequilibrium (LD) filter

Usage:

java -jar WGS.jar --model vcf --type LDfilter --windowSize 50 --threshold 0.01 --file in.vcf.gz --out out.vcf

Parameters:

windowSize: number of SNPs to consider in each window.
threshold: SNPs whose mean correlation is below the threshold are eliminated.

Tips: the input file should be compressed (bgzip) and indexed with tabix.
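
The idea of the LD filter can be sketched as follows. Several details are assumptions rather than documented behaviour: correlation is measured as the squared Pearson correlation (r²) of allele dosages, the window is centred on the SNP being tested, and there are no missing genotypes. All names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic SNP: treated as uncorrelated in this sketch
    return cov / math.sqrt(var_x * var_y)


def ld_filter(genotypes, window_size=50, threshold=0.01):
    """Return the indices of SNPs kept by the sketch LD filter.

    genotypes: list of SNPs in genome order, each a list of allele
    dosages (0, 1 or 2), one per individual.
    """
    kept = []
    half = window_size // 2
    for i, snp in enumerate(genotypes):
        lo = max(0, i - half)
        hi = min(len(genotypes), i + half + 1)
        neighbours = [genotypes[j] for j in range(lo, hi) if j != i]
        if not neighbours:
            kept.append(i)  # nothing to compare against
            continue
        mean_r2 = sum(pearson_r(snp, o) ** 2 for o in neighbours) / len(neighbours)
        if mean_r2 >= threshold:
            kept.append(i)
    return kept
```

The reasoning behind such a filter is that a true SNP is usually correlated with at least some of its neighbours, while a SNP produced by misalignment tends to be uncorrelated with all of them.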

IBD filter

Usage:

java -jar WGS.jar --model vcf --type IBDfilter --anchorFile anchorSNPs.vcf --minComp 200 --maxIBDDist 0.03 --windowSize 2000 --numThreads 32 --file in.vcf --out outprefix

Parameters:

anchorFile: file name of the anchor SNPs, in VCF format.
minComp: minimum number of comparable anchor sites.
maxIBDDist: if the genetic distance between two lines within a window is not larger than this value, the two lines are considered to be in IBD.
windowSize: number of SNPs in a window.
numThreads: number of threads.

Tips: Manually determine the maxIBDDist value based on the mean distance across all pairs (a value about 10 times smaller than the mean is suggested, as used in maize HapMap3).
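
The roles of minComp and maxIBDDist can be sketched as below. This only illustrates how a pair of lines might be called IBD within one window of anchor SNPs. The distance measure (mean allele difference over comparable sites), the treatment of windows with too few comparable sites, and all function names are assumptions, not the tool's documented behaviour.

```python
def window_distance(line_a, line_b):
    """Return (distance, comparable) for two lines over one anchor window.

    Genotypes are allele dosages 0/1/2, with None for missing calls.
    A site is comparable when both lines have a call.
    """
    comparable = 0
    diff = 0.0
    for a, b in zip(line_a, line_b):
        if a is None or b is None:
            continue
        comparable += 1
        diff += abs(a - b) / 2  # 0 = identical, 1 = opposite homozygotes
    if comparable == 0:
        return None, 0
    return diff / comparable, comparable


def in_ibd(line_a, line_b, min_comp=200, max_ibd_dist=0.03):
    """Call two lines IBD in a window if enough sites are comparable
    and their distance does not exceed max_ibd_dist."""
    distance, comparable = window_distance(line_a, line_b)
    if comparable < min_comp:
        return False  # assumption: too little evidence, so not called IBD
    return distance <= max_ibd_dist


def suggest_max_ibd_dist(pairwise_distances, factor=10.0):
    """Apply the tip above: the mean pairwise distance divided by about 10."""
    mean = sum(pairwise_distances) / len(pairwise_distances)
    return mean / factor
```

With the defaults, two lines that differ at 2 of 200 comparable anchor sites (distance 0.01) are called IBD in that window, while lines differing at 10 of 200 sites (distance 0.05) are not.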

Minor allele count filter

Usage: using vcftools

vcftools --vcf in.vcf --recode --mac 3 --out outprefix
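
For reference, the quantity that `--mac 3` thresholds can be computed by hand as in the sketch below. It assumes a biallelic site with GT strings such as `0/1` or `1|1`; the helper names are hypothetical and this is not vcftools code.

```python
def minor_allele_count(genotypes):
    """Count the less frequent allele at a biallelic site.

    genotypes: GT strings per individual, e.g. '0/1', '1|1', './.'.
    Missing alleles ('.') are not counted.
    """
    ref = alt = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele == "0":
                ref += 1
            elif allele == "1":
                alt += 1
    return min(ref, alt)


def passes_mac(genotypes, mac=3):
    """Keep the site if the minor allele is observed at least `mac` times."""
    return minor_allele_count(genotypes) >= mac
```

A site with genotypes `0/0`, `0/0`, `0/1` has a minor allele count of 1 and would be dropped at `--mac 3`.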

Author

Dr. Yao Zhou


wgsc's Issues

WGSc make error

Hello Professor Zhou, compilation fails when I try to install WGSc. What could be causing this?

The build log is below:
make --no-print-directory ./bin/WGS
mkdir -p bin
g++ -std=c++11 -lz ./WGS/main.cpp -o ./bin/WGS
In file included from /share/bioinfo/Sh1ne/software/gcc-10.1.0/include/c++/10.1.0/ext/hash_map:60,
from ./WGS/HeaderIns.h:36,
from ./WGS/main.cpp:9:
/share/bioinfo/Sh1ne/software/gcc-10.1.0/include/c++/10.1.0/backward/backward_warning.h:32:2: warning: #warning This file includes at least one deprecated or antiquated header which may be removed without further notice at a future date. Please use a non-deprecated interface with equivalent functionality instead. For a listing of replacement headers and interfaces, consult the file backward_warning.h. To disable this warning use -Wno-deprecated. [-Wcpp]
32 | #warning
| ^~~~~~~
/tmp/ccvzpL05.o: In function std::__detail::_AnyMatcher<std::__cxx11::regex_traits<char>, false, false, false>::operator()(char) const': main.cpp:(.text._ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEc[_ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEc]+0x14): relocation truncated to fit: R_X86_64_PC32 against symbol guard variable for std::__detail::_AnyMatcher<std::__cxx11::regex_traits, false, false, false>::operator()(char) const::__nul' defined in .bss._ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEcE5__nul[_ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEcE5__nul] section in /tmp/ccvzpL05.o
main.cpp:(.text._ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEc[_ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEc]+0x47): relocation truncated to fit: R_X86_64_PC32 against symbol std::__detail::_AnyMatcher<std::__cxx11::regex_traits<char>, false, false, false>::operator()(char) const::__nul' defined in .bss._ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEcE5__nul[_ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEcE5__nul] section in /tmp/ccvzpL05.o main.cpp:(.text._ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEc[_ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEc]+0x6a): relocation truncated to fit: R_X86_64_PC32 against symbol std::__detail::_AnyMatcher<std::__cxx11::regex_traits, false, false, false>::operator()(char) const::__nul' defined in .bss._ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEcE5__nul[_ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEcE5__nul] section in /tmp/ccvzpL05.o
/tmp/ccvzpL05.o: In function std::__detail::_AnyMatcher<std::__cxx11::regex_traits<char>, false, false, true>::operator()(char) const': main.cpp:(.text._ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEc[_ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEc]+0x14): relocation truncated to fit: R_X86_64_PC32 against symbol guard variable for std::__detail::_AnyMatcher<std::__cxx11::regex_traits, false, false, true>::operator()(char) const::__nul' defined in .bss._ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEcE5__nul[_ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEcE5__nul] section in /tmp/ccvzpL05.o
main.cpp:(.text._ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEc[_ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEc]+0x47): relocation truncated to fit: R_X86_64_PC32 against symbol std::__detail::_AnyMatcher<std::__cxx11::regex_traits<char>, false, false, true>::operator()(char) const::__nul' defined in .bss._ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEcE5__nul[_ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEcE5__nul] section in /tmp/ccvzpL05.o main.cpp:(.text._ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEc[_ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEc]+0x6a): relocation truncated to fit: R_X86_64_PC32 against symbol std::__detail::_AnyMatcher<std::__cxx11::regex_traits, false, false, true>::operator()(char) const::__nul' defined in .bss._ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEcE5__nul[_ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEcE5__nul] section in /tmp/ccvzpL05.o
/tmp/ccvzpL05.o: In function std::__detail::_AnyMatcher<std::__cxx11::regex_traits<char>, false, true, false>::operator()(char) const': main.cpp:(.text._ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEc[_ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEc]+0x17): relocation truncated to fit: R_X86_64_PC32 against symbol guard variable for std::__detail::_AnyMatcher<std::__cxx11::regex_traits, false, true, false>::operator()(char) const::__nul' defined in .bss._ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEcE5__nul[_ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEcE5__nul] section in /tmp/ccvzpL05.o
main.cpp:(.text._ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEc[_ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEc]+0x50): relocation truncated to fit: R_X86_64_PC32 against symbol std::__detail::_AnyMatcher<std::__cxx11::regex_traits<char>, false, true, false>::operator()(char) const::__nul' defined in .bss._ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEcE5__nul[_ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEcE5__nul] section in /tmp/ccvzpL05.o main.cpp:(.text._ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEc[_ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEc]+0x73): relocation truncated to fit: R_X86_64_PC32 against symbol std::__detail::_AnyMatcher<std::__cxx11::regex_traits, false, true, false>::operator()(char) const::__nul' defined in .bss._ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEcE5__nul[_ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEcE5__nul] section in /tmp/ccvzpL05.o
/tmp/ccvzpL05.o: In function std::__detail::_AnyMatcher<std::__cxx11::regex_traits<char>, false, true, true>::operator()(char) const': main.cpp:(.text._ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb1EEclEc[_ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb1EEclEc]+0x17): relocation truncated to fit: R_X86_64_PC32 against symbol guard variable for std::__detail::_AnyMatcher<std::__cxx11::regex_traits, false, true, true>::operator()(char) const::__nul' defined in .bss._ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb1EEclEcE5__nul[_ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb1EEclEcE5__nul] section in /tmp/ccvzpL05.o
main.cpp:(.text._ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb1EEclEc[_ZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb1EEclEc]+0x50): additional relocation overflows omitted from the output
collect2: error: ld returned 1 exit status
make[1]: *** [bin/WGS] Error 1
make: *** [all] Error 2

--type cleanSVs

Hello, in your pan-genome paper the SVs are processed with cleanSVs. What exactly does this step do?
