msnsam's Introduction

This is the October 2007 version of the ms code, with the added ability to pass the number of samples (nsam) as a tbs argument. Please see Hudson's website for details on ms and for installation instructions, but email me with questions or bug reports, as any bugs are likely mine and NOT part of the original code.

The primary motivation for this is to allow efficient simulation of datasets with unequal sampling across loci. Running this version of ms using sample size as a tbs argument appears to be much faster than running an independent ms run for each of many loci.

Please cite the following two papers if using this code (and I’d love an email just to know it’s being used!):

Hudson, R. (2002) Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics, 18, 337–338

Ross-Ibarra J, Wright SI, Foxe JP, Kawabe A, DeRose-Wilson L, et al. 2008. Patterns of Polymorphism and Demographic History in Natural Populations of Arabidopsis lyrata. PLoS ONE 3(6): e2411

Documentation

Uncompress the download, then run the installer script with "./clms" from within the msnsam directory (this requires gcc). This should create three new files: msnsam, samplestats, and mean_std. The latter two are useful for calculating summary statistics, and their mean and standard deviation, from ms-style output (an example use of samplestats can be found here). For details on these two programs, please read the ms manual.

msnsam works exactly like the tbs feature of the normal ms program (as documented in the ms manual). An example tbsfile, example_tbs, is included in the msnsam directory. An example run using this file would be:

"./msnsam tbs 3 -t tbs -r tbs tbs -eN 0.25 0.05 < example_tbs"

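which simulates 3 loci under a change in population size to 0.05*N at N generations in the past. Because ms measures time backward from the present in units of 4N generations, the value 0.25 corresponds to N generations, and the size 0.05*N applies to the period before that time.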
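The time value in the -eN argument can be checked with a short calculation. The sketch below assumes the standard ms convention that times are given in units of 4N generations; the function name and the value of N are hypothetical and chosen only for illustration.

```python
def ms_time_to_generations(t_ms: float, n0: float) -> float:
    """Convert an ms time (in units of 4*N0 generations) to generations."""
    return 4.0 * n0 * t_ms


def scaled_population_size(x: float, n0: float) -> float:
    """Population size implied by the size multiplier x given to -eN."""
    return x * n0


# Hypothetical reference population size, for illustration only.
N0 = 10_000

# -eN 0.25 0.05: time 0.25 and size multiplier 0.05
generations_ago = ms_time_to_generations(0.25, N0)
new_size = scaled_population_size(0.05, N0)

print(f"size change at {generations_ago:.0f} generations ago")
print(f"population size before that time: about {new_size:.0f}")
```

With N = 10,000, the time 0.25 works out to 10,000 generations, which is N generations as stated above.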

The example_tbs file reads:

10 5 3.9 1000
12 5.2 4.1 1200
16 4.6 5.6 1250

which, with the above command line, would simulate 3 loci of sample size 10, 12, and 16 (first column). The second column is the value of θ for each locus, the third column is the value of ρ, and the fourth column is the length of each locus in bp. 
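For many loci it may be easier to generate the tbs file from a script. The following is a minimal sketch, assuming the four-column layout described above (nsam, θ, ρ, locus length); the function names and the `loci` list are hypothetical and not part of msnsam. It only builds the file text and the command string, and does not run msnsam itself.

```python
from pathlib import Path

# Per-locus parameters as (nsam, theta, rho, length in bp).
# These values reproduce the example_tbs file shown above.
loci = [
    (10, 5.0, 3.9, 1000),
    (12, 5.2, 4.1, 1200),
    (16, 4.6, 5.6, 1250),
]


def format_tbs(loci):
    """Return tbs file text with one space-separated line per locus."""
    return "".join(
        f"{nsam} {theta:g} {rho:g} {length}\n"
        for nsam, theta, rho, length in loci
    )


def write_tbs(loci, path):
    """Write the tbs text to `path` and return the path."""
    path = Path(path)
    path.write_text(format_tbs(loci))
    return path


def build_command(n_loci, tbs_path, extra_args="-eN 0.25 0.05"):
    """Build the msnsam command line; one tbs placeholder per column."""
    return f"./msnsam tbs {n_loci} -t tbs -r tbs tbs {extra_args} < {tbs_path}"


print(format_tbs(loci), end="")
print(build_command(len(loci), "example_tbs"))
```

Calling `write_tbs(loci, "my_tbs")` would save the file, after which the printed command (with the matching file name) can be run from the msnsam directory. The number of loci passed to msnsam should equal the number of lines in the tbs file.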
