Comments (13)

qadrikazmi avatar qadrikazmi commented on August 13, 2024 1

Yes @tshalev … the issue was that it only accepts a space-delimited phenotype (.fam) file.

from bayesr.

tshalev avatar tshalev commented on August 13, 2024 1

Ah yes, sorry it's been a while. Yes I meant the .fam file. For plink, I generated the file using
plink --vcf file.vcf --allow-extra-chr --double-id --vcf-half-call m.
I then imported into R and had it enter the phenotype data, and afterwards had no problems running BayesR.

Did you use the flag --make-bed to make the plink files?

I did not. I converted my VCF file directly to plink binary format (bed, bim, fam) using the plink --vcf call. I then added phenotype information to the fam file in R (i.e., writing to the sixth column of the fam file from a different file with my phenotype data). After that it runs fine.
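
The R script used for this step was not posted in the thread. As a rough sketch of the same idea in shell, assuming a hypothetical phenotype table whose lines look like `FID IID value`, something like the following could fill the sixth column. The function name `fam_add_phenotype` and all file names are made up for illustration.

```shell
# Hypothetical helper: fill column 6 of a .fam file from a phenotype table.
#   $1: phenotype file with lines "FID IID value" (an assumed layout)
#   $2: .fam file produced by plink
fam_add_phenotype() {
    awk '
        # First file: remember each phenotype by family and individual ID.
        NR == FNR { pheno[$1 " " $2] = $3; next }
        # Second file: replace column 6 when a phenotype is known.
        {
            key = $1 " " $2
            if (key in pheno) $6 = pheno[key]
            $1 = $1   # force space-separated output
            print
        }
    ' "$1" "$2"
}

# Tiny made-up example: two individuals, one with a known phenotype.
tmpdir=$(mktemp -d)
printf 'FAM1\tIND1\t0\t0\t1\t-9\nFAM2\tIND2\t0\t0\t2\t-9\n' > "$tmpdir/example.fam"
printf 'FAM1 IND1 12.5\n' > "$tmpdir/pheno.txt"
fam_add_phenotype "$tmpdir/pheno.txt" "$tmpdir/example.fam"
# prints:
# FAM1 IND1 0 0 1 12.5
# FAM2 IND2 0 0 2 -9
```

Individuals without an entry in the phenotype table keep whatever plink wrote (here -9). The sketch assumes the phenotype table is not empty.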

syntheke avatar syntheke commented on August 13, 2024

BayesR only supports the plink binary BED file format and handles missing genotype information. To make it work for your data, you need to convert to best-guess genotypes. Modifying BayesR to handle dosage information would be straightforward: it means changing the routines that load and scale the genotype information.

tshalev avatar tshalev commented on August 13, 2024

I'm not entirely certain I understand. The main issue I can see is that the example data has 0, 1 and 2 in the fifth and sixth columns of the .bim file, whereas PLINK normally outputs nucleotide letter codes in those columns. I'm actually not sure how 0, 1 and 2 make sense there, since those columns describe specific SNPs, not the genotype of an individual at a SNP (unless I am misunderstanding something).

I suppose a straightforward question would be: How do I get to having the .bed .bim and .fam files looking exactly like those in the example data, from a .vcf file (the standard for storing SNP data)?

Sorry for the trouble and thanks again,
Tal

syntheke avatar syntheke commented on August 13, 2024

The software does not use any information in the .bim file; it only counts the number of rows to know the number of SNPs in the .bed file. Googling "vcf to plink" shows quite a few ways to do this. However, if your VCF file does not contain genotypes, it will not work.
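
Since only the row count of the .bim file matters here, one sanity check is to confirm that the three files agree on their dimensions. The following is a minimal sketch, assuming the standard PLINK 1 SNP-major .bed layout (3 header bytes, then one block of ceil(N/4) bytes per SNP for N individuals). The function name is hypothetical, and this check is not something BayesR itself provides.

```shell
# Hypothetical helper: expected size in bytes of a PLINK 1 SNP-major .bed file.
#   $1: number of individuals (rows in the .fam file)
#   $2: number of SNPs (rows in the .bim file)
expected_bed_bytes() {
    # Each SNP occupies ceil(individuals / 4) bytes, after a 3-byte header.
    echo $(( 3 + $2 * ( ($1 + 3) / 4 ) ))
}

expected_bed_bytes 5 10   # prints 23

# Usage with hypothetical file names; the two numbers should match:
#   expected_bed_bytes "$(wc -l < mydata.fam)" "$(wc -l < mydata.bim)"
#   wc -c < mydata.bed
```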

tshalev avatar tshalev commented on August 13, 2024

OK, I figured out the issue. Recoding the plink file after converting from VCF fixed the problem. Thanks!

qadrikazmi avatar qadrikazmi commented on August 13, 2024

Hi @tshalev,
Can you please share the plink command you used?
I am getting the same issue but have not been able to solve it even after recoding.

tshalev avatar tshalev commented on August 13, 2024

@qadrikazmi, kind of embarrassed to say but I think the problem was that I was entering the phenotype data into the .bim file in Excel :p. When I added it using scripts the problem went away.

tshalev avatar tshalev commented on August 13, 2024

Ah yes, sorry it's been a while. Yes I meant the .fam file. For plink, I generated the file using
plink --vcf file.vcf --allow-extra-chr --double-id --vcf-half-call m.
I then imported into R and had it enter the phenotype data, and afterwards had no problems running BayesR.

GabrieleNocchi avatar GabrieleNocchi commented on August 13, 2024

I am having the same issue and I am struggling to make it work.

I make my BED files using the --make-bed flag in plink. I also need to add the phenotypes to my .fam file after converting from VCF to plink. I have tried adding the phenotypes both programmatically and manually, but I still get the following error:

At line 572 of file baymods.f90
Fortran runtime error: End of file

The simulated example data, by contrast, work fine. I am struggling to understand why.

GabrieleNocchi avatar GabrieleNocchi commented on August 13, 2024

Did you use the flag --make-bed to make the plink files?

GabrieleNocchi avatar GabrieleNocchi commented on August 13, 2024

OK, I got it. I spent quite a bit of time on this, so let me write it up in case somebody else encounters the same issue, as for me it was not too clear from the above comments.

I thought my problem was similar to tshalev's, as I originally added my phenotypes to the .fam file both manually and programmatically using sed, simply replacing the -9 that plink assigns by default (which stands for no phenotype) with my phenotype.

In the end, the issue was simply that the .fam file needs to be SPACE-separated, not TAB-separated. It is a bit strange, because plink seems to generate the .fam file tab-separated, so you have to edit it and change those tabs into spaces to make it work.

Maybe it was not the case in older plink versions.
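
The exact edit used was not posted. One way to do the conversion, as a minimal sketch with a hypothetical function name and hypothetical file names, is to let awk rebuild each line with single spaces:

```shell
# Hypothetical helper: rewrite .fam data read from stdin so that the columns
# are separated by single spaces instead of tabs.
fam_tabs_to_spaces() {
    # Assigning $1 to itself makes awk rebuild each record using its default
    # output separator, a single space.
    awk '{ $1 = $1; print }'
}

# Made-up one-line example:
printf 'FAM1\tIND1\t0\t0\t1\t-9\n' | fam_tabs_to_spaces
# prints: FAM1 IND1 0 0 1 -9

# Usage with hypothetical file names:
#   fam_tabs_to_spaces < mydata.fam > mydata.space.fam
```

Writing to a new file and then swapping it in avoids truncating the original .fam while it is still being read.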

GabrieleNocchi avatar GabrieleNocchi commented on August 13, 2024

Many thanks for your reply, tshalev. I actually sorted out my issue after I wrote my message: I replaced the tabs in the .fam file with spaces and BayesR accepted it.
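
For anyone hitting the same "End of file" error, a quick way to check whether a .fam file still contains tabs might look like the sketch below. The function name and file name are hypothetical, and a result of 0 only rules out this one cause.

```shell
# Hypothetical helper: count the lines of .fam data on stdin that contain
# at least one tab character.
fam_count_tab_lines() {
    awk 'index($0, "\t") { n++ } END { print n + 0 }'
}

# Made-up example: the first line is tab-separated, the second is not.
printf 'FAM1\tIND1\t0\t0\t1\t-9\nFAM2 IND2 0 0 2 -9\n' | fam_count_tab_lines
# prints: 1

# Usage with a hypothetical file name:
#   fam_count_tab_lines < mydata.fam
```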
