grimmlab / permgwas
Efficient Permutation-based GWAS for Normal and Skewed Phenotypic Distributions
Home Page: https://doi.org/10.1093/bioinformatics/btac455
License: MIT License
Line 1 in 3a55ddf
Shouldn't this be FROM nvidia/cuda:11.1.1-base-ubuntu20.04?
Hi.
I would like to confirm whether permGWAS handles missing genotype data.
When I provide a PLINK .bed file with missing genotypes I get this error:
Traceback (most recent call last):
File "permGWAS.py", line 88, in <module>
X, y, K, covs, positions, chrom, X_index = prep.load_and_prepare_data(args)
File "/permGWAS/preprocessing/prepare_data.py", line 75, in load_and_prepare_data
X = load_files.load_genotype_matrix(arguments.x, sample_index=sample_index[1])
File "/permGWAS/preprocessing/load_files.py", line 140, in load_genotype_matrix
raise Exception('Genotype not in additive encoding.')
Exception: Genotype not in additive encoding.
Thanks,
Yaniv
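A quick way to inspect a genotype matrix before passing it to permGWAS is sketched below. It assumes, based only on the exception text, that the loader expects additive encoding (0, 1, 2) with no missing entries, and that missing calls show up as NaN after reading the .bed file (this depends on the reader). The function names and the choice of mode imputation are illustrative and are not part of permGWAS.

```python
import numpy as np

def is_additive(X):
    """Return True if every entry of X is 0, 1 or 2 (NaN counts as a violation)."""
    return bool(np.isin(X, (0, 1, 2)).all())

def impute_mode(X):
    """Replace NaN in each SNP column with that column's most frequent genotype."""
    X = np.array(X, dtype=float)  # work on a copy
    for j in range(X.shape[1]):
        col = X[:, j]  # view into X, so assignments below modify X
        missing = np.isnan(col)
        if missing.any() and not missing.all():
            values, counts = np.unique(col[~missing], return_counts=True)
            col[missing] = values[np.argmax(counts)]
    return X

# Toy matrix: rows are samples, columns are SNPs, NaN marks a missing call.
X = np.array([[0, 2, np.nan],
              [1, 2, 1],
              [np.nan, 0, 1]])

print(is_additive(X))          # missing entries break the check
X_imputed = impute_mode(X)
print(is_additive(X_imputed))  # passes once the gaps are filled
```

Whether imputation is appropriate, and with which method, depends on the data; dedicated imputation tools are usually preferable to a per-SNP mode.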
Dear Grimm,
After many attempts, I still can't run permGWAS. It runs on the test data, but with my real data it fails at every step.
I have generated hundreds of differently formatted input files, and almost all of them fail.
My dataset is large and is available in PLINK format.
Unlike your sample data, my sample IDs are names rather than numbers. I suspect this is causing trouble in the analysis, but changing them for hundreds of analyses is impractical.
Second, the manual does not make clear whether the phenotype file must contain the same individuals, in the same order, as the PLINK file, or whether the two may differ. What happens if some phenotype values are missing and filled with 'NA', and how does the program treat such values? If the program requires PLINK files that contain only the individuals with phenotypes, that is not feasible for me.
After trying everything, this is the most recent error:
/data1# python3 permGWAS.py -x ./data/imputed_final_chr7Mod_js_GT3.bed -y ./data/ind5.pheno
GPU is available. Perform computations on device cuda:0
Checked if all specified files exist. Start loading data.
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/groupby.py", line 1490, in array_func
result = self.grouper._cython_operation(
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/ops.py", line 959, in _cython_operation
return cy_op.cython_operation(
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/ops.py", line 657, in cython_operation
return self._cython_op_ndim_compat(
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/ops.py", line 497, in _cython_op_ndim_compat
return self._call_cython_op(
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/ops.py", line 541, in _call_cython_op
func = self._get_cython_function(self.kind, self.how, values.dtype, is_numeric)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/ops.py", line 173, in _get_cython_function
raise NotImplementedError(
NotImplementedError: function is not implemented for this dtype: [how->mean,dtype->object]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/pandas/core/nanops.py", line 1692, in _ensure_numeric
x = float(x)
ValueError: could not convert string to float: 'Alme_22'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/pandas/core/nanops.py", line 1696, in _ensure_numeric
x = complex(x)
ValueError: could not convert string to complex: 'Alme_22'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "permGWAS.py", line 88, in <module>
X, y, K, covs, positions, chrom, X_index = prep.load_and_prepare_data(args)
File "/data1/preprocessing/prepare_data.py", line 65, in load_and_prepare_data
y = load_files.load_phenotype(arguments)
File "/data1/preprocessing/load_files.py", line 206, in load_phenotype
y = y.sort_values(y.columns[0]).groupby(y.columns[0]).mean()
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/groupby.py", line 1855, in mean
result = self._cython_agg_general(
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/groupby.py", line 1507, in _cython_agg_general
new_mgr = data.grouped_reduce(array_func)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/internals/managers.py", line 1503, in grouped_reduce
applied = sb.apply(func)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/internals/blocks.py", line 329, in apply
result = func(self.values, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/groupby.py", line 1503, in array_func
result = self._agg_py_fallback(values, ndim=data.ndim, alt=alt)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/groupby.py", line 1457, in _agg_py_fallback
res_values = self.grouper.agg_series(ser, alt, preserve_dtype=True)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/ops.py", line 994, in agg_series
result = self._aggregate_series_pure_python(obj, func)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/ops.py", line 1015, in _aggregate_series_pure_python
res = func(group)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/groupby/groupby.py", line 1857, in <lambda>
alt=lambda x: Series(x).mean(numeric_only=numeric_only),
File "/usr/local/lib/python3.8/dist-packages/pandas/core/generic.py", line 11556, in mean
return NDFrame.mean(self, axis, skipna, numeric_only, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/generic.py", line 11201, in mean
return self._stat_function(
File "/usr/local/lib/python3.8/dist-packages/pandas/core/generic.py", line 11158, in _stat_function
return self._reduce(
File "/usr/local/lib/python3.8/dist-packages/pandas/core/series.py", line 4666, in _reduce
return op(delegate, skipna=skipna, **kwds)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/nanops.py", line 96, in _f
return f(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/nanops.py", line 158, in f
result = alt(values, axis=axis, skipna=skipna, **kwds)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/nanops.py", line 421, in new_func
result = func(values, axis=axis, skipna=skipna, mask=mask, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/pandas/core/nanops.py", line 727, in nanmean
the_sum = _ensure_numeric(values.sum(axis, dtype=dtype_sum))
File "/usr/local/lib/python3.8/dist-packages/pandas/core/nanops.py", line 1699, in _ensure_numeric
raise TypeError(f"Could not convert {x} to numeric") from err
TypeError: Could not convert Alme_22 to numeric
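The last frames show load_phenotype grouping by the first column and averaging everything else, so a second ID column holding strings such as 'Alme_22' cannot be averaged. Below is a minimal sketch of one workaround. It assumes that permGWAS treats the first column as the sample ID and all remaining columns as phenotypes; this is inferred from the traceback and not confirmed by the maintainers. The data and the column names FID, IID and phenotype_value are made up for illustration.

```python
import io
import pandas as pd

# Toy phenotype file with two ID columns, a duplicated sample and a missing value.
raw = io.StringIO(
    "FID IID phenotype_value\n"
    "Alme_21 Alme_21 0.97\n"
    "Alme_22 Alme_22 0.84\n"
    "Alme_22 Alme_22 0.86\n"
    "Alme_23 Alme_23 NA\n"
)
pheno = pd.read_csv(raw, sep=r"\s+")  # pandas parses "NA" as NaN by default

# Keep one ID column plus the numeric phenotype, and drop missing values.
clean = pheno[["IID", "phenotype_value"]].dropna()

# Same operation as the line in the traceback; it now only sees numeric data.
y = clean.sort_values("IID").groupby("IID").mean()
print(y)
```

Writing `clean` back to disk with `clean.to_csv(path, sep=" ", index=False)` gives a two-column file that avoids the string-to-float conversion.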
Even after making all the IIDs numeric, I get this error:
root@58c5c4cccb7c:/data1# python3 permGWAS.py -x ./data/imputed_final_chr7Mod_js_GT3.fam -y ./data/ind6.pheno
GPU is available. Perform computations on device cuda:0
Checked if all specified files exist. Start loading data.
Samples of genotype and phenotype do not match.
I am attaching sample data in case you want to test it.
Please have a look at the error and let me know whether it can be solved without the sample data.
Thanks,
Vinod
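The message "Samples of genotype and phenotype do not match" suggests that the two sets of sample IDs have too little overlap. A quick check that does not depend on permGWAS: read the IDs from the .fam file (six whitespace-separated columns in the PLINK format, with the within-family ID in the second column) and from the phenotype file, both as strings, and compare the sets. The helper name shared_ids and the inline data are hypothetical, and the phenotype file is assumed to have a header row with the sample ID in its first column.

```python
import io
import pandas as pd

def shared_ids(fam_file, pheno_file):
    """Return (IDs in both files, IDs only in .fam, IDs only in phenotype file)."""
    # dtype=str keeps "007" and "7" distinct and avoids int-vs-string mismatches.
    fam = pd.read_csv(fam_file, sep=r"\s+", header=None, dtype=str)
    pheno = pd.read_csv(pheno_file, sep=r"\s+", dtype=str)
    fam_ids = set(fam[1])              # column 2 of a .fam file is the IID
    pheno_ids = set(pheno.iloc[:, 0])  # assumed: first column holds the sample ID
    return fam_ids & pheno_ids, fam_ids - pheno_ids, pheno_ids - fam_ids

# Toy stand-ins for real files; file paths work as well.
fam_text = io.StringIO("1 101 0 0 0 -9\n1 102 0 0 0 -9\n")
pheno_text = io.StringIO("IID phenotype_value\n101 0.5\n103 0.7\n")

both, only_fam, only_pheno = shared_ids(fam_text, pheno_text)
print("shared:", sorted(both))
print("only in .fam:", sorted(only_fam))
print("only in phenotype file:", sorted(only_pheno))
```

An empty "shared" set usually points to a formatting difference (prefixes, whitespace, FID used instead of IID) rather than to genuinely different samples.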
I ran: python3 permGWAS.py -x ./data/mydata.map -y ./data/mypheno.pheno
The following error occurred:
Failure when running permGWAS2.0
could not convert string to float:'1\t13894\t0|t13894'
The mypheno.pheno file is:
FID IID phenotype_value
11797 11797 0.96590
9936 9936 0.83560
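The string in the error message looks like a complete .map line (chromosome, variant ID, genetic distance, position) that reached float() without being split. That is what happens when a tab-delimited line is split on a different separator. This may or may not be the cause here; the snippet below is a plain-Python illustration, not permGWAS code, and the sample line is only modeled on the one in the error.

```python
line = "1\t13894\t0\t13894"

by_space = line.split(" ")  # wrong separator: the line stays in one piece
by_any_ws = line.split()    # any whitespace: four separate fields

try:
    float(by_space[0])
except ValueError as err:
    message = str(err)
    print(message)

chrom, snp_id, cm, pos = by_any_ws
print(chrom, snp_id, float(cm), int(pos))
```

Checking which delimiter each input file actually uses (for example with `cat -A` on Linux) is a cheap first step before converting formats.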
Hi. I am trying to run your software on my dataset. My genotype data is a binary PLINK file. Without any covariates, the software runs just fine. As soon as I add covariates, I get the error message in the title.
I have checked extensively for any mismatch between the cov and pheno files and found none. I even used R to arrange the IDs so that they appear in the same order in both files, but this did not change anything. I also tried converting the PLINK file to H5 (in case the double ID columns were causing the issue), but that didn't work either. I do not know what else I can do. I am copying the head of both files below. Any help would be appreciated.
I will also use the opportunity to ask: how should I code missing data (empty, NA, etc.)? And if my phenotype is binary, how should it be coded? PLINK asks for 2 for cases and 1 for controls. Does your software accept that?
PHENO:
FID,aod_FR
10E1,1
10E3,1
10E4,1
10E6,1
10N1,1
10N3,1
10N3b,1
10S1,1
10S6,1
COV:
FID,RainfallAve_1km_91to20,Elevation_1haAve,Ca_Mg,SO2_SO4_1,NO2_NO3__1,DDEG,DBH
10E1,761.0971351,86.05000114,4.5,2.5,9.2,378.7459472,78.7
10E3,761.0971351,86.05000114,4.5,2.5,9.2,378.7459472,49.8
10E4,761.0971351,86.05000114,4.5,2.5,9.2,378.7459472,52.1
10E6,761.0971351,86.05000114,4.5,2.5,9.2,378.7459472,68.1
10N1,761.0971351,86.05000114,4.5,2.5,9.2,378.7459472,71
10N3,761.0971351,86.05000114,4.5,2.5,9.2,378.7459472,59
10N3b,761.0971351,86.05000114,4.5,2.5,9.2,378.7459472,59
10S1,761.0971351,86.05000114,4.5,2.5,9.2,378.7459472,88.8
10S6,761.0971351,86.05000114,4.5,2.5,9.2,378.7459472,65
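To rule out an ID mismatch between the two files independently of permGWAS, an outer merge with indicator=True lists every ID that appears in only one of them. This is a minimal sketch using a few of the rows shown above with a single covariate column. The ID "10S6 " with a trailing space in the covariate data is an invented example of a mismatch that is easy to miss by eye.

```python
import io
import pandas as pd

pheno_csv = "FID,aod_FR\n10E1,1\n10E3,1\n10S6,1\n"
cov_csv = "FID,DBH\n10E1,78.7\n10E3,49.8\n10S6 ,65\n"  # note the trailing space

pheno = pd.read_csv(io.StringIO(pheno_csv), dtype={"FID": str})
cov = pd.read_csv(io.StringIO(cov_csv), dtype={"FID": str})

# The "_merge" column says whether an ID is in both files or only one of them.
merged = pheno.merge(cov, on="FID", how="outer", indicator=True)
unmatched = merged[merged["_merge"] != "both"]
print(unmatched[["FID", "_merge"]])

# Stripping stray whitespace from the IDs removes the mismatch in this toy case.
cov["FID"] = cov["FID"].str.strip()
merged_clean = pheno.merge(cov, on="FID", how="outer", indicator=True)
print((merged_clean["_merge"] == "both").all())
```

If the merge reports no unmatched IDs on the real files, the cause lies elsewhere, for example in how the covariate columns themselves are parsed.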
Hi there, I managed to analyse the test data on my local machine using Docker and Singularity.
With Singularity, unless I used sudo to run the container, I did not have the permissions needed to write the results files.
Here's what I did:
`
docker build -t permgwasimage .
docker run -it -v /home/mshenton/permGWAS:/home/permgwascontainer --name permgwascontainer permgwasimage
`
With this, I can successfully run the GWAS.
`
docker save permgwasimage -o localpermgwas2.tar
singularity build localpermgwas2.sif docker-archive://localpermgwas2.tar
singularity shell --bind /home/mshenton/permGWAS:/home/permgwascontainer localpermgwas2.sif
Done performing GWAS on phenotype phenotype_value for 194 samples and 2001 SNPs.
Elapsed time: 0.993591 s
Save results.
Failure when running permGWAS2.0
[Errno 13] Permission denied: '/home/permgwascontainer3/results/p_values_phenotype_value(3).csv'
`
`
sudo singularity shell --bind /home/mshenton/permGWAS:/home/permgwascontainer localpermgwas2.sif
`
With sudo, I can successfully run the GWAS. However, I'd like to use this on a server where I don't have sudo privileges, and I can't use Docker there either. Any suggestions?
Thanks for the tool!
Best regards,
Matt
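One possible explanation, offered as an assumption rather than a diagnosis: the results directory on the bind-mounted host path may have been created during the earlier Docker run and would then be owned by root, while Singularity runs as the calling user. The sketch below is independent of permGWAS. It checks whether a directory can be written by the current user and falls back to another one if not; the function name and the fallback location are hypothetical.

```python
import os
import tempfile

def writable_results_dir(preferred, fallback):
    """Return `preferred` if the current user can write to it, else `fallback`."""
    try:
        os.makedirs(preferred, exist_ok=True)
    except OSError:
        pass  # e.g. permission denied; checked again below
    if os.path.isdir(preferred) and os.access(preferred, os.W_OK | os.X_OK):
        return preferred
    os.makedirs(fallback, exist_ok=True)
    return fallback

# Usage example in a throwaway directory.
base = tempfile.mkdtemp()
ok = writable_results_dir(os.path.join(base, "results"),
                          os.path.join(base, "fallback"))
print(ok)
```

On the host, `ls -ld` on the results directory shows its owner. If it belongs to root, binding a directory created by the current user is an alternative that needs no sudo.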