
pcntoolkit's Introduction

Predictive Clinical Neuroscience Toolkit


Predictive Clinical Neuroscience software toolkit (formerly nispat).

Methods for normative modelling, spatial statistics and pattern recognition. Documentation, including tutorials, can be found on readthedocs.

Basic installation (on a local machine)

i) install anaconda3
ii) create an environment with "conda create --name <env_name>"
iii) activate the environment with "source activate <env_name>"
iv) install the required conda packages

conda install pip pandas scipy

v) install PCNtoolkit (plus dependencies)

pip install pcntoolkit

Alternative installation (on a shared resource)

Make sure conda is available on the system. Otherwise install it first from https://www.anaconda.com/

conda --version

Create a conda environment in a shared location

conda create -y python==3.8.3 numpy mkl blas --prefix=/shared/conda/<env_name>

Activate the conda environment

conda activate /shared/conda/<env_name>

Install other dependencies

conda install -y pandas scipy 

Install pip dependencies

pip --no-cache-dir install nibabel scikit-learn torch glob3 

Clone the repo

git clone https://github.com/amarquand/PCNtoolkit.git

Install in the conda environment

cd PCNtoolkit/
python3 setup.py install

Test

python -c "import pcntoolkit as pk;print(pk.__file__)"
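The same check can be made from inside Python without actually importing the package, which helps when a broken dependency makes the import itself fail. This is a minimal sketch using only the standard library; the helper name `describe_install` is made up for illustration and is not part of PCNtoolkit.

```python
import importlib.util


def describe_install(package="pcntoolkit"):
    """Return the file a top-level package would be loaded from, or None if it is missing."""
    # find_spec searches sys.path for a top-level package without importing it,
    # so problems inside the package's own imports do not surface here.
    spec = importlib.util.find_spec(package)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    location = describe_install()
    print(location if location else "pcntoolkit is not installed in this environment")
```

If this prints a path but the one-liner above still fails, the package is present and the failure is more likely to come from one of its dependencies.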

Quickstart usage

For normative modelling, functionality is handled by the normative.py script, which can be run from the command line, e.g.

# python normative.py -c /path/to/training/covariates -t /path/to/test/covariates -r /path/to/test/response/variables /path/to/my/training/response/variables
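When the script is driven from another Python program, it can help to assemble the argument list in one place. The sketch below only uses the flags that appear in the command above (-c, -t, -r and the positional training responses); the function name `build_normative_command` and the file names are hypothetical.

```python
import shlex
import sys


def build_normative_command(train_resp, train_cov, test_cov=None, test_resp=None,
                            script="normative.py"):
    """Assemble the argument list for the normative.py command shown above."""
    cmd = [sys.executable, script, "-c", train_cov]
    if test_cov is not None:
        cmd += ["-t", test_cov]    # test covariates
    if test_resp is not None:
        cmd += ["-r", test_resp]   # test response variables
    cmd.append(train_resp)         # training responses are the positional argument
    return cmd


if __name__ == "__main__":
    cmd = build_normative_command("train_resp.txt", "train_cov.txt",
                                  test_cov="test_cov.txt", test_resp="test_resp.txt")
    # Print a copy-pasteable version of the command instead of running it.
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.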

For more information, please see the following resources:

pcntoolkit's People

Contributors

amarquand, augub, bbuckova, charfraza, hesterhuijsdens, likeajumprope, lindenmp, maartenmennes, pbarkema, pierreberthet, sabryr, saigerutherford, smkia, thomaswolfers


pcntoolkit's Issues

Predict function: UnboundLocalError: local variable 'nm' referenced before assignment

I train a normative model using the following function (2 folds just for testing purposes):

estimate(respfile=os.path.join(input_dir,"train_resp.txt"),
covfile=os.path.join(input_dir,"train_cov.txt"),
testresp=os.path.join(input_dir,"test_resp.txt"),
testcov=os.path.join(input_dir,"test_cov.txt"),
cvfolds=2,
alg="blr",
optimizer = "powell",
outputsuffix= "_2fold",
saveoutput=True,
savemodel=True,
standardize = True)

Then I want to apply the model to the test data:

predict(covfile=os.path.join(input_dir,"test_cov.txt"),
respfile=os.path.join(input_dir,"test_resp.txt"),
model_path=os.path.join(proc_dir,"Models"),
alg="blr",
outputsuffix="_test",
return_y=True )

I get the following error:

UnboundLocalError Traceback (most recent call last)
Cell In[36], line 1
----> 1 predict(covfile=os.path.join(input_dir,"test_cov_dm.txt"),
2 respfile=os.path.join(input_dir,"test_resp.txt"),
3 model_path=os.path.join(proc_dir,"Models"),
4 alg="blr",
5 outputsuffix="_test",
6 return_y=True
7 )
File ~/anaconda3/lib/python3.8/site-packages/pcntoolkit/normative.py:799, in predict(covfile, respfile, maskfile, **kwargs)
796 Y = Y[:, np.newaxis]
798 # warp the targets?
--> 799 if alg == 'blr' and nm.blr.warp is not None:
800 warp = True
801 Yw = np.zeros_like(Y)

UnboundLocalError: local variable 'nm' referenced before assignment

Any ideas?

Thanks!
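For readers unfamiliar with this error class: Python raises UnboundLocalError when a function reads a local name that no executed statement has assigned yet. The toy function below is not the PCNtoolkit code; it is a hedged sketch of one way the pattern in the traceback can arise, namely when the assignment to `nm` sits inside a loop or branch that never runs. Whether that is the cause here (for example, because no model files matched under model_path) is an assumption.

```python
def predict_sketch(model_names, alg="blr"):
    """Toy stand-in for a function that loads one model per loop iteration."""
    for name in model_names:
        # `nm` is only bound if the loop body runs at least once.
        nm = {"name": name, "warp": None}
    # With an empty `model_names`, `nm` was never assigned and this line fails.
    return alg == "blr" and nm["warp"] is not None


def demo():
    """Return the name of the exception raised for an empty model list."""
    try:
        predict_sketch([])
    except UnboundLocalError as err:
        return type(err).__name__
    return "no error"


if __name__ == "__main__":
    print(demo())
```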

Fix Module Index on readthedocs

modindex.rst is not showing up properly on readthedocs. None of the classes or functions show and they also don't appear in the page search.

The HTML file seems properly built by sphinx. I can load the HTML on my local laptop and view the classes/functions + search for them. So I'm not sure why it is not showing up on readthedocs.

I tried editing conf.py to add additional paths, moving modindex.rst into doc/source/pages/ (and editing index.rst toc to show the new path/depth).

None of this fixed the issue.

Add items to glossary page

The glossary page of our documentation is empty. We should format this page with a table to describe anything related to the PCNtoolkit. Table should include the jargon name, full name, a description, and a link to a reference. The empty glossary file can be found in doc/source/pages/glossary.rst

Theano-PyMC version mismatch

I installed pcntoolkit via pip. When I execute 'import pcntoolkit as pcn', it returns the following error.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
The installed Theano(-PyMC) version (1.0.5) does not match the PyMC3 requirements.
For PyMC3 to work, Theano must be uninstalled and replaced with Theano-PyMC.
See https://github.com/pymc-devs/pymc3/wiki for installation instructions.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

I tried to uninstall Theano and reinstall Theano-PyMC, but it did not work. Looking forward to your help.
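A quick way to see which of the conflicting distributions are actually present in the active environment is to query the installed package metadata. This is a sketch with the standard library only; the distribution names are taken from the warning above and from the pymc3 paths in other tracebacks on this page, and the helper name is hypothetical.

```python
from importlib import metadata


def installed_versions(names=("Theano", "Theano-PyMC", "pymc3")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If both Theano and Theano-PyMC show a version, the warning suggests the former still needs to be removed from this environment.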

Update `predict` to save to a save_path

Great toolkit! I had two requests for additions to normative.py, (1) add the ability to specify the save path and (2) add the ability to not save output to disk (for smaller models obviously).

(1)
To add the ability to specify save paths, could you add a save_path kwarg to the predict function? It should be simple to add it to the function's kwargs and then update each call to save_results (lines linked below).

def predict(covfile, respfile, maskfile=None, **kwargs):

save_results(None, Yhat, S2, None, outputsuffix=outputsuffix)

save_results(respfile, Yhat, S2, maskvol, Z=Z,
outputsuffix=outputsuffix, results=results)

It would be a similar fix for estimate and transfer functions, which both use save_results without the option of save_path. estimate would need a few additional updates where meta_data pickle files are saved.

(2)
To add the ability to not save output, add a saveoutput kwarg to predict, similar to the one used in estimate.
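The two requests could be combined along the following lines. This is only a sketch of the requested behaviour, not the toolkit's save_results: the function name, the one-value-per-line text format and the file naming are assumptions made for illustration.

```python
import os


def save_results_sketch(results, save_path=None, outputsuffix="", saveoutput=True):
    """Write each named result to <save_path>/<name><outputsuffix>.txt.

    `results` maps a name such as "yhat" to a list of numbers.
    Returns the list of files written (empty when saveoutput is False).
    """
    if not saveoutput:
        return []                      # request (2): keep everything in memory
    if save_path is None:
        save_path = os.getcwd()        # fall back to the current directory
    os.makedirs(save_path, exist_ok=True)
    written = []
    for name, values in results.items():
        filename = os.path.join(save_path, name + outputsuffix + ".txt")
        with open(filename, "w") as handle:
            handle.write("\n".join(str(v) for v in values) + "\n")
        written.append(filename)
    return written
```

A caller would then pass `save_path` and `saveoutput` through the existing `**kwargs` and forward them to every save call.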

GPR-model predict problem

First, I used the 'savemodel' option to save the GPR model, for example:
pcn.normative.estimate(covfile='./data/covariable_normsample.txt', respfile='./data/features_normsample.txt', cvfolds=2, alg='gpr', savemodel=True)

Then I used predict to test the new data set, and the following problems occurred:

783 nm = nm.load(os.path.join(modelpath, 'NM' + str(fold) + '_' +
784 str(m) + inputsuffix + '.pkl'))
785 if (alg!='hbr' or nm.configs['transferred']==False):
--> 786 yhat, s2 = nm.predict(Xz, **kwargs)
787 else:
788 tsbefile = kwargs.get('tsbefile')

TypeError: predict() missing 2 required positional arguments: 'X' and 'y'.

I checked the source code and found that the GPR model's predict requires the Xs, X and y arguments, but this code passes only one: yhat, s2 = nm.predict(Xz, **kwargs). I think this is where the problem lies. I hope it can be fixed. Thanks.
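One way a caller could cope with models whose predict methods need different arguments is to inspect the signature before calling. The classes below are toy stand-ins, not the PCNtoolkit model classes, and the argument names X and y are taken from the error message above; everything else is an assumption for illustration.

```python
import inspect


class ToyBLR:
    """Stand-in for a model that predicts from test covariates alone."""

    def predict(self, Xs):
        return [0.0 for _ in Xs]


class ToyGPR:
    """Stand-in for a model that also needs the training data at predict time."""

    def predict(self, Xs, X, y):
        mean_y = sum(y) / len(y)
        return [mean_y for _ in Xs]


def call_predict(nm, Xs, X=None, y=None):
    """Call nm.predict with the training data only if its signature asks for it."""
    params = inspect.signature(nm.predict).parameters
    if "X" in params and "y" in params:
        if X is None or y is None:
            raise ValueError("this model needs the training data X and y to predict")
        return nm.predict(Xs, X, y)
    return nm.predict(Xs)
```

With this kind of dispatch, a missing training set produces an explicit message instead of a TypeError deep inside the call.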

shape of the centiles

Hi,

We used the PCN-toolkit (v0.21, BLR with WarpSinArcsinh warping and default settings: number of knots 5, order of spline 3) to model lifespan right-hemispheric cortical thickness for an in-house sample.

centiles_both_k5_p3_ct

The fit looks comparable to Rutherford et al. (eLife). Good. To assess the centiles, we simulated a dataset for 'x' with a clear 'cylinder'-like pattern, with age and sex as covariates. We then applied the PCN-toolkit with the same settings (v0.21, BLR with WarpSinArcsinh warping, number of knots 5, order of spline 3) and plotted the centiles following the published code. The centiles do not look correct.
centiles_both_k5_p3

I'm attaching the plot and the dataset in the zip-file (age and sex distributions for train (80%) and test (20%) were kept highly similar).
drive-download-20230413T164029Z-001.zip
What are we doing wrong? Thanks!

Installation instructions

  1. There is a disparity between what the README.md says and what the requirements.txt file in the archive download (https://github.com/amarquand/nispat/archive/v1.2.2.tar.gz) says, e.g. requirements.txt lists tensorflow==0.12.1 as a requirement.
    Is tensorflow==0.12.1 a runtime requirement or not?
  2. If it is a requirement, then it is too old to be installed with pip:
    pip install tensorflow==0.12.1
    ERROR: Could not find a version that satisfies the requirement tensorflow==0.12.1 (from versions: 1.13.0rc1, 1.13.0rc2, 1.13.1, 1.13.2, 1.14.0rc0, 1.14.0rc1, 1.14.0, 1.15.0rc0, 1.15.0rc1, 1.15.0rc2, 1.15.0rc3, 1.15.0, 1.15.2, 2.0.0a0, 2.0.0b0, 2.0.0b1, 2.0.0rc0, 2.0.0rc1, 2.0.0rc2, 2.0.0, 2.0.1, 2.1.0rc0, 2.1.0rc1, 2.1.0rc2, 2.1.0, 2.2.0rc0, 2.2.0rc1, 2.2.0rc2)
    ERROR: No matching distribution found for tensorflow==0.12.1

Bug in metric evaluation for warped BLR

I believe there is an error in line 488 of normative.py. Currently reads:

mf = evaluate(Ytest[:, np.newaxis], Yhati, S2=S2i,
mY=np.std(yw), sY=np.mean(yw),
nlZ=nm.neg_log_lik, nm=nm, Xz_tr=Xz_tr,
alg=alg, metrics = metrics)

but the evaluate function specifies mY as the mean and sY as the standard deviation, so the two arguments are swapped in this call.
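To see why the order matters, consider a metric that scores predictions against a trivial Gaussian baseline built from the training mean and standard deviation. How evaluate uses mY and sY internally is not shown in this issue, so the baseline below is an assumption chosen only to illustrate the effect of swapping the two values.

```python
import math
import statistics


def baseline_log_loss(y, mean, std):
    """Negative log density of y under a Normal(mean, std**2) baseline."""
    return 0.5 * math.log(2 * math.pi * std ** 2) + (y - mean) ** 2 / (2 * std ** 2)


train = [10.0, 11.0, 9.0, 10.5, 9.5]
m = statistics.mean(train)     # training mean
s = statistics.pstdev(train)   # population std, matching np.std's default (ddof=0)

# Same test point, scored with the arguments in the right and the wrong order.
correct = baseline_log_loss(10.0, mean=m, std=s)
swapped = baseline_log_loss(10.0, mean=s, std=m)

if __name__ == "__main__":
    print("correct order:", correct)
    print("swapped order:", swapped)
```

Any score that is reported relative to such a baseline would shift when the mean and standard deviation trade places.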

Error in running the predict function

Hi,
after having estimated a normative model I'm encountering an error in the predict function.
The error I get is the following:

Loading data ...
Prediction by model  1 of 998575
Error: No module named 'bayesreg'
Traceback (most recent call last):
File "fdopa_prediction.py", line 47, in <module>
    alg="blr")
File "/home/k19080168/norm_modelling/lib/python3.7/site-packages/pcntoolkit/normative.py", line 718, in predict
    str(i) + inputsuffix + '.pkl'))
File "/home/k19080168/norm_modelling/lib/python3.7/site-packages/pcntoolkit/normative_model/norm_base.py", line 56, in load
    nm = pickle.load(handle)
ModuleNotFoundError: No module named 'bayesreg'

I've checked also the imports and the bayesreg model should be imported correctly from the library. Is there something I'm missing?
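A pickle records the module path under which each class was importable when the file was written, so a model saved while the class lived in a top-level module named bayesreg cannot be loaded once that module is only reachable under another name. One standard-library way around this is an Unpickler that remaps module names. The mapping to pcntoolkit.model.bayesreg below is an assumption based on the class path shown in another issue on this page; re-estimating the model with the current version is the other option.

```python
import io
import pickle


class RemappingUnpickler(pickle.Unpickler):
    """Unpickler that redirects old module names to their new locations."""

    def __init__(self, file, module_map):
        super().__init__(file)
        self.module_map = module_map

    def find_class(self, module, name):
        # Look the class up under the new module name when one is registered.
        return super().find_class(self.module_map.get(module, module), name)


def load_with_remap(path, module_map):
    """Load a pickle file, translating module names through module_map."""
    with open(path, "rb") as handle:
        return RemappingUnpickler(handle, module_map).load()


def loads_with_remap(data, module_map):
    """Same as load_with_remap, but for an in-memory bytes object."""
    return RemappingUnpickler(io.BytesIO(data), module_map).load()


# Assumed mapping for the error above; adjust it to the actual package layout.
MODULE_MAP = {"bayesreg": "pcntoolkit.model.bayesreg"}
```

Usage would look like `nm = load_with_remap("NM_0_0_estimate.pkl", MODULE_MAP)`, where the file name is only an example. As with any pickle, only load files from a trusted source.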

BUG in production code: predict function Z scores

Hi, I have checked and rechecked with @LuciaMaccioni the last version of your code and after commit 0c4d0c7 we think you might have some bug in the predict() function in the file normative.py.

In particular you have that at lines 770-773:

Y = Y[:, m]
if meta_data:
    mY = mY[m]
    sY = sY[m]

which subsets the Y matrix to just the last model estimated. In addition, in the if statement that follows, mY and sY are indexed with m although both are lists of arrays, and the code throws an IndexError.

Since the Y is used to calculate the Z scores this could be an issue.

In addition on line 489 in the estimate() function, where you have

mY=np.std(yw), sY=np.mean(yw)

aren't those (at least from the names) inverted?
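The first point can be illustrated without the toolkit. The sketch below computes per-model Z scores column by column while leaving the full response matrix untouched; the formula Z = (y - yhat) / sqrt(s2) and the nested-list layout are assumptions made for this example, not a copy of the code in normative.py.

```python
import math


def z_scores(Y, Yhat, S2):
    """Return Z = (y - yhat) / sqrt(s2) for every subject (row) and model (column)."""
    n_models = len(Y[0])
    Z = [[0.0] * n_models for _ in Y]
    for m in range(n_models):
        # Index into Y without rebinding the name Y, so that the columns of
        # later models are still available on the next iteration.
        for i, row in enumerate(Y):
            Z[i][m] = (row[m] - Yhat[i][m]) / math.sqrt(S2[i][m])
    return Z


if __name__ == "__main__":
    Y = [[1.0, 2.0], [3.0, 4.0]]
    Yhat = [[0.0, 2.0], [1.0, 1.0]]
    S2 = [[1.0, 4.0], [4.0, 9.0]]
    print(z_scores(Y, Yhat, S2))
```

Rebinding Y to a single column inside such a loop would make every later iteration operate on the wrong data.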

Errors: No module named 'nm_utils'

Dear PCNtoolkit developer,

Thank you for providing incredible tools!
I am following 'braincharts' tutorial.

When the following command applied in Colab,
from nm_utils import remove_bad_subjects, load_2d

the error comes up like below:
ModuleNotFoundError: No module named 'nm_utils'

Best regards,

Jeong
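This kind of error usually means the file defining the module is not in any directory on Python's import path. Assuming nm_utils is a standalone nm_utils.py script shipped alongside the tutorial rather than part of the installed package (the issue does not say where it lives), a sketch like the following makes it importable; the helper name and the directory are hypothetical.

```python
import os
import sys


def add_module_dir(directory, module_file="nm_utils.py"):
    """Put `directory` at the front of sys.path if it contains module_file."""
    directory = os.path.abspath(directory)
    if not os.path.isfile(os.path.join(directory, module_file)):
        return False               # wrong folder: nothing to import from here
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return True


if __name__ == "__main__":
    # Replace "." with the folder that actually holds nm_utils.py.
    print(add_module_dir("."))
```

After a successful call, `from nm_utils import remove_bad_subjects, load_2d` should resolve, provided the file defines those names.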

Add HBR tutorial

The Hierarchical Bayesian Regression tutorial (found in docs/source/pages/tutorial_HBR.rst) is currently empty and needs some attention. Take a look at the other tutorials in the pages folder for inspiration. I made these tutorials in a Jupyter notebook and then converted them to reST files using jupyter nbconvert --to rst notebook.ipynb

no releases specified

Hi Andre,
It might be useful to make use of the release tagging feature in git/github. That way we could refer to a specific release for issues/fixes etc.
Maarten

-c arg unable to read .mat file created by CONN

I am attempting to create a normative model from MRI data which I have preprocessed in CONN. Although I have generated .nii files to use for , CONN does not seem to generate a file to use as . When I attempt to pass a .mat file I receive the "I don't know what to do with + filename" error.

How do I create a covariates file for my preprocessed data to create a normative model?

Thank you
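The other examples on this page pass covariates as plain text files (for example train_cov.txt), so one possible route is to write the subject-level covariates into such a file yourself. The sketch below assumes a whitespace-delimited layout with one subject per row and one covariate per column; that layout, the function name and the example values are assumptions. Reading the values out of a CONN .mat file would be a separate step (for instance with scipy.io.loadmat) that is not shown here.

```python
def write_covariates(rows, path):
    """Write one subject per line, covariates separated by single spaces."""
    n_cols = len(rows[0])
    with open(path, "w") as handle:
        for row in rows:
            if len(row) != n_cols:
                raise ValueError("every subject needs the same number of covariates")
            handle.write(" ".join(str(float(v)) for v in row) + "\n")
    return len(rows), n_cols


if __name__ == "__main__":
    # Hypothetical covariates: age and sex for two subjects.
    print(write_covariates([[25, 0], [31, 1]], "covariates.txt"))
```

The row order has to match the order of subjects in the response data.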

Problem when plotting centiles - AttributeError: 'NoneType' object has no attribute 'get_n_params'

Dear Developers,

thanks for putting together PCNtoolkit. It is a great tool!

Package version: 0.27

I am trying to implement BLR with my data based on the tutorials you provide (https://github.com/predictive-clinical-neuroscience/PCNtoolkit-demo/blob/main/tutorials/BLR_protocol/BLR_normativemodel_protocol.ipynb and https://github.com/predictive-clinical-neuroscience/PCNtoolkit-demo/blob/main/tutorials/BLR_protocol/transfer_pretrained_normative_models.ipynb).

My aim is to model a set of imaging derived phenotypes based on age and gender. Currently, I am stuck trying to plot the normative curves + centiles as described in the second tutorial (or here https://github.com/predictive-clinical-neuroscience/PCNtoolkit-demo/blob/main/tutorials/BLR_protocol/transfer_pretrained_normative_models.ipynb). The error I am getting is:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[26], line 35
     33 # get the warp and warp parameters
     34  W = nm.blr.warp
---> 35  warp_param = nm.blr.hyp[1:nm.blr.warp.get_n_params()+1] 
     37  # first, we warp predictions for the true data and compute evaluation metrics
     38  med_te = W.warp_predictions(np.squeeze(yhat_te), np.squeeze(s2_te), warp_param)[0]

AttributeError: 'NoneType' object has no attribute 'get_n_params'

It results from the code that tries to warp the predictions after modelling dummy data for plotting:

    yhat, s2 = ptk.normative.predict(cov_file_dummy, 
                       alg = 'blr', 
                       respfile = None, 
                       model_path = os.path.join(roi_path,'Models'), 
                       binary=False,
                       outputsuffix = '_dummy',
                       saveoutput = True)
    
    # load the normative model
    with open(os.path.join(roi_path,'Models', 'NM_0_0_estimate.pkl'), 'rb') as handle:
        nm = pickle.load(handle) 

    # get the warp and warp parameters
    W = nm.blr.warp
    warp_param = nm.blr.hyp[1:nm.blr.warp.get_n_params()+1] 

The models are derived by

    ptk.normative.estimate(covfile = roi_train_covfile,
                        respfile = roi_train_respfile,
                        testcov = roi_test_covfile,
                        testresp = roi_test_respfile,
                        optimizer = "powell",
                        alg = "blr",
                        saveoutput = True,
                        standardize = True,
                        savemodel=True,
                        binary=True,
                        )

nm.blr.warp appears to be empty/NoneType. However nm.blr appears to be a proper BLR object: nm.blr -> <pcntoolkit.model.bayesreg.BLR at 0x7faf185de820>. Also other methods based on the warp attribute appear not to work.

AttributeError                            Traceback (most recent call last)
Cell In[24], line 38
     35 warp_param = nm.blr.hyp[1:len(nm.blr.hyp)+1] 
     37 # first, we warp predictions for the true data and compute evaluation metrics
---> 38 med_te = W.warp_predictions(np.squeeze(yhat_te), np.squeeze(s2_te), warp_param)[0]
     39 med_te = med_te[:, np.newaxis]
     40 print('metrics:', evaluate(y_te, med_te))

AttributeError: 'NoneType' object has no attribute 'warp_predictions'

The code to estimate the model appears to work fine and I get output metrics like EV, MSLL, etc. The code is:
    ptk.normative.estimate(covfile = roi_train_covfile,
                        respfile = roi_train_respfile,
                        testcov = roi_test_covfile,
                        testresp = roi_test_respfile,
                        optimizer = "powell",
                        alg = "blr",
                        saveoutput = True,
                        standardize = True,
                        savemodel=True,
                        binary=True,
                        )

You can find the Jupyter Notebook with the remaining analysis code here: https://drive.google.com/file/d/1hL6vekKTi_lEBqDzSNsYASVN9IoEfuy3/view?usp=share_link.

Do you have any ideas what might be the issue causing the missing warp? Of course I am happy to provide further details. Thanks in advance!

Best,
Marvin
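A follow-up issue further down this page reports that the estimate call needed warp='WarpSinArcsinh' and warp_reparam=True for the model to carry a warp. Independently of that, plotting code can fail more informatively by checking for a missing warp before using it. The sketch below uses simple stand-in objects rather than real models; only the slice hyp[1:n+1] is taken from the snippet in this issue, and the helper name is hypothetical.

```python
from types import SimpleNamespace


def get_warp_params(blr):
    """Return (warp, warp_param), or raise a clear error for an unwarped model."""
    if blr.warp is None:
        raise ValueError("model was estimated without a warp; there is nothing to un-warp")
    n_params = blr.warp.get_n_params()
    # Same slice as in the plotting snippet: skip hyp[0], take the warp parameters.
    return blr.warp, blr.hyp[1:n_params + 1]


if __name__ == "__main__":
    # Stand-ins for a warped and an unwarped model (not real PCNtoolkit objects).
    warped = SimpleNamespace(warp=SimpleNamespace(get_n_params=lambda: 2),
                             hyp=[0.1, 0.2, 0.3, 0.4])
    unwarped = SimpleNamespace(warp=None, hyp=[0.1, 0.2])
    print(get_warp_params(warped)[1])
    try:
        get_warp_params(unwarped)
    except ValueError as err:
        print("error:", err)
```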

Review function documentation

The documentation for all of the classes, functions, etc is now rendering (Function Docs/Module Index) but has some inconsistencies in the format because it has been written by different people. This needs to be reviewed and brought into a standard format.

NIfTI concatenation

Hi,

I have a very basic question and sorry if I've missed something very obvious:

If the response variables are in a NIfTI file, how is the multi-subject NIfTI created? Is a 4-D NIfTI used, where the time dimension indexes subjects?

Thanks,
Asad

New published version (0.24) bugs

Hi @amarquand,
I've tried the new version of the pcntoolkit you've published and I have found a bug in the HBR.
In particular, when I run it the code gives me a KeyError at line 453 of the normative.py script.

I had a look at your code, and there is a point where y_like is used as a key into one of the dictionaries, but none of the dictionaries declared in the HBR has a key called y_like. This causes the KeyError and crashes the estimation.

In addition, I have a question about the predict function in normative.py: if I run predict on a new cohort with an already estimated model, won't the predictions be based on a model fitted to just a subset of the training set? I ask because the fold parameter is set to 0 if not provided, which corresponds to the first run of the CV and so should not contain all the data in the training set. Am I wrong in assuming this?

ImportError: cannot import name 'asscalar' from 'numpy'

Hello! I've created a conda environment on Ubuntu 22.04 (Linux) with Python 3.8.3. My glob3 installation failed at first, so I ran: sudo pip install glob2.

(gamlss) $ python -c "import pcntoolkit as pk;print(pk.__file__)"
Traceback (most recent call last):
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/model/gp.py", line 15, in <module>
    from pcntoolkit.util.utils import squared_dist
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/util/__init__.py", line 1, in <module>
    from . import utils
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/util/utils.py", line 17, in <module>
    import pymc3 as pm
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/__init__.py", line 41, in <module>
    from .distributions import *
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/distributions/__init__.py", line 15, in <module>
    from . import timeseries
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/distributions/timeseries.py", line 22, in <module>
    from pymc3.util import get_variable_name
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/util.py", line 21, in <module>
    from numpy import asscalar, ndarray
ImportError: cannot import name 'asscalar' from 'numpy' (/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/numpy/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/trendsurf.py", line 21, in <module>
    from pcntoolkit.model.bayesreg import BLR
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/model/__init__.py", line 2, in <module>
    from . import gp
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/model/gp.py", line 25, in <module>
    from util.utils import squared_dist
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/util/__init__.py", line 1, in <module>
    from . import utils
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/util/utils.py", line 17, in <module>
    import pymc3 as pm
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/__init__.py", line 41, in <module>
    from .distributions import *
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/distributions/__init__.py", line 15, in <module>
    from . import timeseries
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/distributions/timeseries.py", line 22, in <module>
    from pymc3.util import get_variable_name
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/util.py", line 21, in <module>
    from numpy import asscalar, ndarray
ImportError: cannot import name 'asscalar' from 'numpy' (/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/numpy/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/model/gp.py", line 15, in <module>
    from pcntoolkit.util.utils import squared_dist
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/util/__init__.py", line 1, in <module>
    from . import utils
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/util/utils.py", line 17, in <module>
    import pymc3 as pm
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/__init__.py", line 41, in <module>
    from .distributions import *
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/distributions/__init__.py", line 15, in <module>
    from . import timeseries
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/distributions/timeseries.py", line 22, in <module>
    from pymc3.util import get_variable_name
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/util.py", line 21, in <module>
    from numpy import asscalar, ndarray
ImportError: cannot import name 'asscalar' from 'numpy' (/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/numpy/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/__init__.py", line 1, in <module>
    from . import trendsurf
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/trendsurf.py", line 30, in <module>
    from model.bayesreg import BLR
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/model/__init__.py", line 2, in <module>
    from . import gp
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/model/gp.py", line 25, in <module>
    from util.utils import squared_dist
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/util/__init__.py", line 1, in <module>
    from . import utils
  File "/media/sda/Anna/PCNtoolkit/pcntoolkit/util/utils.py", line 17, in <module>
    import pymc3 as pm
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/__init__.py", line 41, in <module>
    from .distributions import *
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/distributions/__init__.py", line 15, in <module>
    from . import timeseries
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/distributions/timeseries.py", line 22, in <module>
    from pymc3.util import get_variable_name
  File "/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/pymc3-3.9.3-py3.8.egg/pymc3/util.py", line 21, in <module>
    from numpy import asscalar, ndarray
ImportError: cannot import name 'asscalar' from 'numpy' (/home/user/anaconda3/envs/gamlss/lib/python3.8/site-packages/numpy/__init__.py)
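The last line of each traceback is the informative one: the installed pymc3 tries to import asscalar from numpy, a function that NumPy deprecated in favour of the array method .item() and that newer NumPy releases no longer provide. The sketch below checks which situation applies in the current environment; it makes no claim about which version combination PCNtoolkit requires, and both helper names are hypothetical.

```python
import importlib


def numpy_asscalar_status():
    """Report whether the installed NumPy still provides asscalar."""
    try:
        np = importlib.import_module("numpy")
    except ImportError:
        return "numpy not installed"
    return "available" if hasattr(np, "asscalar") else "removed"


def to_scalar(a):
    """Replacement for asscalar(a): convert a size-1 array to a Python scalar."""
    return a.item()


if __name__ == "__main__":
    print("numpy.asscalar:", numpy_asscalar_status())
```

If the status is "removed", the options are an older NumPy in this environment or a newer release of the package that imports asscalar.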

AttributeError: 'NormBLR' object has no attribute 'theta0'

I was testing the code in https://amarquand.github.io/PCNtoolkit/doc/build/html/pages/BLR_normativemodel_protocol.html.

I got the error shown below:
Processing data in /normative modeling/ROI_models/lh_MeanThickness_thickness/resp_tr.txt
Estimating model 1 of 1
configuring BLR ( order 1 )
Using default hyperparameters
Traceback (most recent call last):
File "/home/.conda/envs/test/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3378, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
yhat_te, s2_te, nm, Z, metrics_te = estimate(cov_file_tr,
File "/home/.conda/envs/test/lib/python3.8/site-packages/pcntoolkit/normative.py", line 449, in estimate
nm = norm_init(Xz_tr, Yz_tr[:, i], alg=alg, **kwargs)
File "/home/.conda/envs/test/lib/python3.8/site-packages/pcntoolkit/normative_model/norm_utils.py", line 18, in norm_init
nm = NormBLR(X=X, y=y, theta=theta, **kwargs)
File "/home/.conda/envs/test/lib/python3.8/site-packages/pcntoolkit/normative_model/norm_blr.py", line 125, in __init__
self.theta = self.theta0

Can you help check with that?

split_nm function error?

Dear authors;

Firstly, let me thank you for the code and sharing!

I have been employing your code in the last days (basically just wrapping it in a docker) and I think there is a small mistake in the code.

The function split_nm inside normative_parallel.py shows the following at line 289:

respfile = pd.read_pickle(testrespfile_path)

but I guess it should be:

testrespfile = pd.read_pickle(testrespfile_path)

Otherwise, respfile and testrespfile are the same in the binary case, right?

Best!

Confusions about optimization outputs

While using the PCNtoolkit for GPR, the scipy.optimize.fmin_cg() function printed the results below:
[image]
The function being optimized is based on log p(y|X), so my understanding is that the "Current function value" should in theory be around zero. After the hyperparameter optimization, however, the "Current function value" is still very high. Is that normal and correct? I used the code and demo data offered on GitHub (https://github.com/saigerutherford/PCNtoolkit-demo), so I can't figure out which part goes wrong and hope you can clarify.
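One general point may help when reading such output: a negative log-likelihood is a sum over data points, so its minimum is not expected to be near zero and it grows with the sample size. The sketch below shows this with a plain independent Gaussian likelihood, which is a deliberate simplification and not the GP marginal likelihood the toolkit optimizes. Whether the value in the screenshot is reasonable cannot be judged from this alone.

```python
import math
import random


def gaussian_nll(y, mean=0.0, std=1.0):
    """Summed negative log-likelihood of y under independent Normal(mean, std**2)."""
    return sum(0.5 * math.log(2 * math.pi * std ** 2) + (v - mean) ** 2 / (2 * std ** 2)
               for v in y)


if __name__ == "__main__":
    random.seed(0)
    for n in (10, 100, 1000):
        # Data drawn from the very model being scored, i.e. a perfect fit.
        y = [random.gauss(0.0, 1.0) for _ in range(n)]
        print(n, gaussian_nll(y))
```

Even with data drawn from exactly the model being scored, the total keeps increasing as more points are added.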

Problem with applying warps to data

Hi again,

I am facing a follow-up problem to a previous issue (#114) with regard to the warping process. As noted there, I have amended my code as recommended by adding "warp='WarpSinArcsinh', warp_reparam= True" to the estimate function to perform warped BLR.

My aim with this analysis is to get centile curves of WMH and related markers as well as respective z-scores for downstream predictive modelling in a sample from two cohorts.

Currently I am trying to reproduce the centile curves for WMH volume with warped BLR as presented in this manuscript (https://www.sciencedirect.com/science/article/pii/S1053811921009873#fig0004), as it accounts for potential non-Gaussianity in the data. I am therefore following this tutorial: https://github.com/predictive-clinical-neuroscience/PCNtoolkit-demo/blob/main/tutorials/BLR_protocol/transfer_pretrained_normative_models.ipynb. The tutorial mentions that the predictions (yhat, S2) from the dummy model are in the warped space and need to be inversely warped before they can be plotted in the input space. Apparently, the true data (y_te) is also rescaled in the tutorial.

import os
import pickle

import numpy as np
import matplotlib.pyplot as plt
import pcntoolkit as ptk
from pcntoolkit.normative import evaluate

import seaborn as sns
sns.set(style='whitegrid')

# random jitter function
def rand_jitter(arr):
    stdev = .005 * (max(arr) - min(arr))
    return arr + np.random.randn(len(arr)) * stdev

for idp_num, c in enumerate(feature_columns):

    if c in tracts_wo_wmh: continue
    print('Running IDP', idp_num, c, ':')
    roi_path = rois_path/c
    os.chdir(roi_path)
    
    # load the true data points
    X_te = np.loadtxt(os.path.join(roi_path, 'cov_bspline.txt'))
    yhat_te = np.loadtxt(os.path.join(roi_path, 'yhat_estimate.txt'))[:, np.newaxis]
    s2_te = np.loadtxt(os.path.join(roi_path, 'ys2_estimate.txt'))[:,np.newaxis]
    y_te = np.loadtxt(os.path.join(roi_path, f'resp_{c}.txt'))[:,np.newaxis]
            
    # set up the covariates for the dummy data
    print('Making predictions with dummy covariates (for visualisation)')
    yhat, s2 = ptk.normative.predict(cov_file_dummy, 
                       alg = 'blr', 
                       respfile = None, 
                       model_path = os.path.join(roi_path,'Models'), 
                       binary=False,
                       outputsuffix = '_dummy',
                       saveoutput = True,
                       warp="WarpSinArcsinh",
                       warp_reparam=True)
    
    # load the normative model
    with open(os.path.join(roi_path,'Models', 'NM_0_0_estimate.pkl'), 'rb') as handle:
        nm = pickle.load(handle) 

    # get the warp and warp parameters
    W = nm.blr.warp
    warp_param = nm.blr.hyp[1:nm.blr.warp.get_n_params()+1] 
        
    # first, we warp predictions for the true data and compute evaluation metrics
    med_te = W.warp_predictions(np.squeeze(yhat_te), np.squeeze(s2_te), warp_param)[0]
    med_te = med_te[:, np.newaxis]
    print('metrics:', evaluate(y_te, med_te))
    
    # then, we warp dummy predictions to create the plots
    med, pr_int = W.warp_predictions(np.squeeze(yhat), np.squeeze(s2), warp_param)
    
    # extract the different variance components to visualise
    beta, junk1, junk2 = nm.blr._parse_hyps(nm.blr.hyp, X_dummy)
    s2n = 1/beta # variation (aleatoric uncertainty)
    s2s = s2-s2n # modelling uncertainty (epistemic uncertainty)
    
    # plot the data points
    y_te_rescaled_all = np.zeros_like(y_te)
    for sid, site in enumerate(site_ids_te):
        # plot the true test data points 
        if all(elem in site_ids_tr for elem in site_ids_te):
            # all data in the test set are present in the training set
            
            # first, we select the data points belonging to this particular sex and site
            idx = np.where(np.bitwise_and(X_te[:,2] == sex, X_te[:,sid+len(cols_cov)+1] !=0))[0]
            if len(idx) == 0:
                print('No data for site', sid, site, 'skipping...')
                continue
            
            # then directly adjust the data
            idx_dummy = np.bitwise_and(X_dummy[:,1] > X_te[idx,1].min(), X_dummy[:,1] < X_te[idx,1].max())
            y_te_rescaled = y_te[idx] - np.median(y_te[idx]) + np.median(med[idx_dummy])
        else:
            # we need to adjust the data based on the adaptation dataset 
            
            # first, select the data point belonging to this particular site
            idx = np.where(np.bitwise_and(X_te[:,2] == sex, (df_te['site'] == site).to_numpy()))[0]
            
            # load the adaptation data
            y_ad = load_2d(os.path.join(idp_dir, 'resp_ad.txt'))
            X_ad = load_2d(os.path.join(idp_dir, 'cov_bspline_ad.txt'))
            idx_a = np.where(np.bitwise_and(X_ad[:,2] == sex, (df_ad['site'] == site).to_numpy()))[0]
            if len(idx) < 2 or len(idx_a) < 2:
                print('Insufficient data for site', sid, site, 'skipping...')
                continue
            
            # adjust and rescale the data
            y_te_rescaled, s2_rescaled = nm.blr.predict_and_adjust(nm.blr.hyp, 
                                                                   X_ad[idx_a,:], 
                                                                   np.squeeze(y_ad[idx_a]), 
                                                                   Xs=None, 
                                                                   ys=np.squeeze(y_te[idx]))
        # plot the (adjusted) data points
        y_te_rescaled[y_te_rescaled < 0] = 0
        plt.scatter(rand_jitter(X_te[idx,1]), y_te_rescaled, s=4, color=clr, alpha = 0.1)
       
    # plot the median of the dummy data
    plt.plot(xx, med, clr)
    
    # fill the gaps in between the centiles
    junk, pr_int25 = W.warp_predictions(np.squeeze(yhat), np.squeeze(s2), warp_param, percentiles=[0.25,0.75])
    junk, pr_int95 = W.warp_predictions(np.squeeze(yhat), np.squeeze(s2), warp_param, percentiles=[0.05,0.95])
    junk, pr_int99 = W.warp_predictions(np.squeeze(yhat), np.squeeze(s2), warp_param, percentiles=[0.01,0.99])
    plt.fill_between(xx, pr_int25[:,0], pr_int25[:,1], alpha = 0.1,color=clr)
    plt.fill_between(xx, pr_int95[:,0], pr_int95[:,1], alpha = 0.1,color=clr)
    plt.fill_between(xx, pr_int99[:,0], pr_int99[:,1], alpha = 0.1,color=clr)
            
    # make the width of each centile proportional to the epistemic uncertainty
    junk, pr_int25l = W.warp_predictions(np.squeeze(yhat), np.squeeze(s2-0.5*s2s), warp_param, percentiles=[0.25,0.75])
    junk, pr_int95l = W.warp_predictions(np.squeeze(yhat), np.squeeze(s2-0.5*s2s), warp_param, percentiles=[0.05,0.95])
    junk, pr_int99l = W.warp_predictions(np.squeeze(yhat), np.squeeze(s2-0.5*s2s), warp_param, percentiles=[0.01,0.99])
    junk, pr_int25u = W.warp_predictions(np.squeeze(yhat), np.squeeze(s2+0.5*s2s), warp_param, percentiles=[0.25,0.75])
    junk, pr_int95u = W.warp_predictions(np.squeeze(yhat), np.squeeze(s2+0.5*s2s), warp_param, percentiles=[0.05,0.95])
    junk, pr_int99u = W.warp_predictions(np.squeeze(yhat), np.squeeze(s2+0.5*s2s), warp_param, percentiles=[0.01,0.99])    
    plt.fill_between(xx, pr_int25l[:,0], pr_int25u[:,0], alpha = 0.3,color=clr)
    plt.fill_between(xx, pr_int95l[:,0], pr_int95u[:,0], alpha = 0.3,color=clr)
    plt.fill_between(xx, pr_int99l[:,0], pr_int99u[:,0], alpha = 0.3,color=clr)
    plt.fill_between(xx, pr_int25l[:,1], pr_int25u[:,1], alpha = 0.3,color=clr)
    plt.fill_between(xx, pr_int95l[:,1], pr_int95u[:,1], alpha = 0.3,color=clr)
    plt.fill_between(xx, pr_int99l[:,1], pr_int99u[:,1], alpha = 0.3,color=clr)

    # plot actual centile lines
    plt.plot(xx, pr_int25[:,0],color=clr, linewidth=0.5)
    plt.plot(xx, pr_int25[:,1],color=clr, linewidth=0.5)
    plt.plot(xx, pr_int95[:,0],color=clr, linewidth=0.5)
    plt.plot(xx, pr_int95[:,1],color=clr, linewidth=0.5)
    plt.plot(xx, pr_int99[:,0],color=clr, linewidth=0.5)
    plt.plot(xx, pr_int99[:,1],color=clr, linewidth=0.5)
    
    plt.xlabel('Age')
    plt.ylabel(c) 
    plt.title(c)
    plt.xlim((xmin,xmax))
    plt.savefig(os.path.join(roi_path, 'centiles_' + str(sex)),  bbox_inches='tight')
    plt.show()
   
os.chdir(output_dir)

Now I am wondering whether rescaling the true data is necessary for my use case. When I apply this code, the resulting data are shifted in the y direction compared with a plot of the raw data for some of the imaging-derived phenotypes I am investigating. Furthermore, the centile curves do not look as I would expect.
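
For illustration, the direct adjustment in the plotting code above (`y_te[idx] - np.median(y_te[idx]) + np.median(med[idx_dummy])`) moves every point of a site by one constant: the median of the predicted centile curve minus the median of the raw data. A minimal sketch with made-up numbers, where the helper name `rescale_to_model_median` and all values are hypothetical and plain Python stands in for NumPy:

```python
from statistics import median

def rescale_to_model_median(y_raw, model_median_curve):
    """Shift raw values so that their median matches the median of the model curve."""
    offset = median(model_median_curve) - median(y_raw)
    return [y + offset for y in y_raw], offset

# hypothetical raw percentages for one site and a hypothetical predicted median curve
y_raw = [0.0, 10.0, 20.0, 60.0, 100.0]
model_median_curve = [35.0, 40.0, 45.0]

y_rescaled, offset = rescale_to_model_median(y_raw, model_median_curve)
print("offset:", offset)
print("rescaled range:", min(y_rescaled), "to", max(y_rescaled))
```

In this toy case, values bounded by 0 and 100 end up between 20 and 120, so a constant offset of this kind is one possible explanation for points leaving the original value range.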

Here is the plot produced by the code above for the WMH volume, which looks as expected.
centiles_0

Here is a simple scatter plot of the raw data.
scatter_imaging_wmh_volume

The problem is apparent when looking at other imaging-derived phenotypes.

Disconnectivity of the arcuate fascicle in percent using the abovementioned plotting code. Note that the data is shifted upwards (y interval is 20-120% instead of 0-100%).
centiles_0

And the corresponding raw data plot.
scatter_AF_R

The same plots for the peak width of skeletonized mean diffusivity (PSMD).
centiles_0
scatter_peak_width_of_skeletonized_mean_diffusivity

Are there assumptions that need to be met to apply warped BLR when modelling a variable? For example, is non-Gaussianity of the residuals required? One difference from the tutorials is that I use 2-fold cross-validation via estimate() to get z-scores for all individuals. I noticed that the resulting centile curves differ noticeably when I rerun the analysis, perhaps because subjects are randomly assigned to the training and test sets during CV?
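
To make the suspicion about random fold assignment concrete, here is a minimal sketch of a 2-fold split that depends only on a seed. The function `two_fold_split` is hypothetical and is not part of PCNtoolkit; whether estimate() exposes a seed is not something this thread establishes.

```python
import random

def two_fold_split(n_subjects, seed):
    """Return two disjoint index lists that together cover range(n_subjects)."""
    rng = random.Random(seed)  # local generator, so the split depends only on `seed`
    indices = list(range(n_subjects))
    rng.shuffle(indices)
    half = n_subjects // 2
    return sorted(indices[:half]), sorted(indices[half:])

fold_a, fold_b = two_fold_split(10, seed=42)
fold_a_again, fold_b_again = two_fold_split(10, seed=42)

# the same seed reproduces the same assignment of subjects to folds
print(fold_a == fold_a_again and fold_b == fold_b_again)
```

If the folds change between runs, each model is trained on a different half of the data, so some run-to-run variation in the fitted centiles would be expected.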

The complete code I use can be found in this jupyter notebook: https://drive.google.com/file/d/1p0jHzDC832yVKWd7p0PULgfhbnnYUi0F/view?usp=sharing.

I would be very grateful to get some help. Happy to provide further information if required.

Thanks a lot in advance!

Marvin

HBR model estimation failing: <class 'IndexError'> normative.py 428

I've been getting the error below when running HBR estimate. I first got the error with my own data, but since I could not work out why it was failing, I ran the notebook from the HBR demo completely unchanged, which reproduces exactly the same error.

The models run without error when no batch effects are specified, so I thought it might be something in the estimate function, but I am not sure where or why.

I've run into the same error in a fresh conda environment on an Ubuntu instance and also locally on my Mac.

Thanks for your help!

ptk.normative.estimate(covfile=covfile, 
                       respfile=respfile,
                       tsbefile=tsbefile, 
                       trbefile=trbefile, 
                       alg='hbr', 
                       log_path=log_dir, 
                       binary=True,
                       output_path=output_path, testcov= testcovfile_path,
                       testresp = testrespfile_path,
                       outputsuffix=outputsuffix, savemodel=True)
Processing data in /Users/jakepalmer/Desktop/PCNtoolkit-demo/tutorials/HBR_FCON/HBR_demo/Y_train.pkl
Estimating model  1 of 2
Model  1 of 2 FAILED!..skipping and writing NaN to outputs
Exception:
index out of bounds
<class 'IndexError'> normative.py 428
Estimating model  2 of 2
Model  2 of 2 FAILED!..skipping and writing NaN to outputs
Exception:
index out of bounds
<class 'IndexError'> normative.py 428
Saving model meta-data...
Evaluating the model ...
Writing outputs ...

error in importing theano

Hi, when I import theano I get the following error: ImportError: cannot import name 'is_same_graph' from 'theano.gof.toolbox'. I also tried reinstalling theano from scratch but still cannot fix this issue. Does it depend on the theano installation or on something else?

Numerical issue and bug

The function "squared_dist" in "utils.py" causes numerical issues when x and z are large but their difference is small. I used the function "distance_matrix" from scipy instead.
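
A minimal sketch of the kind of cancellation meant here, assuming that squared_dist uses the common expansion x·x + z·z - 2 x·z (an assumption; the toolkit's code is not reproduced here). The helper names and the values are hypothetical, and plain Python floats stand in for NumPy arrays:

```python
def sq_dist_expanded(x, z):
    """Squared distance via the expansion x.x + z.z - 2 x.z (prone to cancellation)."""
    xx = sum(a * a for a in x)
    zz = sum(b * b for b in z)
    xz = sum(a * b for a, b in zip(x, z))
    return xx + zz - 2.0 * xz

def sq_dist_direct(x, z):
    """Squared distance computed from the differences themselves."""
    return sum((a - b) ** 2 for a, b in zip(x, z))

# large values with a small difference; the true squared distance is about 1e-6
x = [1e8]
z = [1e8 + 1e-3]

print("expanded:", sq_dist_expanded(x, z))
print("direct:  ", sq_dist_direct(x, z))
```

The expanded form subtracts two numbers of order 1e16, where double precision cannot resolve anything as small as 1e-6, whereas differencing first keeps the result accurate.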

Line 209 in "normative.py" should be:
nz = np.where(np.bitwise_and(np.isfinite(Y).all(axis=0),np.var(Y, axis=0) != 0))[0]

problem when using 'predict' for a HBR model

Hi,
I am using PCNtoolkit 0.28 to estimate an HBR normative model on data from healthy participants and then predict the 'yhat', 's2' and 'Z' values for patient data from two sites that were present in the training step, but the following error occurs:

Loading data ...
Prediction by model 1 of 70

Sampling: [y_like]

100.00% [4000/4000 00:02<00:00]


ValueError Traceback (most recent call last)
Cell In[29], line 2
1 from pcntoolkit.normative import predict
----> 2 yhat_new, s2_new = predict(covfile = covfile_new,
3 alg = 'hbr',
4 respfile = respfile_new,
5 tsbefile = tsbefile_new,
6 model_path = os.path.join(processing_dir,'Models'),
7 output_path = output_path,
8 outputsuffix = '_predictnew')

File ~\anaconda3\envs\nor_PCN\Lib\site-packages\pcntoolkit\normative.py:766, in predict(covfile, respfile, maskfile, **kwargs)
763 yhat, s2 = nm.predict_on_new_sites(Xz, batch_effects_test)
765 if outscaler == 'standardize':
--> 766 Yhat[:, i] = scaler_resp[fold].inverse_transform(yhat, index=i)
767 S2[:, i] = s2.squeeze() * sY[fold][i]**2
768 elif outscaler in ['minmax', 'robminmax']:

ValueError: could not broadcast input array from shape (559,) into shape (1077,)

By the way, 559 is the size of the test data I used when estimating the model, and 1077 is the size of the data I need to predict. I checked the sizes of covfile_new, respfile_new and tsbefile_new; they all contain 1077 subjects.

This is the code I use for prediction:

from pcntoolkit.normative import predict
yhat_new, s2_new, Z_new = predict(covfile = covfile_new,
alg = 'hbr',
respfile = respfile_new,
tsbefile = tsbefile_new,
model_path = os.path.join(processing_dir,'Models'),
output_path = output_path,
outputsuffix = '_predictnew')

I would be grateful if anyone could help me with this :)

Failed to import pcntoolkit with numpy>=1.23

Hello,

pcntoolkit fails to import with numpy version 1.23 or later (see full traceback below). Downgrading to numpy==1.22.4 fixes the issue. It looks like pymc==3.9.3 calls numpy.asscalar, which was removed from the numpy library in version 1.23 (it's deprecated but still present in 1.22).

Traceback (most recent call last):
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/model/gp.py", line 15, in <module>
    from pcntoolkit.util.utils import squared_dist
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/util/__init__.py", line 1, in <module>
    from . import utils
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/util/utils.py", line 17, in <module>
    import pymc3 as pm
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/__init__.py", line 41, in <module>
    from .distributions import *
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/distributions/__init__.py", line 15, in <module>
    from . import timeseries
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/distributions/timeseries.py", line 22, in <module>
    from pymc3.util import get_variable_name
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/util.py", line 21, in <module>
    from numpy import asscalar, ndarray
ImportError: cannot import name 'asscalar' from 'numpy' (/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/numpy/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/trendsurf.py", line 21, in <module>
    from pcntoolkit.model.bayesreg import BLR
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/model/__init__.py", line 2, in <module>
    from . import gp
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/model/gp.py", line 25, in <module>
    from util.utils import squared_dist
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/util/__init__.py", line 1, in <module>
    from . import utils
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/util/utils.py", line 17, in <module>
    import pymc3 as pm
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/__init__.py", line 41, in <module>
    from .distributions import *
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/distributions/__init__.py", line 15, in <module>
    from . import timeseries
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/distributions/timeseries.py", line 22, in <module>
    from pymc3.util import get_variable_name
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/util.py", line 21, in <module>
    from numpy import asscalar, ndarray
ImportError: cannot import name 'asscalar' from 'numpy' (/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/numpy/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/model/gp.py", line 15, in <module>
    from pcntoolkit.util.utils import squared_dist
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/util/__init__.py", line 1, in <module>
    from . import utils
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/util/utils.py", line 17, in <module>
    import pymc3 as pm
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/__init__.py", line 41, in <module>
    from .distributions import *
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/distributions/__init__.py", line 15, in <module>
    from . import timeseries
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/distributions/timeseries.py", line 22, in <module>
    from pymc3.util import get_variable_name
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/util.py", line 21, in <module>
    from numpy import asscalar, ndarray
ImportError: cannot import name 'asscalar' from 'numpy' (/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/numpy/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/__init__.py", line 1, in <module>
    from . import trendsurf
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/trendsurf.py", line 30, in <module>
    from model.bayesreg import BLR
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/model/__init__.py", line 2, in <module>
    from . import gp
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/model/gp.py", line 25, in <module>
    from util.utils import squared_dist
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/util/__init__.py", line 1, in <module>
    from . import utils
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pcntoolkit/util/utils.py", line 17, in <module>
    import pymc3 as pm
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/__init__.py", line 41, in <module>
    from .distributions import *
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/distributions/__init__.py", line 15, in <module>
    from . import timeseries
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/distributions/timeseries.py", line 22, in <module>
    from pymc3.util import get_variable_name
  File "/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/pymc3/util.py", line 21, in <module>
    from numpy import asscalar, ndarray
ImportError: cannot import name 'asscalar' from 'numpy' (/opt/homebrew/anaconda3/envs/norm-test-conda/lib/python3.10/site-packages/numpy/__init__.py)
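
As a rough pre-flight check based on the description above, the following sketch decides from a version string whether `numpy.asscalar` should still be importable. The helper names are hypothetical, and the parsing only looks at the leading numeric parts of the version.

```python
def version_tuple(version):
    """Turn '1.22.4' into (1, 22, 4), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def numpy_has_asscalar(numpy_version):
    """asscalar is reported above as present up to 1.22 and removed in 1.23."""
    return version_tuple(numpy_version) < (1, 23)

print(numpy_has_asscalar("1.22.4"))  # True
print(numpy_has_asscalar("1.23.0"))  # False
```

In a real environment the argument would be `numpy.__version__`; the check only tells you whether the pymc3 import shown in the traceback can be expected to fail.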

Bug when loading PCN

Hi,

During the (local) installation, I encountered an error when trying to load the PCN module in Python:

>>> import pcntoolkit as pcn
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ROBARTS/rhaast/PCNtoolkit/pcntoolkit/__init__.py", line 1, in <module>
    from . import trendsurf
  File "/home/ROBARTS/rhaast/PCNtoolkit/pcntoolkit/trendsurf.py", line 21, in <module>
    from pcntoolkit.model.bayesreg import BLR
  File "/home/ROBARTS/rhaast/PCNtoolkit/pcntoolkit/model/__init__.py", line 6, in <module>
    from . import hbr
  File "<fstring>", line 1
    (dist=)

I was able to fix this by removing = from the expression in the curly braces on lines

print(f"{dist=}")

and
print(f"{params=}")

After this, the PCN module loads fine.
Roy
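
For context, the `=` specifier inside f-string braces was introduced in Python 3.8, so older interpreters reject it at import time. A minimal sketch of an equivalent that also parses on earlier versions; the helper `debug_repr` and the example values are hypothetical:

```python
def debug_repr(name, value):
    """Build the text that the f-string '=' specifier yields on Python 3.8+."""
    return "{}={!r}".format(name, value)

dist = "Normal"       # hypothetical value
params = {"mu": 0.0}  # hypothetical value

print(debug_repr("dist", dist))      # dist='Normal'
print(debug_repr("params", params))  # params={'mu': 0.0}
```

This keeps the debugging output without removing the print statements altogether.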

Interpretation of CI in GPR model

Hello! First of all, thanks for creating these wonderful tools.

I have what may be more of a conceptual question related to the interpretation of the confidence intervals produced by the software rather than to the technical implementation. But please bear with me.

Here it is: In the GPR example there are about 650 participants that are being analyzed. When producing 99% CIs, I would have expected to have about 1% of the subjects outside of these bounds, but in the figures that are produced in the example, there are clearly more than 7 participants that are outside of the CI in each of the ROIs that are displayed. What does that mean? Is my expectation of 1% participants outside of the 99% CI misplaced? Are these bounds somehow smaller than I would have expected, or need some kind of correction to simultaneously cover the full range (a la this for GAMs)?
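
To put a number on that expectation, here is a small sketch that treats each participant as an independent draw with a 1% chance of falling outside a pointwise 99% interval. Both the independence and the perfect calibration are assumptions made only for illustration, and `prob_more_than` is a hypothetical helper.

```python
from math import comb

def prob_more_than(k, n, p):
    """P(X > k) for X ~ Binomial(n, p), via the complement of the CDF."""
    cdf = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))
    return 1.0 - cdf

n_subjects = 650
p_outside = 0.01

expected_outside = n_subjects * p_outside
print("expected number outside:", expected_outside)
print("P(more than 7 outside): ", prob_more_than(7, n_subjects, p_outside))
print("P(more than 15 outside):", prob_more_than(15, n_subjects, p_outside))
```

Under these assumptions, a few more than 7 participants outside the bounds would not be surprising on its own, while a much larger count would point towards miscalibration or a different interpretation of the interval.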

Thanks in advance for any thoughts you have about this!

about GPR model problem

Hi,
I have encountered a problem with the GPR model. I use the estimate function to save the model. If I want to use the saved model for prediction on a new patient dataset, how can I pass the saved model path to the estimate function?

thanks

a question about transfer

Hello, I'm using pcntoolkit version 0.28. When I use the transfer function in pcntoolkit.normative, the results are inconsistent between runs. Sometimes the results are fine, but at other times the SMSE_transfer value becomes unrealistically large. Other results, including z-scores, are also incorrect.

I've debugged my program and identified the problem. On lines 998 and 1007 of pcntoolkit.normative.transfer, there are calls to estimate_on_new_sites and predict_on_new_sites, respectively. However, at this point the parameter X passed to both functions is identical, which leads to incorrect results.

I've attached an image and provided my program's code below. Could you please help me identify what might be causing this issue?
image
image
image

Installation requires Theano>=1.0.5

I got the following error after running python3 setup.py install:

Installed /users/jflournoy/.conda/envs/normative_modeling/lib/python3.8/site-packages/Theano-1.0.4-py3.8.egg
error: Theano 1.0.4 is installed but theano>=1.0.5 is required by {'pymc3'}

Changing the version required in the setup.cfg solved the issue, I think.

Add FAQ page

The documentation also has an empty page currently for Frequently Asked Questions (FAQ). We should brainstorm what questions are typically asked by new users and fill in this page (docs/source/pages/FAQ.rst) accordingly.

Standardisation with warping

Hello,

First, thank you for the great work on this toolbox!
I'm using Bayesian Linear Regression with SinhArcsinh warping for my project, with FreeSurfer outputs from brain MRIs as response variables.

I wanted to standardise my covariates and response variable and tried the outscaler and inscaler arguments available in the toolbox. I ran into some errors before noticing the comments "TODO: Warping for scaled data" and "outscaler not yet supported warping" in the estimate function of the toolbox.

So I wonder:

  1. Do you recommend standardizing the data, or is it unnecessary in this case?
  2. Can I get around the problem by standardizing my data before calling the toolbox functions?

Thank you very much for your help!

Problem calculating z-score in predict

Hello author, when I was debugging the normative.predict function, I found that the steps for calculating z-score were inconsistent with what was written in the paper. When calculating S2 in the code, two variances were multiplied, while what was written in the paper was the multiplication of two variances. Plus, I want to confirm with you which calculation is correct.

What is presented in the code is this:
image
What is presented in the paper is this:
image

Looking forward to your reply!

conflict with python==3.10

Hello,

There appears to be a dependency issue when following the Basic Installation instructions.

Creating an environment without specifying the Python version in a recent conda distribution (22.9.0) creates the environment using Python 3.10 by default. The command conda install pip pandas scipy runs without errors. However, when I then run pip install pcntoolkit, it results in the following error:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
fsleyes 1.5.0 requires pyparsing==2.*, but you have pyparsing 3.0.9 which is incompatible.

When I try to install the requested pyparsing version with pip install pyparsing==2, I get the following error:

      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-5pqi43ke/pyparsing_b012152b99344c788507293b4939c673/setup.py", line 9, in <module>
          from pyparsing import __version__ as pyparsing_version
        File "/tmp/pip-install-5pqi43ke/pyparsing_b012152b99344c788507293b4939c673/pyparsing.py", line 573, in <module>
          collections.MutableMapping.register(ParseResults)
      AttributeError: module 'collections' has no attribute 'MutableMapping'

which I believe is caused by a change in the built-in collections module in Python 3.10. I did not have this issue when I used Python 3.9 or 3.8.
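
The change in question can be sketched as follows: the abstract base classes live in `collections.abc`, and the old alias on `collections` itself is gone from Python 3.10 onwards. The class `ParseResultsStandIn` is a hypothetical stand-in, not pyparsing's actual class.

```python
import collections
import collections.abc
import sys

class ParseResultsStandIn:
    """Hypothetical stand-in for the class that pyparsing registers."""

# registering through collections.abc works on Python 3.10 and on earlier 3.x versions
collections.abc.MutableMapping.register(ParseResultsStandIn)

# the old alias collections.MutableMapping only exists before Python 3.10
old_alias_present = hasattr(collections, "MutableMapping")

print(sys.version_info[:2], "old alias present:", old_alias_present)
print(issubclass(ParseResultsStandIn, collections.abc.MutableMapping))
```

The pinned pyparsing release in the traceback still uses the old alias, which is why building it fails on Python 3.10.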

Proposed solution

Adding python_requires='<3.10', into the setup function in setup.py and specifying the environment creation in the docs as conda create --name <env_name> "python<3.10" can solve this issue for users until the dependencies change their requirements.

I was going to make a pull request, however the Sphinx version on my machine renders html slightly differently, leading to an unnecessary amount of diff lines.

Thanks,
Can

Error when testing up PCNtoolkit-0.18

After installing, I get the following when trying to load the package

-bash-4.2$ python -c "import pcntoolkit as pk;print(pk.__file__)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/cluster/projects/p33/conda/normative_modeling/0.18/PCNtoolkit-0.18/pcntoolkit/__init__.py", line 3, in <module>
    from . import gp
  File "/cluster/projects/p33/conda/normative_modeling/0.18/PCNtoolkit-0.18/pcntoolkit/gp.py", line 15, in <module>
    from pcntoolkit.utils import squared_dist
  File "/cluster/projects/p33/conda/normative_modeling/0.18/PCNtoolkit-0.18/pcntoolkit/utils.py", line 17, in <module>
    import pymc3 as pm
  File "/cluster/projects/p33/conda/normative_modeling/0.18/lib/python3.7/site-packages/pymc3-3.11.1-py3.7.egg/pymc3/__init__.py", line 39, in <module>
    __set_compiler_flags()
  File "/cluster/projects/p33/conda/normative_modeling/0.18/lib/python3.7/site-packages/pymc3-3.11.1-py3.7.egg/pymc3/__init__.py", line 35, in __set_compiler_flags
    current = theano.config.gcc__cxxflags
AttributeError: 'TheanoConfigParser' object has no attribute 'gcc__cxxflags'

Problem in apply_normative_models.ipynb with colab

Hi,

I tried to run apply_normative_models.ipynb in Colab. When I run this part:

from pcntoolkit.normative import estimate, predict, evaluate

I got:

RuntimeError:
Could not import 'mkl'. If you are using conda, update the numpy
packages to the latest build otherwise, set MKL_THREADING_LAYER=GNU in
your environment for MKL 2018.

If you have MKL 2017 install and are not in a conda environment you
can set the Theano flag blas.check_openmp to False. Be warned that if
you set this flag and don't set the appropriate environment or make
sure you have the right version you will get wrong results.

This was still working a few days ago. I am not sure whether it happens because of recent updates in Colab.
I tried installing several different versions of mkl again, but that did not work.
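
One thing the error message itself suggests is setting MKL_THREADING_LAYER=GNU. A minimal sketch of doing that from inside a notebook cell; whether this resolves the Colab problem is not confirmed in this thread.

```python
import os

# Must run before theano (and therefore pcntoolkit) is imported for the first time,
# because the variable is read when the library loads.
os.environ["MKL_THREADING_LAYER"] = "GNU"

print(os.environ["MKL_THREADING_LAYER"])

# The import from the notebook would follow here:
# from pcntoolkit.normative import estimate, predict, evaluate
```

If theano was already imported in the running session, the runtime would need to be restarted for the setting to have any effect.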

Syntax error in normative.py (and a fix)

Hi!
I am not sure if this is a specific python version problem that does not accept this syntax, but I followed the tutorial on readthedocs which uses python 3.7.7 in a new conda environment, so I did this too.
After installing the required additional packages, trying to import pcntoolkit gave me this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../pcntoolkit/__init__.py", line 2, in <module>
    from . import normative
  File "<fstring>", line 1
    (kwargs=)
           ^
SyntaxError: invalid syntax

This seems to originate from line 863 in pcntoolkit/normative.py, print(f'{kwargs=}'). After commenting this print statement out, I can import without problems.
