ramp-kits / autism
Data Challenge on Autism Spectrum Disorder detection
Home Page: https://paris-saclay-cds.github.io/autism_challenge/
Typical neuroimaging pipelines work from images. Hence it would be interesting to turn the extracted signals into images.
Will the top 10 submissions or the top 10 teams win the prizes?
Hi
In the anatomical features for parts of the brain in the LH and RH, I can spot the area and thickness measures (on running the starter kit), but it is not clear to me which labels correspond to the volumes of each of these regions.
The thickness and area labels are named like anatomy_lh_thickness and anatomy_lh_area for lh and rh. I can see some volume measures containing the 'Vol' substring, and there are others called 'holes'. Other than the area and thickness measures, the volume measures seem fewer in number (FreeSurfer says there are 40 subcortical regions). Can someone please explain how to interpret the volume and holes measures?
On the webpage, ramp_test_workflow is mentioned. However, I am not sure whether executing this workflow means running the Jupyter notebook, executing ramp_test_submission at the end of the notebook, or executing something else that would be found in the Ramp-Workflow repository.
Hi, I've signed up to ramp.studio but when I click on the event link on https://ramp.studio/problems/autism, it says "no event named "autism""
Some submissions may need specific dependencies. What is the proper way to make sure that they are met?
Hi, is it against the rules to use information from other datasets, for example, to help with feature selection?
Add the gcn package to the standard environment:
gcn: https://github.com/parisots/gcn
and the dependencies of this package:
tensorflow (>0.12)
networkx
joblib
Hi,
I was wondering if you could provide a couple more files so one can work with the cortical surfaces (*h.white, *h.pial, *h.sphere, *h.sphere.reg and talairach.xfm), as for now features describing the shape of the brain cannot be incorporated into the model, and it has previously been shown that this is relevant in autism.
Thanks a lot,
Pierre
Since I am more efficient working in R than in Python, can I implement my algorithms in R/RStudio (including replicating the starter kit)? Please advise.
It is said in the doc that "For each subject, we preliminary extracted signals using different brain parcellations and atlases and accounting for motion correction and confounds.", but while going through the script preprocessing/extract_time_series.py, it looks like the saved .csv files are obtained without confound regression:
https://github.com/ramp-kits/autism/blob/master/preprocessing/extract_time_series.py#L233
Am I misunderstanding something?
Dear all,
First of all, thank you for your efforts and publishing the dataset.
I am just wondering whether or not the WM/CSF signals are regressed out. It's pretty common practice for resting-state fMRI, if I'm not mistaken.
Thanks,
Makis
I'm using Arch Linux with miniconda from the AUR: how many gigabytes did you need to build this environment?
When I did some feature selection in FeatureExtractor() or Classifier() based on some relationship between X and y, I realized that the different X_split and y_split in each CV fold make the selected features different for each fold. This gives a different AUC. So I was thinking: would it be better to run FeatureExtractor() outside of cross-validation?
Something like:
features = FeatureExtractor()
X_new = features.fit_transform(X, y)
cross_validate(Classifier(), X_new, y)
I'm not a hundred percent sure about this. Maybe if the feature selection is stable enough, the variance from this could be ignored.
When I tried to build FC maps with motion regressed out and compute graph measures for the FC, it took 10 minutes to finish the feature extraction. If FeatureExtractor() were outside the cross-validation process, it would save a lot of time. This is just a small issue, as I gave up building the FC myself and calculating the graphs.
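For comparison, here is a minimal sketch of keeping the selection step inside cross-validation, using SelectKBest and LogisticRegression as stand-ins for the challenge's FeatureExtractor and Classifier (which this code does not reproduce). Because the selector sits inside the Pipeline, it is refit on each training split, so the test fold never leaks into the feature selection — at the cost of re-running the extraction per fold:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline

# Toy data standing in for the challenge features.
X, y = make_classification(n_samples=100, n_features=50, random_state=0)

# Selection + classification chained, so selection is redone per CV fold.
pipe = make_pipeline(SelectKBest(f_classif, k=10),
                     LogisticRegression(solver="liblinear"))
results = cross_validate(pipe, X, y, cv=5, scoring="roc_auc")
print(results["test_score"].mean())
```

Fitting the selector once on all of (X, y) before cross_validate, as in the snippet above, lets the held-out folds influence the selection and typically inflates the AUC.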
Hello,
Thank you for organizing this challenge in the first place; it has been a great dataset to use. I know that in issue #36 you said you would not release additional metadata while the challenge was running, but I was wondering if you would be able to now that the challenge has closed. I know you have made the TR available already. Specifically, I was wondering if you could make the following parameters available:
- TE for the different imaging sites
- Were eyes open or closed, and were blink artifacts removed?
- What voxel sizes were used in the structural imaging?
- What was the flip angle of the imaging sequence?
Thank you again,
-Cooper Mellema
Hi
I was wondering whether you can give more details about the correspondence between time series and brain regions for a given atlas.
As far as I know, I can get a DataFrame object by using pandas to load a subject's fMRI data (*.csv files). I think the columns are time series extracted from brain regions; is that right? Besides, given a specific time series, how do I get the corresponding anatomical label?
I am considering using prior knowledge to do feature selection. If you can answer these questions, I will be very grateful. Thank you. @GaelVaroquaux @glemaitre
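Assuming the column order in each subject's .csv follows the region order of the atlas (which is what the question is asking the organizers to confirm), column i would map to the i-th atlas label. A hypothetical sketch with made-up labels and toy data standing in for one subject's signals:

```python
import pandas as pd

# Stand-in labels; for a real atlas you would take them from the
# atlas metadata (e.g. the labels shipped with nilearn's atlases).
atlas_labels = ["region_%d" % i for i in range(3)]

# Toy time series standing in for one subject's extracted signals:
# rows are time points, columns are regions.
ts = pd.DataFrame([[0.1, 0.2, 0.3],
                   [0.4, 0.5, 0.6]])
ts.columns = atlas_labels  # column i -> i-th atlas region
print(ts.columns.tolist())
```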
Dear all,
first of all thanks for organizing the challenge, it's a blast getting started with this :)
I was looking into creating features from the resting-state data (not necessarily based on functional connectivity) and saw that the time series are of different lengths, which is definitely understandable in a clinical setting with different sites.
However, now I am wondering whether the data were acquired using the same BOLD sequences, or whether there is some information on acquisition parameters.
Especially the TR, which would be necessary/useful for applying some temporal filtering on the data or using temporal features.
Best,
Simon
(I hope I just didn't overlook the info somewhere...)
We should make download_data.py more than a script: create a function fetch_fmri_time_series(atlas='all') which can be called in __main__ to be executed.
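A minimal sketch of the proposed refactor, with the download step left as a comment and the atlas names only illustrative (the real list is whatever the script currently handles):

```python
# Illustrative atlas names; the actual list lives in download_data.py.
ATLASES = ['basc064', 'basc122', 'basc197', 'craddock_scorr_mean',
           'harvard_oxford_cort_prob_2mm', 'msdl', 'power_2011']


def fetch_fmri_time_series(atlas='all'):
    """Fetch the time series for one atlas, or all of them."""
    atlases = ATLASES if atlas == 'all' else [atlas]
    for name in atlases:
        # _download_fmri_data(name)  # actual download, as in the script
        pass
    return atlases


if __name__ == '__main__':
    fetch_fmri_time_series(atlas='all')
```

This keeps `python download_data.py` working unchanged while letting the starting kit (or a notebook) import and call `fetch_fmri_time_series('basc122')` directly.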
How should we deal with the best_submission
branch?
Should we keep it as an independent branch and merge master into it or merge it to the master branch?
Also, two of the best submissions now fail because of changes to the scikit-learn LogisticRegression classifier. Should we freeze the scikit-learn version in order to maintain compatibility?
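If we do freeze it, one way would be to pin the version in environment.yml; the exact version number below is an assumption and should be whichever release the submissions were validated against:

```yaml
# environment.yml (fragment) -- pin scikit-learn so old submissions keep running
dependencies:
  - scikit-learn=0.19.2
```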
Hi,
I just wanted to ask what will happen to these data after the challenge? Also, is it possible to use these data for other purposes? For example, suppose one comes up with some novel idea during the challenge and would like to publish that (in collaboration with the owners of the data). Would it be possible?
Br,
Tuomas
I think that we want to avoid mail as much as possible and direct people to ask questions on the github issues.
Hence the last section of the notebook should be changed.
I am not doing it for fear of generating conflicts because of different Jupyter versions.
The data shared is a set of signals extracted on brain parcellations. However, it might be interesting to work from raw MR images. Are these available?
Can somebody with a runnable virtual environment list the version of Python and the versions of all the dependencies? I have had many issues trying to find the right versions of all the packages. Two years of updates did the damage.
We need to update the public and private data using this atlas.
Now that the competition is over, is the source code available, just to reproduce the results and see how it works?
Please update the h5py module on the server. I get this "training_error" and I don't know what the reason is; it works in the local test. I hope to solve it with an update of the h5py module. Can you help me solve it, or suggest some other solutions?
Thanks for posting such an interesting challenge. I am currently working as a Ph.D. student in the Digital Health group (https://hpi.de//boettinger/home.html). I was wondering whether the results we obtain during the whole process (insights + analysis + results) can be published in collaboration with the people who originally collected and put together the data.
Looking forward to your reply.
I am thinking of using a DL framework, but I am wondering whether such a framework is allowed or installed on the server.
python download_data.py basc122
Downloading the data from https://storage.ramp.studio/autism/basc122.zip ...
Traceback (most recent call last):
File "download_data.py", line 170, in <module>
fetch_fmri_time_series(atlas)
File "download_data.py", line 153, in fetch_fmri_time_series
_check_integrity_atlas(atlas)
File "download_data.py", line 106, in _check_integrity_atlas
_download_fmri_data(atlas)
File "download_data.py", line 77, in _download_fmri_data
_check_and_unzip(output_file, atlas, atlas_directory)
File "download_data.py", line 62, in _check_and_unzip
raise IOError('The file downloaded was corrupted. Try again '
Is it possible to have a correspondence dictionary between the .csv columns and ROI localizations? I.e., column 1 is the mean value for an ROI in an area of the prefrontal cortex. Probably the original atlas (or the version inside nilearn) has this information, but it would be nice to have the .nii.gz file to visually exploit spatial localization.
This could be useful to perform seed analyses, where the seed localization is important. I saw your response in #7 but don't know whether it would work (using nilearn's to_filename function). My concern is whether the ROI with value 1 belongs to column 1 of the .csv file.
Thanks for all the help!
This is an issue with the RAMP website. There is a problem called autism, but no event.
Clicking on the link:
leads to:
Based on the instructions, my understanding is that we have to provide the two basic functions, FeatureExtractor() and Classifier(). I would like to access the whole dataset and exclude some of the samples, so afterwards I would have to exclude their corresponding labels as well. I can exclude the data based on the condition each time FeatureExtractor is called, but I can't do the same for the labels through it. So my question is whether all the commands will be executed before FeatureExtractor is called (because that would solve my problem) or not.
Jupyter is mentioned in README.md but not in requirements.txt.
Hi
Can you please tell us where we can find some of the best solutions submitted for this challenge? Can you please post the winning code and approaches?
Hi,
I was just wondering whether there is somewhere a more detailed and structured description of the data features? For example, what is participants_site?
Thanks
Hello,
Would it be possible to have more information about the "motions" measures? Indeed, they don't seem to come from a parcellation technique as described in the project documentation.
Regards,
Vincent
Hi,
I am working on the competition data after attending the challenge. May I ask a quick question regarding preprocessing? Would you mind specifying what kind of preprocessing you did for the rs-fMRI and T1 data (especially for rs-fMRI)? For example, specific parameters for slice timing, realignment, registration, smoothing, frequency band, etc.
Looking forward to hearing from you. Thanks a lot!
Best,
Jongwoo Choi
Hi,
I am trying to run the starting kit but got the following error from the evaluation function:
results = evaluation(data_train, labels_train)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-21-1247bf8432df> in <module>()
----> 1 results = evaluation(data_train, labels_train)
2
3 print("Training score ROC-AUC: {:.3f} +- {:.3f}".format(np.mean(results['train_roc_auc']),
4 np.std(results['train_roc_auc'])))
5 print("Validation score ROC-AUC: {:.3f} +- {:.3f} \n".format(np.mean(results['test_roc_auc']),
<ipython-input-17-e7b8911b304f> in evaluation(X, y)
8 results = cross_validate(pipe, X, y, scoring=['roc_auc', 'accuracy'], cv=cv,
9 verbose=1, return_train_score=True,
---> 10 n_jobs=1)
11
12 return results
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/model_selection/_validation.pyc in cross_validate(estimator, X, y, groups, scoring, cv, n_jobs, verbose, fit_params, pre_dispatch, return_train_score)
204 fit_params, return_train_score=return_train_score,
205 return_times=True)
--> 206 for train, test in cv.split(X, y, groups))
207
208 if return_train_score:
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self, iterable)
777 # was dispatched. In particular this covers the edge
778 # case of Parallel used with an exhausted iterator.
--> 779 while self.dispatch_one_batch(iterator):
780 self._iterating = True
781 else:
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in dispatch_one_batch(self, iterator)
623 return False
624 else:
--> 625 self._dispatch(tasks)
626 return True
627
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in _dispatch(self, batch)
586 dispatch_timestamp = time.time()
587 cb = BatchCompletionCallBack(dispatch_timestamp, len(batch), self)
--> 588 job = self._backend.apply_async(batch, callback=cb)
589 self._jobs.append(job)
590
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/_parallel_backends.pyc in apply_async(self, func, callback)
109 def apply_async(self, func, callback=None):
110 """Schedule a func to be run"""
--> 111 result = ImmediateResult(func)
112 if callback:
113 callback(result)
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/_parallel_backends.pyc in __init__(self, batch)
330 # Don't delay the application, to avoid keeping the input
331 # arguments in memory
--> 332 self.results = batch()
333
334 def get(self):
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self)
129
130 def __call__(self):
--> 131 return [func(*args, **kwargs) for func, args, kwargs in self.items]
132
133 def __len__(self):
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/model_selection/_validation.pyc in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, return_n_test_samples, return_times, error_score)
456 estimator.fit(X_train, **fit_params)
457 else:
--> 458 estimator.fit(X_train, y_train, **fit_params)
459
460 except Exception as e:
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/pipeline.pyc in fit(self, X, y, **fit_params)
246 This estimator
247 """
--> 248 Xt, fit_params = self._fit(X, y, **fit_params)
249 if self._final_estimator is not None:
250 self._final_estimator.fit(Xt, y, **fit_params)
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/pipeline.pyc in _fit(self, X, y, **fit_params)
211 Xt, fitted_transformer = fit_transform_one_cached(
212 cloned_transformer, None, Xt, y,
--> 213 **fit_params_steps[name])
214 # Replace the transformer of the step with the fitted
215 # transformer. This is necessary when loading the transformer
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/memory.pyc in __call__(self, *args, **kwargs)
360
361 def __call__(self, *args, **kwargs):
--> 362 return self.func(*args, **kwargs)
363
364 def call_and_shelve(self, *args, **kwargs):
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/pipeline.pyc in _fit_transform_one(transformer, weight, X, y, **fit_params)
579 **fit_params):
580 if hasattr(transformer, 'fit_transform'):
--> 581 res = transformer.fit_transform(X, y, **fit_params)
582 else:
583 res = transformer.fit(X, y, **fit_params).transform(X)
/home/salma/anaconda2/lib/python2.7/site-packages/sklearn/base.pyc in fit_transform(self, X, y, **fit_params)
518 else:
519 # fit method of arity 2 (supervised transformation)
--> 520 return self.fit(X, y, **fit_params).transform(X)
521
522
<ipython-input-18-6d322cc43f0d> in transform(self, X_df)
10 # get only the anatomical information
11 X = X_df[[col for col in X_df.columns if col.startswith('anatomy')]]
---> 12 return X.drop(columns='anatomy_select')
TypeError: drop() got an unexpected keyword argument 'columns'
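The `columns=` keyword of DataFrame.drop was only added in pandas 0.21, so this TypeError comes from running the starter kit with an older pandas. A version-agnostic sketch of the transform (the column names here mirror the starter kit; the toy DataFrame is made up for illustration):

```python
import pandas as pd

def transform(X_df):
    # Keep only the anatomical columns, as in the starter kit.
    X = X_df[[col for col in X_df.columns if col.startswith('anatomy')]]
    # 'axis=1' works on pandas both before and after 0.21, unlike the
    # 'columns=' keyword that triggers the TypeError above.
    return X.drop('anatomy_select', axis=1)

# Toy frame with two anatomy columns and one non-anatomy column.
df = pd.DataFrame({'anatomy_select': [1],
                   'anatomy_lh_area': [2.0],
                   'participants_site': [3]})
print(transform(df).columns.tolist())
```

Alternatively, upgrading pandas to >= 0.21 makes the original `drop(columns='anatomy_select')` call work as written.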
Does the private dataset have new sites, or do all of them have at least one sample in the available data?
Hi,
I used conda env create -f environment.yml to install the packages, then source activate autism. ramp_test_submission works great, but ramp_test_notebook gives me the following error:
> ----------------------------
> Testing if the notebook can be converted to html
> Testing if the notebook can be executed
> Traceback (most recent call last):
> File "/home/luis/anaconda3/envs/autism/lib/python3.6/runpy.py", line 193, in _run_module_as_main
> "__main__", mod_spec)
> File "/home/luis/anaconda3/envs/autism/lib/python3.6/runpy.py", line 85, in _run_code
> exec(code, run_globals)
> File "/home/luis/anaconda3/envs/autism/lib/python3.6/site-packages/ipykernel_launcher.py", line 16, in <module>
> app.launch_new_instance()
> File "/home/luis/anaconda3/envs/autism/lib/python3.6/site-packages/traitlets/config/application.py", line 657, in launch_instance
> app.initialize(argv)
> File "<decorator-gen-123>", line 2, in initialize
> File "/home/luis/anaconda3/envs/autism/lib/python3.6/site-packages/traitlets/config/application.py", line 87, in catch_config_error
> return method(app, *args, **kwargs)
> File "/home/luis/anaconda3/envs/autism/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 452, in initialize
> zmq_ioloop.install()
> File "/home/luis/.local/lib/python3.6/site-packages/zmq/eventloop/ioloop.py", line 210, in install
> assert (not ioloop.IOLoop.initialized()) or \
> AttributeError: type object 'IOLoop' has no attribute 'initialized'
Will the above statistical packages be available on the test server, or do the statistical testing tools have to be installed locally for (structural) anatomical feature analysis/extraction?