berkeley-stat159 / project-alpha
License: BSD 3-Clause "New" or "Revised" License
on voxels :) (++ comments)
Could someone (like Jane) update that data/Makefile test file :)
Please add buttons to README.
@janewliang @hiroto-udagawa @reychil @kentschen
smooth_final.py
convolution_final.py
glm_final_fourier.py
bh_t_beta_final_fourier.py
image_overlay_final.py
selection_final_fourier.py
parameter_selection_final_fourier.py
Great work. Keep it up. Let us know how we can help.
I know you are aware of this, but you need to make it clear to us, and
document, that everyone is contributing. I know you are all participating,
but you should consider having whoever is generating most of the code switch to mostly
reviewing code (or some other approach).
The different condition files to run multiple linear regression, explore results
Code up time correction and include good comments
(with some help from Ben maybe)
Sorry --- I skipped you guys originally since you got a lot of time with Jarrod and Matthew on Tuesday. Here's my feedback from the paper:
I really wish somebody would update these slides
Create function for PCA, test file, and script by Monday
Extend simple regression with autocorrelation and examine other time series analysis of the hemodynamic response (goal: completion by Monday)
Create function to smooth 3D images, creating mega-voxels (goal: completion by Monday)
When I clone your project and run make test, I get 5 out of 9 test failures.
I've attached the error message I get. Take a look and see if you can figure out what the issue is.
NOTE: You guys are very active on your project --- this is intended to be an exercise for lab on Monday, so please don't resolve the issue before then :)
Begin cross-subject comparison, looking into R^2, RSS, \betas and other statistics to compare observations within a single person to between-subject comparisons (goal: initial effort completed by Monday)
Dear all,
I hope this finds you well. I have spent 12+ hours since our meeting on Wednesday cleaning up and investigating why overlapping our Benjamini-Hochberg multiple comparison of p-values with the other thresholds (upper quantiles of abs(t-statistic) and abs(beta)) was not giving accurate results for locating activation regions.
My first conclusions were twofold:
(1) the brains fail to line up well enough, so a lot of the area interpretation is lost to this problem alone, and
(2) certain subjects' linear regression models overfit, and the inclusion of 6 principal components created collinearity with our predicted hemodynamic response function. This generally occurred when the variance explained by these principal components was much larger than our lower threshold of .4 (40% variance explained). ** Recall this is all in terms of principal components of the A^T A matrix, where A is the masked voxel time courses.
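For reference, a minimal sketch of how the variance-explained figure for the first 6 principal components could be computed. The helper name `variance_explained` and the uncentered input are assumptions for illustration, not the project's actual code; it relies only on the fact that the eigenvalues of A^T A are the squared singular values of A.

```python
import numpy as np

def variance_explained(A, k=6):
    """Fraction of total variance captured by the first k principal
    components of A^T A (hypothetical helper, not from the project)."""
    # The eigenvalues of A^T A equal the squared singular values of A,
    # and np.linalg.svd returns singular values in descending order.
    s = np.linalg.svd(np.asarray(A, dtype=float), compute_uv=False)
    eigvals = s ** 2
    return eigvals[:k].sum() / eigvals.sum()

# Toy example: a diagonal matrix has singular values 3, 2, 1.
A_demo = np.diag([3.0, 2.0, 1.0])
frac_first = variance_explained(A_demo, k=1)   # 9 / 14
```

A subject whose value for k=6 sits far above the .4 threshold would be a candidate for the collinearity problem described in (2).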
In an attempt to correct this problem (overfitting/collinearity), and with a firm understanding that 6 Fourier features would not introduce such collinearity or explain as much variance, I ran models with 6 Fourier features instead. Let’s call this model “_fourier”.
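A sketch of what 6 Fourier features could look like as design-matrix columns, assuming they are sine/cosine pairs at the 3 lowest frequencies over the scan (the function name `fourier_features` and the choice of frequencies are assumptions, not necessarily what the “_fourier” scripts do).

```python
import numpy as np

def fourier_features(n_timepoints, n_pairs=3):
    """Return an (n_timepoints, 2 * n_pairs) array of sine/cosine
    regressors (hypothetical helper; frequencies are an assumption)."""
    # Time rescaled to [0, 1) so frequency k completes k full cycles.
    t = np.arange(n_timepoints) / n_timepoints
    cols = []
    for k in range(1, n_pairs + 1):
        cols.append(np.sin(2 * np.pi * k * t))
        cols.append(np.cos(2 * np.pi * k * t))
    return np.column_stack(cols)

X_fourier = fourier_features(100)   # 6 columns: sin/cos at k = 1, 2, 3
```

Columns built this way are mutually orthogonal and sum to zero, so they cannot be collinear with each other or with an intercept column; whether they are nearly collinear with the predicted response still depends on the task timing.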
For the subjects with ~40% variance explained by the first 6 principal components, the beta values and t-statistics from this new model (“_fourier”) compared well with those from the old model (with the 6 principal components, “_pca”). Impressed by this outcome (we saw earlier, in the model selection analysis, that _fourier was less able to predict the BOLD response than the _pca model), I decided to implement the analysis for all subjects.
While trying to find a single set of parameters to use across all subjects for all of these analyses, I discovered that the Q value for Benjamini-Hochberg varied much more from subject to subject than the parameters of the other threshold analyses. As such, even though Benjamini-Hochberg is more theoretically sound, it is less reliable in our case.
Recall that Benjamini-Hochberg requires a Q (generally thought of as an \alpha value) and a # of neighbors for neighbor smoothing, while our threshold statistics require a quantile value (the proportion of values kept) and a # of neighbors for neighbor smoothing.
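To make the two sets of parameters concrete, here is a minimal sketch of both selection rules without the neighbor smoothing step. The function names and the `keep` parameter are assumptions for illustration and are not taken from bh_t_beta_final_fourier.py.

```python
import numpy as np

def benjamini_hochberg(p_values, Q=0.05):
    """Boolean mask of p-values declared significant by the
    Benjamini-Hochberg step-up procedure at false discovery rate Q."""
    p = np.asarray(p_values, dtype=float).ravel()
    m = p.size
    order = np.argsort(p)
    # Compare the i-th smallest p-value with (i / m) * Q.
    below = p[order] <= Q * np.arange(1, m + 1) / m
    significant = np.zeros(m, dtype=bool)
    if below.any():
        # Reject every hypothesis up to the largest i that passes.
        k = np.max(np.nonzero(below)[0])
        significant[order[:k + 1]] = True
    return significant

def quantile_threshold(stat, keep=0.1):
    """Boolean mask keeping the top `keep` proportion of abs(stat)."""
    a = np.abs(np.asarray(stat, dtype=float))
    return a >= np.quantile(a, 1 - keep)

bh_mask = benjamini_hochberg([0.001, 0.5, 0.04, 0.9], Q=0.05)
t_mask = quantile_threshold([-5, 1, 2, 3, 4], keep=0.2)
```

The quantile rule always keeps roughly the same share of voxels, while the number kept by Benjamini-Hochberg depends on each subject's p-value distribution, which is one way the Q value can end up needing per-subject tuning.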
In general, the data fails to provide many regions of activation. The only current method for such analysis requires examining each subject separately (by eye); the charts for finding these patterns can be created by running “image_overlay_final.py”. As such, our conclusions should probably only include identification of the frontal lobe (@KentChen, I still need the full range of possible actual activation locations created for at least “sub001”).
In general the only thing that sticks out to me is the frontal cortex as previously noted. #BasicBro
******* Please review/comment and add my stuff
Need to write the final draft of the paper and do some organizing. We may want to address Ross's Piazza post answer and squish the appendices into a single PDF with the main report. Ideally, we'd all have things written a day or two before the deadline, so there is time to revise and proofread and every section gets more than one pair of eyes checking it.
Also need to write the scripts to generate the images.
Just make sure that shit works
In text file, add name, title, link to paper. In bibliography, please add in the URL/link
Need to check that all the requirements are listed in requirements.txt and that all the Makefile recipes run and make sense. Should also update the READMEs so that every directory has a README that outlines the Makefile recipes, what the directory stores, etc. May want to do some organizing to delete useless files and consolidate where the user-generated images and data files get saved.
Start formatting/exploring approach for the FINAL script
Hey guys,
I tried make analysis and it fails at a certain point because at least one bold.nii.gz doesn't get unpacked by make data.
If any of you are not getting killed by finals I'd appreciate if you could take a look. If you're all super busy that's fine, I'll work it out on my own. Let me know if you'll have a chance to look at it today
ALL the code needs to be reviewed. Check if it runs, makes sense, and has good comments.
PLEASE FINISH WRITING YOUR ASSIGNED SECTIONS BEFORE SUNDAY, 11/29, so Jane can proofread/organize as needed. If your section already exists in some form from the previous draft, the .tex file for the section is just copied over. If not, it will be blank.
In general, try to cite more frequently, write more detailed captions (e.g. you can tell what's going on in the figure based on just reading the caption, as opposed to a one-line title), write "p-value" in quotes, and it's hemodynamic, not hemoglobin.
In addition to the assigned sections outlined below, we also need to go over/update the abstract, intro, data, and discussion sections. Jane is happy to do that when she organizes/proofreads on Sunday, but you are all more than welcome to contribute too.
Methods:
Smoothing (Rachel)
Convolution and time correction (Ben)
GLM (Jane)
Normality checks/assumptions (Kent)
Hypothesis testing (Hiro)
Benjamini-Hochberg (Rachel)
Clustering (Hiro)
Results:
GLM (Jane/Ben)
Hypothesis testing -- normality and Benjamini-Hochberg are lumped in here too (Hiro/Rachel/Kent)
Clustering (Hiro)
Appendix (If you can fit everything you want to say gracefully in the main paper, it's fine not to do these):
Convolution analysis (Ben)
Benjamini-Hochberg (Rachel)
Clustering (Hiro)
Time series (Jane)
NOTE: If you are using citations in your appendix section, please uncomment the 3 other lines in the make file associated with your section. If you're not using citations, it's fine to leave the make file as is and I'll clean it up. But to get your citation references to render correctly, you'll need to uncomment the 3 other lines for your section.
Final approach to Convolution