
labhacks's Introduction

Data analysis

  • Start every new research project with a clear and universal directory structure for organizing your analysis, data and figures. Here is an example template you can follow for a transparent directory structure.

  • Download Atom. It is a very powerful (and free!) editor that integrates nicely with GitHub. Use it for writing text, markup, code, scripts, etc.

  • Use jupyter notebooks for development and for analysis pipelines. Install Kyle Dunovan's jupyter themes to make your notebooks pretty and work faster.

  • Consider version-controlling your data along with your code.

  • Make an "autopilot" script for your analyses, so that figures can be updated in real time. Then write a cron job to execute the analysis script so that newly collected data is automatically integrated, perhaps with an email summarizing the results sent to you or your advisor. Example here.

  • Make a startup file for your jupyter notebooks that preloads modules like numpy and scipy, along with figure specifications, so that your figures are consistent and publication ready. The config file can specify font sizes, legends, color themes, etc.

  • Learning data science? Here is an extremely well curated series of quick references for data science in python (numpy, scipy, pandas), ML algorithms, probability and more

  • Diving into deep learning? Here is a similar reference for machine learning (ML) and deep learning.
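
As a minimal sketch of the directory-structure advice above, the following script creates a project skeleton. The folder names in SUBDIRS and the function name make_project are hypothetical choices for illustration, not the layout of the linked template; adapt them to your lab's conventions.

```python
from pathlib import Path
import tempfile

# Hypothetical folder names; adapt them to your lab's conventions.
SUBDIRS = ("data/raw", "data/processed", "analysis", "figures", "docs")


def make_project(root):
    """Create the project skeleton under `root` and return the created paths."""
    root = Path(root)
    created = []
    for sub in SUBDIRS:
        path = root / sub
        # exist_ok=True makes the function safe to re-run on an existing project.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is written to your project.
    with tempfile.TemporaryDirectory() as tmp:
        for path in make_project(Path(tmp) / "my_project"):
            print(path.relative_to(tmp))
```

Keeping raw and processed data in separate folders is one common way to make sure the original measurements are never overwritten by an analysis step.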

Programming

  • Start using github. It is excellent for version control and for sharing. Consider how many times you have written a script called analysis_v5_final_reallyfinal_thistime_final.py. With github you will just have analysis.py. With github, other researchers can replicate exactly what you did. This will ultimately save you time, if someone emails you for example.

  • If you write software for the use of the greater scientific community, it will be a lot easier for others to port your code and collaborate if you follow a standard set of guidelines when packaging your project for release (e.g. on github). Here is a template to follow written by Ariel Rokem.

  • A lot of open software that is developed for neuroscience runs on either Linux or OSX but not Windows. So consider installing Linux. Ubuntu is a popular distribution that has extensive support if you get stuck.

  • After installing Linux, learn the art of the command line

  • Do you use Matlab? It is worth considering a switch to Python. Python offers simpler syntax, enables system wide interfacing, is open source, free and for these reasons is being used by more and more scientists. Replication is far easier with Python than Matlab.

  • Now you want to learn Python?

    • Here is a Python Bootcamp notebook that provides excellent advice on learning Python, written by Tom Donoghue.
    • Everyone in our lab started with Learn Python the Hard Way. 52 exercises spanning installing Python to building a web app.
    • Read the style guide for Python
    • Install Anaconda which is a scientific distribution of python that enables high performance computing and analysis.
    • Package your python project with this amazing guide by Vicki Boykis
    • Learn numpy (a package for scientific computing) with these 100 exercises written by Nicolas Rougier.
    • Become a python data ninja. Thomas Wiecki provides a great introduction to data science in python.
  • Use hotkeys for google, gmail, atom, jupyter notebooks and bash. Consider a mechanical keyboard so your labmates love you, then hotkey some more.

  • Not sure how to code something? There may already be an answer on Stack Overflow. Even professional programmers use Stack Overflow.

  • Access anything, anywhere on your computer with minimal effort using keyboard launchers like Albert for Linux and Alfred for Mac.

  • Learn how to simulate data to ensure that your analysis works the way you think it does.

  • A basic understanding of data structures is useful for optimizing larger scale projects.

  • Need to sync files across your various lab computers/clusters and the laptop you use at home, and don't want to use Dropbox? Use rsync instead, e.g.:

rsync -zavr -e ssh --delete --include '*/' --include='*include_these_files.[ext]' --exclude='*' [local_dir] [remote_server]:[remote_dir]
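
The advice above on simulating data can be sketched as follows: generate data from a model whose parameters you chose yourself, run your analysis on it, and check that the analysis recovers those parameters. This is only an illustration under simple assumptions (a straight line with Gaussian noise, ordinary least squares); the function names simulate and fit_line and all parameter values are hypothetical.

```python
import random


def simulate(n, slope, intercept, noise_sd, seed=0):
    """Generate (x, y) pairs from y = intercept + slope * x + Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least-squares estimates of (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Ground truth is known because we chose it ourselves.
    true_slope, true_intercept = 2.0, -1.0
    xs, ys = simulate(5000, true_slope, true_intercept, noise_sd=1.0)
    slope, intercept = fit_line(xs, ys)
    print(f"slope: true {true_slope}, recovered {slope:.3f}")
    print(f"intercept: true {true_intercept}, recovered {intercept:.3f}")
```

If the recovered values are far from the ones you put in, the problem is in the analysis code rather than in your real data, which is exactly what you want to find out before collecting it.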

Generating publication quality figures

  • If you followed the programming advice above, you are now convinced that Python is your favorite language. Python has excellent data visualization tools built on matplotlib, including a library called seaborn.

  • Use your plotting software of choice (e.g. seaborn) to get your figure as close to final as possible. Avoid having to make post-edits in illustrator/inkscape which can be a huge time sink as a graduate student.

  • Carefully consider the colors and colormaps of your figures. How would color blind readers interpret your figures?

  • Understand why people hate the jet colormap. Read about different colormaps here.

  • If you use Matlab, try out the gramm toolbox, inspired by R's ggplot2.

  • Save your figures in a vector format such as svg or eps, not png.
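
One way to make the colormap advice above concrete is to check how the brightness of a color sequence changes from one end to the other. The sketch below computes the standard sRGB relative luminance for two short sequences. The RAINBOW list is an illustrative set of fully saturated hues in rainbow order, not the exact jet color table, and the check is a grayscale proxy rather than a simulation of color blindness.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color with channels in [0, 1]."""
    def linearize(c):
        # Undo the sRGB gamma encoding before weighting the channels.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def is_monotonic(values):
    """True if the values never decrease or never increase."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


# Fully saturated hues in rainbow order; illustrative, not the exact jet table.
RAINBOW = [(0, 0, 1), (0, 1, 1), (1, 1, 0), (1, 0, 0)]
# A plain dark-to-light gray ramp for comparison.
GRAYS = [(v, v, v) for v in (0.0, 0.33, 0.66, 1.0)]

if __name__ == "__main__":
    for name, colors in (("rainbow", RAINBOW), ("grays", GRAYS)):
        lum = [relative_luminance(c) for c in colors]
        print(name, [round(v, 3) for v in lum], "monotonic:", is_monotonic(lum))
```

In the rainbow sequence the brightness rises from blue to yellow and then falls again at red, so equal steps in the data do not look like equal steps in the figure, and the ordering is lost when the figure is printed in grayscale.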

Statistical analysis

Writing papers

  • See these Ten simple rules for structuring papers, written by Konrad Kording and Brett Mensh.

  • Publish your paper to one of the arXivs. If your PI doesn't support that, convince them.

  • If you are frustrated with writing, read this

  • Share your work with your friends as well as your enemies. The latter might give you even better criticism.

  • Steven Pinker has some interesting thoughts on how to make academic writing better

  • If you are struggling to write scientific papers in Word, e.g. embedding equations, consider using LaTeX (pronounced "Lay-Tech"). LaTeX allows you to focus on writing rather than formatting.

Doing fMRI

  • Know your neuroanatomy. Julian Caspers, a neuroradiologist, provided a great set of guidelines at the 2017 Organization for Human Brain Mapping conference.

  • Standardize your imaging data set using the BIDS format - this will make your data more accessible to both your collaborators and the field at large.

  • As a benchmark, you should be able to write down the general linear model you are using from scratch and solve it in closed form.

  • Understand the difference between univariate and multivariate approaches to fMRI

  • Next learn representational similarity analysis & the related crossnobis distance measure, a powerful framework that can bridge behavior, imaging, and computational models.

  • You will need a visualization tool. A lot of labs have success with MRIcroGL or the Connectome Workbench. James Gao recently wrote an incredibly powerful tool called PyCortex, which uses WebGL to render flat maps and fiducial surfaces in your browser; you can even project movies onto the surface.

  • Improve your understanding of anatomy with the web based user interface for exploring the human brain called Cortical Explorer.

  • Before you get really deep in your design, check out NeuroSynth (written by Tal Yarkoni) to run a meta analysis on your covariates of interest to see what has been done before.
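
For the general linear model benchmark above, the closed-form solution can be sketched as follows. The symbols are assumptions made for illustration: y is the T×1 time series of one voxel, X is the T×p design matrix, β holds the p regression weights and ε is the noise. The derivation is plain ordinary least squares, assumes that XᵀX is invertible, and leaves out refinements such as modeling temporal autocorrelation.

```latex
\begin{align}
y &= X\beta + \varepsilon \\
\hat{\beta} &= \arg\min_{\beta} \lVert y - X\beta \rVert^2 \\
0 &= -2X^\top (y - X\hat{\beta}) \\
X^\top X \hat{\beta} &= X^\top y \\
\hat{\beta} &= (X^\top X)^{-1} X^\top y
\end{align}
```

The third line sets the gradient of the squared error with respect to β to zero; rearranging gives the normal equations in the fourth line.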

Analyzing ephys data

  • Most ephys labs use in-house analysis routines in (sometimes) relatively closed source and (oftentimes) expensive applications. Pavan Ramkumar @KordingLab has written an excellent open source package for spike data analysis and visualization in Python.

Biophysical and molecular modeling

  • Start here for a variety of software resources on realistic cellular modeling, especially MCell and NEURON.

  • Check out CellBlender for visualization and simulation of realistic 3D cellular models.

  • Keep a digital lab notebook with Benchling, free for academics.

  • Try ApE for creating plasmid maps/visualization of restriction sites and planning experiments.

  • Recreate expensive hardware on the cheap with labrigger

  • Fiji is a free and easy to use image processor.

Literature search

  • You will need a citation manager early on. PaperPile is a good one that is well integrated with Pubmed.

  • Find articles before they are officially published on arXiv.

  • You can search the literature with Pubmed and Google Scholar. Now is a good time to make your own Google Scholar account if you don't have one. Also, stay on top of your favorite authors' publications with Google Scholar alerts.

  • Before you start down some major project that you will be committed to for years, understand the current literature in your topic. Understand very clearly why you are going to do what you are going to do.

  • Be skeptical of authors' use of the word prediction. Often what they really mean is in-sample linear correlation, not what prediction actually means: out-of-sample generalization of a model. Here Tal Yarkoni provides some insights.
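
The gap between in-sample correlation and out-of-sample prediction can be illustrated with a small simulation. This is only a sketch under stated assumptions: every predictor is pure Gaussian noise, the sample sizes and the function names pearson and best_noise_predictor are hypothetical, and the example is not taken from the linked post.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_noise_predictor(n_train=20, n_test=2000, n_predictors=200, seed=0):
    """Pick the noise predictor that correlates best in-sample, then retest it.

    Every predictor is independent Gaussian noise, so the true correlation
    with the outcome is zero for all of them.
    """
    rng = random.Random(seed)

    def noise(n):
        return [rng.gauss(0.0, 1.0) for _ in range(n)]

    y_train = noise(n_train)
    in_sample = [abs(pearson(noise(n_train), y_train)) for _ in range(n_predictors)]
    best_r = max(in_sample)

    # New data for the "winning" predictor: since it is pure noise, new
    # measurements of it and of the outcome are simply independent draws.
    out_r = pearson(noise(n_test), noise(n_test))
    return best_r, out_r


if __name__ == "__main__":
    best_r, out_r = best_noise_predictor()
    print(f"best in-sample |r|: {best_r:.2f}")
    print(f"out-of-sample r:    {out_r:.2f}")
```

With a small sample and many candidate predictors, the best in-sample correlation looks impressive even though nothing real is being predicted, and it vanishes on new data.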

Grant writing

  • Be aware of what grants have been funded in your field by searching NIH RePORTER. This will tell you what was funded, the program officer, the PI, etc.

Twitter is a great resource for identifying new papers, events, tips etc.

Papers

Meetings

  • Never show up empty handed to meetings with your PI.

  • Have a clear objective for every meeting, and make sure everyone else knows it as well.

  • Be able to show some evidence of your productivity.

  • You will have some days or weeks where nothing worked. I found that in those cases it is productive to have a "rainy day" folder containing interesting analyses/figures you have not yet shown.

Guides

  • How to pick a graduate advisor
  • Learn how to learn with this coursera course
  • Have a long look at this Survival Guide for Ph.D. students, by Andrej Karpathy, CS Ph.D., and the current director of AI at Tesla.
  • Ronald Azuma's retrospective on graduate school
  • Randy Pausch on time management
  • Know when and when not to say No
  • Find (neuro)hackathons in your area and go to them. You get to meet all kinds of people and produce something at an incredibly fast rate, thanks to the symbiosis of working on a team. This can be especially refreshing if you are spending years on your projects.

Science blogs

  • Mark Humphries will blow up your world at the Spike

Acknowledgments

Thanks to contributions from Ran Liu, Annie Homan, Rory Flemming, and Daniel Borek for making this page more useful.
