Software development and data analysis
Instructor: Ariel Rokem, The University of Washington eScience Institute
Contact: [email protected]
The goals of this short course are to familiarize participants with tools and practices used in data science to analyze tabular datasets (such as administrative datasets). The course will focus on the technical aspects of this work, but we will also discuss communicating about data (data visualization) and statistical methodology (machine learning).
Collaborative notes
Schedule
Day 1
9 AM - noon : Software revision control with git and collaboration with GitHub
noon - 1 PM : lunch
1 PM - 4:30 PM : Programming in Python
Day 2
9 AM - noon : Programming with Python (continued)
noon - 1 PM : lunch
1 PM - 3:30 PM : Machine learning Materials
3:30 PM - 4:30 PM : Putting it all together: the data science workflow
Preparation
Learning in this course has a strong hands-on component, so we would like every learner to be able to try things out on their own computer. Please bring a laptop with you. We will use a browser to log into a JupyterHub website that we will set up for the course, so no additional software is required. If you do not have access to a laptop, please let Ariel know as soon as possible, so that we can arrange a loaner laptop for you.
As preparation for the course, please create a GitHub account. You will need this account to log into the system. Please send your username to Ariel as soon as possible.