BSDS 100: Intro to Data Science with R

Abbie M. Popa

Email: [email protected]

Class Time: TR, 2:40 - 4:25 PM in Harney 430

Office Hours: TR, 1:20 - 2:20 PM in Harney 107B (James Wilson's Office)

Book: R for Data Science by Hadley Wickham and Garrett Grolemund

Syllabus: Link

Course Learning Outcomes

By the end of this course, you will be able to

  • Proficiently wrangle, manipulate, and explore data using the R programming language
  • Use contemporary R libraries including ggplot2, tibble, tidyr, dplyr, knitr, and stringr
  • Visualize, present, and communicate trends in a variety of data types
  • Communicate results using R markdown and R Shiny
  • Formulate data-driven hypotheses using exploratory data analysis and introductory model building techniques

Course Overview

Assessment

This course focuses on the basic techniques for making informed, data-driven decisions using the R programming language. It is not a statistics course, but it will give you the intuition to form hypotheses about complex questions through visualization, wrangling, manipulation, and exploration of data. Your grade will be based on the following components:

  • Attendance (20%): Attendance will be recorded, and you will lose points for every class you miss.
  • Assignments (40%): Computational assignments will be given regularly throughout the course, to be completed using RStudio and the knitr package.
  • Case Studies (20%): You will be assigned applied case studies throughout the course, to be completed using RStudio.
  • Final Project (20%): The final project will be a computational case study that brings together the techniques learned throughout the semester. The project description will be provided around the midpoint of the semester.

Schedule

I will do my best to keep this schedule accurate and up to date. However, I reserve the right to change it as I deem necessary. Usually this will be due to the amount of material we are able to cover in class.

If you wish to view the notes I use during lecture, you can see them here, though I often change them in response to class questions.

Introduction

| Topic | Reading | Assignment | Due Date | In Class Code |
|---|---|---|---|---|
| Introduction - History of Data Science | Ch. 1 What is Data Science? | HW 1 | Thursday, 8/23 | Installing R, RStudio, and LaTeX |
| R and RStudio | | HW 2 | Tuesday, 8/28 | In Class Code 2018-08-23 |
| R Packages and RMarkdown | | HW 3 | Tuesday, 9/4 | In Class Activity; In Class Activity Solution: Rmd Code; In Class Activity Solution: PDF Output; Class Code - Packages; Class Code - R File to PDF; Class Output - R File to PDF; Class Code - Rmd File to PDF; Class Output - Rmd File to PDF; Class Activity 2 |

Data Structures in R

| Topic | Reading | Assignment | Due Date | In Class Code |
|---|---|---|---|---|
| Vectors, Matrices, and Arrays | | HW 4 | Tuesday, 9/11 | Class Code Aug 30, 2018; Class Code Sept 4, 2018; Coding Challenge; Coding Challenge Answer Key; Real World Examples; Class Code Sept 6, 2018 |
| Lists and Data Frames | Ch. 20 in R for Data Science | | | Class Code Sept 11, 2018; Coding Challenge |
| Tibbles | Ch. 10 in R for Data Science | HW 5 | Tuesday, 9/25 | Tibbles versus Data Frames Activity; Class Code Sept 13, 2018; Lecture Qs Sept 13, 2018; Class Code Sept 18, 2018; Tibbles versus Data Frames Activity Answer Key |
| Strings and Factors | Ch. 14.1 - 14.2 and 15 in R for Data Science | | | Class Code 180920; Class Code 180925 |

Ethics in Data Science

| Topic | Reading | Assignment | Due Date | In Class Code |
|---|---|---|---|---|
| Ethics in Data Science | | | | |

Data Wrangling and Plotting

| Topic | Reading | Assignment | Due Date | In Class Code |
|---|---|---|---|---|
| Input and Output | | HW 6 | Thursday, 10/18 | Factor and String Lab - Rmd; Factor and String Lab - PDF; Tree Data; Question Data; Class Code; singles data; triples data |
| Plotting in R | | | | Plotting Lab as .Rmd; Plotting Lab as .PDF; Class Code 181009; Class Code 181011; Class Code 181018 |
| Wrangling Data with tidyr | Ch. 12 in R for Data Science | | | Class Code 181025 |
| Wrangling Data with dplyr - I | | | | Class Code 181030; Wrangling Lab - Rmd; Wrangling Lab - PDF |
| Wrangling Relational Data with dplyr | | HW 7 | Tuesday, 11/13 | Join Lab - Rmd; Join Lab - PDF; Class Code 181106 |
| String Analysis | Ch. 14.3 - 14.7 in R for Data Science | | | Class Code 181108 |

Programming

| Topic | Reading | Assignment | Due Date | In Class Code |
|---|---|---|---|---|
| Control Flow | Ch. 21 in R for Data Science | HW 8 | Tuesday, 11/27 | Class Code 181113 |
| Writing Functions | Ch. 19 in R for Data Science | | | Function Lab - Rmd; Function Lab - PDF; Class Code 181127; Class Code 181129 |

Other

| Topic | Reading | Assignment | Due Date | In Class Code |
|---|---|---|---|---|
| Extra Review | | | | |

DS in the Wild

Example
Song Lyrics

Case Studies

| Case Study | Data | In-Class Date | Due Date | Notes |
|---|---|---|---|---|
| CS 1 | Ramen Reviews | September 25th, 2018 | October 9th, 2018 | Case Study 1 Notes |
| CS 2 | hour data; day data | October 23, 2018 | November 8, 2018 | |

Final Project

| Description | Due Date | Notes |
|---|---|---|
| Project Sign-Up will be through a Google Doc link on Canvas | November 1st at 9 AM | |
| Final Project Description - UPDATED due to smoke | December 7 at 11:59 PM | Final Tips and Tricks; Report Tips; Presentation Tips |

Important Dates

  • Monday, August 27th - Last day to add the class
  • Friday, September 7th - Census date. Last day to withdraw with tuition reversal
  • Tuesday, October 16th - Fall break! (no class)
  • Friday, November 2nd - Last day to withdraw
  • Thursday, November 22nd - Thanksgiving Holiday (no class)
  • Tuesday, December 4th - Last day of class
  • Friday, December 7th - Final Projects Due
