sisbid / data-wrangling
Teaching material for Summer Institute in Statistics for Big Data Module 1.
Home Page: http://sisbid.github.io/Data-Wrangling/
License: Other
I feel like we might want to mention this somewhere...
maybe in the summarize lecture? an example kinda like with the last lab question?
or do we want a module?
some students needed this
Lab question 4 appears to be missing part of the instructions
could be nice to do this - maybe recaps
Lab 1
Part 2 could then focus on using paths, special imports, and output?
Maybe for part 1 we just use the URLs if we did that
South_West <- ufo_clean %>% filter(state %in% c("tx", "nm", "ut")) %>%
mutate(recode(state, "Texas" = "tx",
"New_Mexico" = "nm",
"Utah" = "ut"))
South_West
Need to reverse the new and old names: recode() takes old = new, and the result should be assigned back to state
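A possible corrected version is sketched below. The small `ufo_clean` tibble is a made-up stand-in for the lab data (an assumption, only the `state` column is needed), and the full names are the ones used in the snippet above.

```r
library(dplyr)

# Hypothetical stand-in for the lab's ufo_clean data
ufo_clean <- tibble(state = c("tx", "nm", "ut", "ca"))

South_West <- ufo_clean %>%
  filter(state %in% c("tx", "nm", "ut")) %>%
  # recode() expects old = new, and naming the column overwrites it in place
  mutate(state = recode(state,
                        "tx" = "Texas",
                        "nm" = "New_Mexico",
                        "ut" = "Utah"))
South_West
```

Without `state =` inside `mutate()`, the recoded values would land in a new, awkwardly named column rather than replacing `state`.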
people struggled with authentication
Question 2 is missing the word "of": it should read "number of boardings"
Easier than ifelse for NAs
make more general, less Bioc/SRA
CONFUSING
Appears to be a bug here. @carriewright11 you mentioned a problem others might be having - could we link to it here?
people often ask about this... maybe we should add an example
seems to mostly be about naniar.. but now that will come later... so probably want to practice the data io stuff more?
Split up some of Jeff's google slides into smaller pieces
remove parameterized reports
reshaping lab says to use pivot_wider instead of pivot_longer for Q2. For cleaning lab add hint and explanation about str_detect vs str_subset
What if we made a single cheatsheet for the class? (Or we could make multiple cheatsheets.)
This is just an icing on the cake idea :P
data_As <- tibble(State = c("Alabama", "Alaska", "Alaska"),
state_bird = c("wild turkey", "willow ptarmigan", "puffin"))
data_cold <- tibble(State = c("Maine", "Alaska", "Alaska"),
vacc_rate = c("32.4%", "41.7%", "46.2%"),
month = c("April", "April", "May"))
maybe we would want to add this?
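One way these two tables might be used in the joining material is sketched here. The choice of `full_join()` is an assumption about what the example is meant to show. Since "Alaska" appears twice in each table, every Alaska row in one table pairs with every Alaska row in the other.

```r
library(dplyr)

data_As <- tibble(State = c("Alabama", "Alaska", "Alaska"),
                  state_bird = c("wild turkey", "willow ptarmigan", "puffin"))
data_cold <- tibble(State = c("Maine", "Alaska", "Alaska"),
                    vacc_rate = c("32.4%", "41.7%", "46.2%"),
                    month = c("April", "April", "May"))

# Keep every State from both tables; unmatched rows are filled with NA
joined <- full_join(data_As, data_cold, by = "State")
joined

# Count how many rows each State ends up with after the join
count(joined, State)
```

Recent dplyr versions may also warn about a many-to-many relationship here, which could be a useful talking point in class.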
add alternative:
long %>% mutate(
var = str_replace(var, "Board", "_Board"),
var = str_replace(var, "Alight", "_Alight"),
var = str_replace(var, "Average", "_Average")
)
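The three replacements could possibly be collapsed into one call with a capture group. This is only a sketch: the `long` tibble below is a hypothetical stand-in with made-up values in `var`.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for the lab's long data
long <- tibble(var = c("orangeBoardings", "purpleAlightings", "greenAverage"))

long_fixed <- long %>%
  # "\\1" inserts whichever of the three words was matched
  mutate(var = str_replace(var, "(Board|Alight|Average)", "_\\1"))
long_fixed
```

The three-line version is probably easier for beginners to read; the regex version is shorter once capture groups have been introduced.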
Need more details on this
The question is numbered 6 in the key, but the lab says 5
An equal sign is used for assignment on one of the first slides
In the last question, "to" should be "two" in the comment
we describe getting packages on anvil... do we want to?
In the subsetting lab part 1:
mpg %>% filter(displ > 4 & drv == 4)
mpg %>% filter(displ > 4 & drv == "4")
Both of these yield the same results. It might be worth explaining why: drv is a character column, and R converts the number 4 to "4" before comparing, so the value can be quoted or not
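A short base R sketch of the coercion behind this, using a made-up `drv` vector rather than the real mpg data:

```r
# Hypothetical character vector standing in for mpg$drv
drv <- c("4", "f", "r", "4")

# The number 4 is converted to the string "4" before the comparison
unquoted <- drv == 4
quoted <- drv == "4"
same_result <- identical(unquoted, quoted)

# The comparison is on strings, so "4.0" does not match the number 4
string_mismatch <- "4.0" == 4

# It also works the other way round: a numeric column compared to "4"
numeric_col <- c(4, 5) == "4"
```

Quoting the value to match the column's class is still the safer habit, since the string comparison can surprise people.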
http://sisbid.github.io/Module1/
Material covered in Day 2 is incorrectly labeled as Day 1
https://gitprint.com/SISBID/Module1/blob/gh-pages/lecture_notes/Bioconductor_intro.md
(starting on page 26)
The printed file outputs the entire dataset into the document. Perhaps there is a way to set "eval" to FALSE for the markdown, or to shorten the output?
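Two options that might help are sketched below, with the built-in `iris` data as a stand-in for the lecture's dataset (an assumption). The chunk option is shown only as a comment because it belongs in the R Markdown source rather than in plain R.

```r
# Option 1 (R Markdown): adding eval=FALSE to a chunk header shows the code
# without running it. knitr::opts_chunk$set(eval = FALSE) in a setup chunk
# would make that the default for the chunks that follow.

# Option 2: print only the first few rows instead of the whole dataset
preview <- head(iris, n = 6)
preview
```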
In getting_started.md, the link for RStudio online learning extra materials (http://www.rstudio.com/resources/training/online-learning/) is broken. I think you're looking for either https://education.rstudio.com/ or https://education.rstudio.com/learn/
This doesn't appear to list the helpers. I think an alternative is needed here.
accidentally have something about all the packages apparently:
The following file is not found in the "This file" link (https://jhudatascience.org/intro_to_r/resources/all_the_packages.txt) of the R and RStudio Installation html page.
Maybe this should happen after the functional programming module, where lists are introduced?
We could also update this to show how to use purrr instead of the for loop
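A rough sketch of what that comparison might look like. The `values` list and the choice of `mean()` are made up for illustration and are not the lecture's actual loop.

```r
library(purrr)

# Hypothetical list to iterate over
values <- list(a = 1:3, b = 4:6)

# for loop version: pre-allocate the result, then fill it in
means_loop <- numeric(length(values))
for (i in seq_along(values)) {
  means_loop[i] <- mean(values[[i]])
}

# purrr version: map_dbl() applies mean() to each element
# and returns a (named) numeric vector
means_map <- map_dbl(values, mean)
means_map
```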
I put the bike data in the missing data lecture... don't know if we introduce this package earlier or not
maybe add this to joining, merging material?
In "at least 2 and no larger than g", the "g" should be 6
Step 1 should say to use the iris_lab data