⌨️ Activity: Switch to a new branch
Before you edit any code, create a local branch called "three-states" and push that branch up to the remote location "origin" (the GitHub host of your repository).
git checkout master
git pull origin master
git checkout -b three-states
git push -u origin three-states
The first two lines aren't strictly necessary when you don't have any new branches, but it's a good habit to head back to master
and sync with "origin" whenever you're transitioning between branches and/or PRs.
Comment on this issue once you've created and pushed the "three-states" branch.
from ds-pipelines-3.
a
⌨️ Activity: Explore the starter pipeline
Without modifying any code, start by inspecting and running the existing data pipeline.
- Open up remake.yml and read through - can you guess what will happen when you build the pipeline?
- Build all targets in the pipeline.
- Check out the contents of oldest_active_sites.
💡 Refresher hints:
- To build a pipeline, run library(scipiper) and then scmake().
- To assign an R-object pipeline target to your local environment, run mytarget <- scmake('mytarget'). This function will check/build the object to make sure it's up-to-date before passing it to you.
- If you don't want scipiper to check for currentness or rebuild first, run mytarget <- remake::fetch('mytarget'). This function is faster to run and can be handy for getting the old value of an object before you rebuild it.
- You'll pretty much always want to call library(scipiper) in your R session while developing pipeline code; otherwise, you need to call scipiper::scmake() in place of scmake() anytime you run that command, and all that extra typing can add up.
When you're satisfied that you understand the current pipeline, include the value of oldest_active_sites$site_no and the image from site_map.png in a comment on this issue.
Add a comment to this issue to proceed.
[1] "04073500" "05211000" "04063522"
⌨️ Activity: Spot the split-apply-combine
Hey, did you notice that there's a split-apply-combine action happening in this repo already?
Check out the find_oldest_sites() function:
find_oldest_sites <- function(states, parameter) {
  purrr::map_df(states, find_oldest_site, parameter)
}
This function:
- splits states into each individual state
- applies find_oldest_site to each state
- combines the results back into a single tibble
and it all happened in just one line! The split-apply-combine operations we'll be exploring in this course require more code and are more useful for slow or fault-prone activities, but they follow the same general pattern.
Check out the documentation for map_df at ?purrr::map_df or online if this function is new to you.
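As a rough sketch of the three steps that map_df folds into a single call, the same pattern can be written out explicitly in base R. Everything here is illustrative: toy_find_oldest_site() is a hypothetical stand-in, not the course's find_oldest_site(), and the parameter value is only an example argument.

```r
# Hypothetical stand-in for find_oldest_site(): returns a one-row
# data frame per state. It does no real lookup; it only echoes its inputs.
toy_find_oldest_site <- function(state, parameter) {
  data.frame(
    state_cd = state,
    parameter = parameter,
    stringsAsFactors = FALSE
  )
}

states <- c("WI", "MN", "MI")

# 1. Split: one list element per state
state_list <- as.list(states)

# 2. Apply: call the function on each piece, passing `parameter` through
per_state <- lapply(state_list, toy_find_oldest_site, parameter = "00060")

# 3. Combine: row-bind the pieces into a single data frame
combined <- do.call(rbind, per_state)

print(combined)
```

Writing the steps separately takes more code than the one-liner, but it leaves an intermediate object (per_state) that can be inspected or rebuilt piece by piece, which is the kind of control that matters for slow or fault-prone steps.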
When you're ready, comment again on this issue.
a
⌨️ Activity: Apply a downloading function to each state
Awesome, time for your first code changes ✏️.
- Write three scipiper targets in remake.yml to apply get_site_data() to each state in states. The targets should be named wi_data, mn_data, and mi_data.
- Modify the sources section of remake.yml as needed to make your pipeline executable.
- Modify the main target so that your new targets will be built by default.
- Test it: You should be able to run scmake() with no arguments to get everything built.
💡 Hint: the get_site_data() function already exists and shouldn't need modification. You can find it by browsing the repo or by hitting Ctrl-. in RStudio and then searching for "get_site_data".
When you're satisfied with your code, open a PR to merge the "three-states" branch into "master". Make sure to add .remake, 3_visualize/out, and any .DS_Store files to your .gitignore file before committing anything. In the description box for your PR, include a screenshot or transcript of your console session where the targets get built.
I'll respond in your new PR. You may need to refresh the PR page to see my response.