stat545-hw-hanrach's Introduction

Rachel Han

STAT 545A Homework repository

This repository contains the homework assignments for STAT545A -- a data exploration course in R. The course webpage can be found here.

Navigating through the repository

Click here to go to website of this repo

File links to each homework:

About Me

I'm a graduate student at UBC Mathematics, Vancouver.

  • I work on modelling lithium-ion batteries 🔋

  • I like nature 🌲 🌞 🌺

  • I look forward to data wrangling.

stat545-hw-hanrach's Issues

TA feedback for hw04

Hi Rachel,

Excellent work!

Just a small comment on readability:

  • In a piping chain, write subsequent functions on separate lines:
# This is better
gapminder_wide %>% 
  pivot_longer(cols = -year,
               names_to = "country",
               values_to = "lifeExp") %>% 
  datatable()

# Instead of
gapminder_wide %>% pivot_longer(cols = -year,
                                names_to ="country",
                                values_to = "lifeExp") %>% datatable()

Peer Review for hw05

Peer-Review HW-05 for hanrach

Topic Excellent Satisfactory Needs Work
Coding style ✔️
Coding strategy ✔️
Presentation: graphs ✔️
Presentation: tables ✔️
Achievement, creativity ✔️
Ease of access ✔️ *

Remarks:

  • Coding style and strategy could be made more efficient by removing unnecessary elements. For example in exercise 2.1, you don't need to wrap the entire code in brackets.
  • I liked the theme of the document and the floating table of contents was a neat addition.
  • I learned you can plot graphs side-by-side using grid.arrange().
  • Please double check your graphs. I think the X- and Y-axis titles are reversed in the population and country graph. Also, when you placed the graphs side by side in exercise 4, the title of the graph on the right was cut off.
  • I think it would have been nice to see what packages were loaded at the start of the assignment so that others can use the same packages you used. For example, which package did grid.arrange() come from?
  • For exercise 5, I don't think the inclusion of here::here was necessary. In my testing, ggsave directly saves the .png into the hw05 folder.
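
For reference, grid.arrange() is provided by the gridExtra package. Below is a minimal sketch of loading packages up front and placing two plots side by side. It uses the built-in mtcars data rather than the assignment's data, and the plot names p1 and p2 are illustrative only.

```r
# Load packages at the top so readers can see what is required
library(ggplot2)
library(gridExtra)  # provides grid.arrange()

# Two illustrative plots built from the built-in mtcars data
p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  ggtitle("Weight vs. mpg")
p2 <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  ggtitle("Horsepower vs. mpg")

# Arrange the two plots side by side in a single row
grid.arrange(p1, p2, ncol = 2)
```

Shorter titles, or a wider figure, may help keep the right-hand title from being cut off.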

Peer Review 02

Peer-Review HW-02 for Rachel Han

Topic Excellent Satisfactory Needs Work
Coding style ✔️
Coding strategy ✔️
Presentation: graphs ✔️
Presentation: tables ✔️
Achievement, creativity ✔️
Ease of access ✔️

Remarks:

Elaborating on Above

  • Coding style: Long code on a single line overflows the grey code box when rendered in .pdf and .html file formats. For instructions on how to correct this, scroll down to the Something You Struggled With section I wrote below.
  • Presentation: graphs: It is good practice to include titles in your plots (e.g. using ggtitle()). You also renamed some of your axes with real labels, and kept others as variable names. Consistency is key.
  • Presentation: tables: You piped some of your tables into kable() and some you didn't. Consistency is key. Be careful though, datatable() only works when rendering to .html.

Specific Praise

  • I really liked the colouring scheme of your plots-- it really made them pop, and made me actually want to look at them closer to see what data analysis was being conducted.

Something I Learned

  • Only .md files will allow scrolling in the grey code box
  • .pdf and .html files will just have the code overflow out of the grey code box
  • I learned about the existence of %in% and how to use it!
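
For anyone else new to %in%, here is a small base R sketch. The vectors are made up for illustration and are not taken from the assignment.

```r
# %in% tests, element by element, whether each value on the left
# appears anywhere in the vector on the right
countries <- c("Canada", "Japan", "Brazil", "Canada")
wanted <- c("Canada", "Brazil")

# Logical vector with one entry per element of countries
is_wanted <- countries %in% wanted

# Keep only the matching elements
countries[is_wanted]
```

Unlike ==, %in% does not recycle the right-hand vector, which is why it is the usual choice inside filter() when matching against several values.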

Specific Constructive Criticism

  • You can assign a variable and print to screen in one line by adding parentheses around the line. This helps increase readability of your code.
# Turn this:
three_countries <- filter(gapminder, country %in% c("Hong Kong, China","Canada","Korea, Rep."))

three_countries 
# Into this:
(three_countries <- filter(gapminder, 
  country %in% c("Hong Kong, China","Canada","Korea, Rep."))
)

Something You Struggled With

One of the neat aspects of R is that it knows when you're not done with a command. One of the beauties of the pipe %>% is that you can write whatever follows on separate lines. You can also do this with commas and + signs.

  • For example, in Countries with a drop in life expectancy, you do this:
# Turn this:
gapminder_lifeExpChange <- gapminder %>%  group_by(country) %>% mutate(lifeExpChange = lifeExp - lag(lifeExp)) %>% drop_na()
# Into this:
gapminder_lifeExpChange <- gapminder %>%  
  group_by(country) %>% 
  mutate(lifeExpChange = lifeExp - lag(lifeExp)) %>% 
  drop_na()

Peer-Review HW-02 for hanrach


Topic Excellent Satisfactory Needs Work
Coding style ✔️
Coding strategy ✔️
Presentation: graphs ✔️
Presentation: tables ✔️
Achievement, creativity ✔️
Ease of access ✔️

Remarks:

  • Nice work overall! You showed a good command of dplyr and your outputs were for the most part easy to understand. Some graphs gave me ideas for how I can improve on my own work in the future as well.
  • The use of a different dataset added some flavor to the assignment and it was interesting seeing that a different set of colors was used. I wonder how ggplot2 decides what palette to use?
  • I didn't know that it's possible to plot different parameters on the same graph to determine a relationship between them, e.g. the life expectancy and population and GDP per capita graph in Exercise 3. I've learned that now thanks to you! :)
  • For ease of reading, I think it would be a good idea to break your R code into separate lines with 1-2 functions per line. At least, on my screen it was rendered as a single line. It would have been nice to present your output tables as tibbles too instead of having it just print to screen.
  • I guess it's been covered in class now, but when you used geom_col in Exercise 2 it threw a warning saying "stat" is an unknown argument. The stat argument only applies to geom_bar, where it overrides that function's default of plotting counts (so adding stat = "identity" makes geom_bar work pretty much the same as geom_col).
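
To illustrate the last point, here is a minimal sketch comparing the two geoms. The data frame totals and its values are illustrative, not taken from the assignment.

```r
library(ggplot2)

# Illustrative, pre-computed totals (not the assignment's data)
totals <- data.frame(
  continent = c("Africa", "Asia", "Europe"),
  n = c(52, 33, 30)
)

# geom_col() plots the y values as given, so no stat argument is needed
p_col <- ggplot(totals, aes(x = continent, y = n)) +
  geom_col()

# geom_bar() counts rows by default; stat = "identity" makes it use y as is
p_bar <- ggplot(totals, aes(x = continent, y = n)) +
  geom_bar(stat = "identity")
```

Both plots should show the same bar heights, so dropping stat from the geom_col() call is enough to avoid the warning.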

Peer-Review HW-01 for hanrach

Peer-Review HW-01 for hanrach

Topic Excellent Satisfactory Needs Work
Coding style ✔️
Coding strategy ✔️
Presentation: graphs ✔️
Presentation: tables ✔️
Achievement, creativity ✔️
Ease of access ✔️

Remarks:

  • You didn't have any graphs, but your presentation of tables was good!
  • Wow! Your file links to each homework on the README page is a great idea! Also LOVE the cat gif. And your slides looked great.
  • I hadn't encountered the cat() function before. I guess it's like print but you can add a comment on the front? I also didn't know that row names are displayed as numbers when converted to a tibble.
  • I would have liked to see at least one graph.
  • It doesn't seem like you struggled very much. I don't think I have anything to tell you about that you don't already know.
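
On the cat() question above, a small base R sketch of how it differs from print(). The string is made up for illustration.

```r
x <- "hello"

# print() shows the R representation: an index prefix and quotes
print(x)

# cat() concatenates its arguments, separated by spaces, and writes the
# raw text; it adds no quotes and no newline unless one is supplied
cat("Greeting:", x, "\n")
```

So the text "on the front" is simply another argument that cat() pastes together with the rest.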

Peer-Review HW-03 for hanrach


Topic Excellent Satisfactory Needs Work
Coding style ✔️
Coding strategy ✔️
Presentation: graphs ✔️
Presentation: tables ✔️
Achievement, creativity ✔️
Ease of access ✔️

Remarks:

  • I like the overall narrative you have going throughout your assignment! It's almost like telling a story. Your analysis was comprehensive and fun to read.
  • Some of your comparison parameters were quite interesting. I can see some possible trends in your median population vs. life expectancy plot, for example. And I never would have thought of plotting the number of countries with low life expectancy per continent. It's so inspirational seeing how in-depth one can go with data analysis, and all the different ways people can use the same dataset to unearth all kinds of relationships.
  • There were quite a few things I learned about ggplot features:
    • Using scale_fill_distiller() and palettes, like the "Greens" you invoked on your histograms;
    • Creating smooth lines using 'loess' method within geom_smooth();
    • Plotting lines using geom_vline().
  • Minor note that since number of countries should always be discrete, you might not want to connect your points on that graph. Line plots imply continuous values.
  • For your histograms: since the count is already displayed in the y-axis, adding a color scale is a bit redundant (though it does make for a lovely graphic!). The higher, paler values can be a bit hard to discern.
  • The last graph can be a bit confusing without reading the accompanying text. I know a lot of us are guilty of skipping straight to the pretty colors and pictures before reading things thoroughly. Perhaps it would be helpful to add a caption on the graph?
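
As a rough sketch of the histogram and caption points above: it uses the built-in mtcars data instead of the assignment's data, and the caption text is a placeholder.

```r
library(ggplot2)

# Histogram whose fill is mapped to the bar count, as in the reviewed plots;
# the fill repeats what the bar height already shows
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(aes(fill = after_stat(count)), bins = 10) +
  scale_fill_distiller(palette = "Greens") +
  # A caption lets the plot explain itself without the surrounding text
  labs(caption = "Placeholder caption: describe what the plot shows here.")
```

Dropping the fill mapping and the scale_fill_distiller() call gives the plainer version with a single bar colour.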

HW04: Peer Review

Peer Review HW04 for Rachel

Topic Excellent Satisfactory Needs Work
Coding style ✔️    
Coding strategy ✔️  
Presentation: graphs ✔️   
Presentation: tables   ✔️  
Achievement, creativity ✔️    
Ease of access ✔️    

Comments:

  • The tables would look neater if you rounded the values, as they look quite messy with so many significant digits, e.g. mutate(''= round(``)) gets rid of all the decimal places
  • For the table joins it may also have been good to explain why you chose that specific type of join (although not entirely necessary)
  • I liked the legend that you have on the left side of your html page, I didn't even think or know of adding something like that!
  • The html page was visually simple and its explanations were straightforward
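
A small base R sketch of the rounding suggestion, with made-up numbers. It shows that round() drops all decimals by default and that the digits argument keeps some of them.

```r
values <- c(1234.5678, 0.98765, 42.4242)

# digits defaults to 0, so this returns whole numbers
whole <- round(values)

# Keep two decimal places instead
two_dp <- round(values, digits = 2)

# Or keep three significant digits, whatever the magnitude
three_sig <- signif(values, digits = 3)
```

Inside a pipeline the same call can be wrapped in mutate(), applied to whichever column needs tidying.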

Peer Review HW 01

Peer-Review HW-01 for Rachel Han

Topic Excellent Satisfactory Needs Work
Coding style ✔️
Coding strategy ✔️
Presentation: graphs ✔️
Presentation: tables ✔️
Achievement, creativity ✔️
Ease of access ✔️

Remarks:

  • nice data exploration
  • slides compile nicely
    I see no issues!

TA feedback hw03

Hi @hanrach :),
here are a few comments about your assignment:

  • Please add a README file in your homework repo
  • There is no link to GitHub Pages
  • It is unclear which task you are tackling
  • The count fill is redundant in first and second plot
  • Give variables meaningful names
  • kable didn't render well in your .md document
  • Unnecessary group = in your plot code. This code is not very readable because you didn't add line breaks
  • You can hide messages and warnings from each code chunk
  • Your second task didn't include a tibble
  • The difference between the vanilla and weighted mean is not clear in the last plot
  • Please follow the assignment instructions closely! I appreciate you doing extra tasks, but you didn't fulfil the requirements for some of them
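
On the vanilla versus weighted mean comment, a minimal base R sketch of the difference. The numbers are hypothetical, and weighting by population is an assumption made for illustration only.

```r
# Hypothetical life expectancies for three countries
life_exp <- c(50, 60, 80)

# Hypothetical populations, used as weights (an assumption for this sketch)
pop <- c(1e6, 1e6, 8e6)

# Vanilla mean: every country counts equally
plain_mean <- mean(life_exp)

# Weighted mean: countries with larger populations count more
weighted <- weighted.mean(life_exp, w = pop)
```

Here the weighted mean is pulled towards the most populous country, which is the kind of contrast a plot of the two means would need to make visible.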

Peer Review 03

Peer-Review HW-03 for Rachel Han

Topic Excellent Satisfactory Needs Work
Coding style ✔️
Coding strategy ✔️
Presentation: graphs ✔️
Presentation: tables ✔️
Achievement, creativity ✔️
Ease of access ✔️

Remarks:

Elaborating Above

  • Maintain consistency among tables and graphs (some have titles, others don't; some tables have been processed using kable() or datatable(), others have not)

Some Specific Praise

  • I totally forgot about themes-- the theme combined with your table of contents looks really nice!
  • Kudos for choosing the more interesting options to plot, and investigating all the correlations

Something I Learned

  • A vertical line can be added to any plot using geom_vline()
  • Forgot that there was a way to do a line plot with points by using geom_line() then geom_point()
  • How to use geom_smooth()
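
For reference, a minimal sketch of the three geoms mentioned above, using the built-in pressure and mtcars datasets rather than the assignment's data. The position of the vertical line is arbitrary.

```r
library(ggplot2)

# Line plot with points: add geom_line() first, then geom_point()
p_line <- ggplot(pressure, aes(x = temperature, y = pressure)) +
  geom_line() +
  geom_point() +
  # Vertical reference line at an arbitrary x value
  geom_vline(xintercept = 200, linetype = "dashed")

# Scatter plot with a smoothed loess trend line
p_smooth <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x)
```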

Specific Constructive Criticism

  • Bring your plots to the next level by using ggtitle() (not all your plots have a title)
  • I know that the presentation of tables wasn't a huge part of this assignment, but it would really help with your pleasing aesthetic if you used datatable() or kable() (some of them were just printed to console)

Something You Should Know

  • Using theme(text = element_text(size=18)) you can change the overall font size of your labels (some of the years look really small and hard to read)
  • By using other aspects (i.e. instead of text =) you can manipulate the size of your axis titles and labels, etc.
  • For example (arbitrarily chosen numbers, of course):
theme(text = element_text(size = 18),
      axis.text = element_text(size = 16),
      axis.title = element_text(size = 16))

Hw05 - peer review

Peer-Review HW-05 for hanrach

Topic Excellent Satisfactory Needs Work
Coding style ✔️
Coding strategy ✔️
Presentation: graphs ✔️
Presentation: tables ✔️
Achievement, creativity ✔️
Ease of access ✔️

Remarks:

  • Every step is explained in detail. In exercise 5, you noticed the fonts are small. This shows that you paid close attention to the output of your work rather than rushing to get it done.
  • I learned how to make a table of contents that can stay on the side of the page.
  • In your Population by country plot, I think the titles of axes were in the wrong order. Population should be on the x-axis, and the y-axis should be countries.

Peer review hw04 for Rachel Han

Peer-Review HW-04 for Rachel Han

Topic Excellent Satisfactory Needs Work
Coding accuracy ✔️
Visual style ✔️
Tidy submission ✔️
HTML submission ✔️

Remarks:

  • Wow – nice work on fancy table of contents.
  • Oooh, dang… Anti-join. I forgot about that one.
  • Really nice coding!!
  • I may have copied out your answers for future reference :)
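
As a reminder of what an anti-join does, here is a small sketch with dplyr. The two tables are made up for illustration and are not from the assignment.

```r
library(dplyr)

# Two small illustrative tables (made-up values)
countries <- tibble(country = c("Canada", "Japan", "Brazil"))
capitals <- tibble(
  country = c("Canada", "Japan"),
  capital = c("Ottawa", "Tokyo")
)

# anti_join() keeps the rows of the first table that have no match in the second
missing_capitals <- anti_join(countries, capitals, by = "country")
```

This makes anti_join() handy for spotting which keys are missing before doing a regular join.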
