Comments (14)

bensoltoff commented on August 11, 2024

Draw examples from Kieran Healy's socviz package.

  • opiates
  • gss_sm
  • gss_lon

from course-site.

bensoltoff commented on August 11, 2024

Flagging @YinsuH on this. She's working as an RA for me this summer through SISRM.

YinsuH commented on August 11, 2024

This is the website I looked up, which has a few uses of scatterplots. I was not sure whether my concern was significant, so I decided to bring it up anyway :)

bensoltoff commented on August 11, 2024

@YinsuH I think we'd be okay with the number of observations. But I agree with your concerns about an appropriate number of categorical variables for some of the examples. I am thinking especially of computer programming as problem solving. Could you take a stab at rewriting the examples in the notes folder that currently use diamonds, substituting the penguins dataset?

I think the easiest workflow will be to fork the course-site repo, then edit the .Rmarkdown files directly. Note that if you try to build the entire site, you will need to knit all the R Markdown files in the repo, which will take some time (and will probably require you to have additional packages installed). If it's easier, just write a fresh .R script for each page that uses diamonds and rework the code there. I/we can update the written narrative later once we know the examples work.

bensoltoff commented on August 11, 2024

Police shootings dataset

bensoltoff commented on August 11, 2024

Need to replace

  • diamonds
  • mtcars
  • mpg
  • Auto

Questionable datasets

  • titanic
  • flights
  • gapminder

bensoltoff commented on August 11, 2024

Deadest names - see #115

deblnia commented on August 11, 2024

Still working on this, specifically with Movies and Snapchat data. (Repo is very minimal now.) Will also look into socviz, deadest names and police shootings.

Two other options (would love to hear what you think): Palmer penguins instead of diamonds, and recent-ish O'Hare/Midway data using anyflights instead of flights.

EDIT: Also flagging Damon Jones' scrape of UCPD stops as a potential alternative to the WaPo Police shooting dataset.

bensoltoff commented on August 11, 2024

Palmer penguins is supposed to be a good drop-in replacement for iris. Not sure if it contains sufficient variables to replace diamonds. We'd need to check how diamonds is used on the website to verify the penguins dataset contains appropriate variables.

Chicago flights data would be nice to replace nycflights13, though I think I only use it for one set of exercises for relational joins.

YinsuH commented on August 11, 2024

I have looked at the penguins dataset and the lecture notes. I would say the penguins data is viable for most of the operations we need. For instance, it could be used for practicing the pipe and writing functions. However, one problem that might be significant is that penguins contains only 344 observations, while diamonds has more than 50k. In the exercises we use characteristics like color and cut, both of which have at least five levels. But the qualitative variables in penguins, species and island, each have only three levels. This suggests a lack of variability in the penguins data, which might cause problems in modeling and make the visualizations less varied than those produced with diamonds.

deblnia commented on August 11, 2024

I don't think we use diamonds in any modelling pages (feel free to correct me if I'm wrong, just searched diamonds on the website), so I don't think the sample size should be disqualifying. The lack of levels for categorical variables is definitely a valid concern though.

YinsuH commented on August 11, 2024

@bensoltoff I have created a pull request for the course site. However, this is my first attempt at updating the website, and some of the work might still have problems. I will continue checking it over the next few days. I have also written up a few questions in the pull request. Please have a look.

bensoltoff commented on August 11, 2024

Household Pulse Survey - assess impact of COVID-19 on households

bensoltoff commented on August 11, 2024

Need to replace

  • diamonds
  • mtcars
    • Need a fully numeric data frame to drop into iteration exercises. Or need to rewrite that exercise
  • mpg
  • Auto

Questionable datasets

  • titanic
  • flights
  • gapminder
