group054_wi24's Introduction

This is your group repo for your final project for COGS108.

This repository is private, and is only visible to the course instructors and your group mates; it is not visible to anyone else.

Template notebooks for each component are provided. Work on each notebook only until its due date. Once a submission is due, move on to the next notebook (for example, after the proposal is due, start working in the Data Checkpoint notebook).

This repository will be frozen on the final project due date. No further changes can be made after that time.

Your project proposal and final project will be graded based solely on the corresponding project notebooks in this repository.

Template Jupyter notebooks have been included, with your group number replacing the XXX in the following file names. By each due date, make sure this repository contains a notebook with the corresponding name (where XXX is replaced by your group number):

  • ProjectProposal_groupXXX.ipynb
  • DataCheckpoint_groupXXX.ipynb
  • EDACheckpoint_groupXXX.ipynb
  • FinalProject_groupXXX.ipynb
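
As a quick self-check before a due date, the naming convention above could be verified with a small script. This is only a sketch: the helper names are hypothetical, and the group number "054" used in the usage note is an assumption taken from the repository name group054_wi24.

```python
from pathlib import Path

# Notebook name stems listed above; "XXX" stands for the group number.
REQUIRED_STEMS = [
    "ProjectProposal",
    "DataCheckpoint",
    "EDACheckpoint",
    "FinalProject",
]


def expected_notebooks(group_number: str) -> list[str]:
    """Return the four required file names for a given group number."""
    return [f"{stem}_group{group_number}.ipynb" for stem in REQUIRED_STEMS]


def missing_notebooks(repo_dir: str, group_number: str) -> list[str]:
    """List the required notebooks that are not present in repo_dir."""
    root = Path(repo_dir)
    return [
        name
        for name in expected_notebooks(group_number)
        if not (root / name).is_file()
    ]
```

Calling `missing_notebooks(".", "054")` from the repository root would return an empty list when all four notebooks are in place.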

This is your repo. You are free to manage it as you see fit: edit this README, add data files, add scripts, etc. So long as the four files above are present on their due dates with the required information, the rest is up to you all.

Also, you are free and encouraged to share this project after the course and to add it to your portfolio. Just be sure to fork it to your GitHub at the end of the quarter!

group054_wi24's People

Contributors

leoflur, dodecadonk, ivannchenn, shanellis, feyaaaa

group054_wi24's Issues

Project Proposal Feedback

Score (out of 9 pts)

Score = 9

Feedback:

Section | Quality | Reasons
Abstract | NA |
Research question | D | This could use some work to make it clearer. I suggest something like "Do the factors (1) primary revenue source and (2) historical political leanings influence a state's 'self-sufficiency', as defined by the amount the state has received from the Federal Government for disaster relief?" I do wonder whether what you are primarily interested in is a state's GDP. In your background you state "our project takes a different angle by using GDP as a measure of a states self sufficiency and seeing how that correlates to how much disaster relief aid that state get". I'd imagine using GDP as a variable would then be interesting (you could still use primary revenue source as a variable to look into as well).
Background | P |
Hypothesis | P | Good, just make it a bit clearer: "We predict X, Y are associated with an increase in federal disaster relief funding (normalized for state GDP). We predict W, Z are associated with a decrease in ..."
Data | D | Missing primary industry of income in the ideal dataset. Also, what time frame would be ideal to answer this question? I would imagine since the beginning of the USA. Worth being clearer on "The amount of [federal?] money spent on disaster relief by [to?] each state".
Ethics | D | "When considering the ethical implications of our project, we believe that we have conducted it in a legal and fair manner". But you have not conducted the analysis yet! What if the result is harmful to a particular demographic? Consider the unintended consequences of such an analysis.
Team expectations | P |
Timeline | P |

Rubric

Criterion | Unsatisfactory | Developing | Proficient | Excellent
Abstract | The abstract is confusing or fails to offer important details about the issue, variables, context, or methods of the project. | The abstract lacks relevance or fails to offer appropriate details about the issue, variables, context, or methods of the project. | The abstract is relevant, offering details about the research project. | The abstract is informative, succinct, and clear. It offers specific details about the educational issue, variables, context, and proposed methods of the study.
Research question | The research issue remains unclear. The research purpose, questions, hypotheses, definitions or variables, and controls are still largely undefined, or they are poorly formed, ambiguous, or not logically connected to the description of the problem. Unclear connections to the literature. | The research issue is identified, but the statement is too broad or fails to establish the importance of the problem. The research purpose, questions, hypotheses, definitions or variables, and controls are poorly formed, ambiguous, or not logically connected to the description of the problem. Unclear connections to the literature. | Identifies a relevant research issue. Research questions are succinctly stated, connected to the research issue, and supported by the literature. Variables and controls have been identified and described. Connections are established with the literature. | Presents a significant research problem. Articulates clear, reasonable research questions given the purpose, design, and methods of the project. All variables and controls have been appropriately defined. Proposals are clearly supported by the research and theoretical literature. All elements are mutually supportive.
Background | Did not have at least 2 reliable and relevant sources. Or relevant sources were not used in relevant ways. | A key component was not connected to the research literature. Selected literature was from unreliable sources. Literary supports were vague or ambiguous. | Key research components were connected to relevant, reliable theoretical and research literature. | The narrative integrates critical and logical details from the peer-reviewed theoretical and research literature. Each key research component is grounded in the literature. Attention is given to different perspectives, threats to validity, and opinion vs. evidence.
Hypothesis | Lacks most details; vague or interpretable in different ways. Or seems completely unrealistic. | A key detail to understand the hypothesis or the rationale behind it was not described well enough. | The hypothesis is clear. All elements needed to understand the rationale were described in sufficient detail. | The hypothesis and its rationale were described succinctly and with clarity about how they are connected to each other.
Data | Did not describe ideal dataset fully AND does not include at least one reference to an external source of data. | Either does not describe the ideal dataset fully OR does not include at least one reference to an external source of data that could be used to answer the proposed question. | Ideal dataset(s) well-described and includes everything needed for answering question(s) posed. Includes at least one reference to a source of data that would be needed to fully answer the question proposed. | Ideal dataset(s) well-described and includes everything needed for answering question(s) posed. Includes references to all sources of data that would be needed to fully answer the question proposed. The details of the descriptions also make it clear how they support the needs of the project and discuss the differences between the ideal and real datasets.
Ethics | No effort or just says we have no ethical concerns. | Minimal ethical section; probably just talks about data privacy and no unintended consequences discussion. Ethical concerns raised seem irrelevant. | The ethical concerns described are appropriate and sufficient. | Ethical concerns are described clearly and succinctly. This was clearly a thorough and nuanced approach to the issues.
Team expectations | Lack of expectations. | The list of expectations feels incomplete and perfunctory. | It feels like the list of expectations is complete and seems appropriate. | The list clearly was the subject of a thoughtful approach and already indicates a well-working team.
Timeline | Lack of timeline. Or timeline is completely unrealistic. | The timeline feels incomplete and perfunctory. The timeline feels either too fast or too slow for the progress you expect a group can make. | It feels like the timeline is complete and appropriate. It can likely be completed as is in the available amount of time. | The timeline was clearly the subject of a thoughtful approach and indicates that the team has a detailed plan that seems appropriate and completable in the allotted time.

Scoring: Out of 9 points

  • Each Developing => -0.75 pts
  • Each Unsatisfactory/Missing => -1.5 pts
    • until the score is 0

If students address the detailed feedback in a future checkpoint, they will earn these points back.
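
The deduction rule above can be sketched as a short function. This is an illustrative reading of the rule, not the graders' actual tooling; the function name and the single-letter rating codes (with "U" standing for Unsatisfactory/Missing) are assumptions made for the example.

```python
# Deductions per rating, as stated in the scoring rules above.
# "D" = Developing, "U" = Unsatisfactory/Missing (the code "U" is assumed).
DEDUCTIONS = {"D": 0.75, "U": 1.5}
MAX_SCORE = 9.0


def proposal_score(ratings):
    """Start from 9 points, subtract each deduction, and stop at 0."""
    total = MAX_SCORE - sum(DEDUCTIONS.get(r, 0.0) for r in ratings)
    return max(total, 0.0)


# Hypothetical ratings: two Developing and one Unsatisfactory section.
print(proposal_score(["P", "D", "D", "U"]))  # 9 - 0.75 - 0.75 - 1.5 = 6.0
```

Ratings with no listed deduction (P, E, NA) leave the score unchanged, and the `max` call implements the "until the score is 0" floor.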

Comments

Project Checkpoint Feedback

Score (out of 5 pts)

Score = 5

Data Checkpoint Feedback

Section | Quality | Reasons
Data relevance | E |
Data description | E |
Data wrangling | P |

Comments

Proposal Regrade Feedback

Good, full points awarded.

Rubric

Criterion | Unsatisfactory | Developing | Proficient | Excellent
Data relevance | Did not have data relevant to their question. Or the datasets don't work together because there is no way to line them up against each other. If there are multiple datasets, most of them have this trouble. | Data was only tangentially relevant to the question or a bad proxy for the question. If there are multiple datasets, some of them may be irrelevant or can't be easily combined. | All data sources are relevant to the question. | Multiple data sources for each aspect of the project. It's clear how the data supports the needs of the project.
Data description | Dataset or its cleaning procedures are not described. If there are multiple datasets, most have this trouble. | Data was not fully described. If there are multiple datasets, some of them are not fully described. | Data was fully described. | The details of the data descriptions and perhaps some very basic EDA also make it clear how the data supports the needs of the project.
Data wrangling | Did not obtain data. They did not clean/tidy the data they obtained. If there are multiple datasets, most have this trouble. | Data was partially cleaned or tidied. Perhaps you struggled to verify that the data was clean because they did not present it well. If there are multiple datasets, some have this trouble. | The data is cleaned and tidied. | The data is spotless and they used tools to visualize the data cleanliness and you were convinced at first glance.

Grading Rules

Scoring: Out of 5 points

  • Each Developing => -1 pts
  • Each Unsatisfactory => -2 pts
    • until the score is 0

If students address the detailed feedback in a future checkpoint they will earn these points back
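
For the checkpoints, the same kind of arithmetic applies with whole-point deductions out of 5. The sketch below is only an illustration of that rule; the function name and the letter code "U" for Unsatisfactory are assumptions.

```python
def checkpoint_score(ratings, max_score=5):
    """Apply -1 per Developing and -2 per Unsatisfactory, floored at 0."""
    deductions = {"D": 1, "U": 2}  # "U" is an assumed code for Unsatisfactory
    lost = sum(deductions.get(r, 0) for r in ratings)
    return max(max_score - lost, 0)


# Ratings from the Data Checkpoint table above: E, E, P.
print(checkpoint_score(["E", "E", "P"]))  # no deductions, so 5

# Hypothetical worst case: the floor keeps the score from going negative.
print(checkpoint_score(["U", "U", "U"]))  # 5 - 6 is floored to 0
```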

DETAILED FEEDBACK should be left in the data section AND anywhere the student addressed proposal feedback but did not do it to your satisfaction

EDA Checkpoint Feedback

Score (out of 5 pts)

Score = 5

EDA Checkpoint Feedback

Section | Quality | Reasons
EDA relevance | E |
EDA analysis and description | E |
EDA figures | E |

Comments

Good job!

Regrade Feedback

Rubric

Criterion | Unsatisfactory | Developing | Proficient | Excellent
EDA relevance | EDA is mostly neither relevant to the question nor helpful in figuring out how to address the question. Or the EDA does address the question, but many obviously relevant variables / analyses / figures were not included. EDA does not explore distributions of single variables or relationships between variables or both. | EDA is partly irrelevant/unhelpful. Or some obviously relevant variables / analyses / figures were not included. EDA does not include a few distributions of single variables or relationships between variables. | EDA is almost all relevant / helpful in addressing the question. No obviously relevant variables / analyses / figures were omitted. | Thorough EDA addressed all aspects that are relevant to the question.
EDA analysis and description | Many of the analyses are poor choices (e.g., using means instead of medians for obviously skewed data), or are poorly described in the text, or do not aid understanding the data. | Some of the analyses are poor choices, or are poorly described in the text, or do not aid understanding the data. | All analyses are correct choices. Only one or two have minor issues in the text descriptions supporting them. Mostly they fit well with other elements of the EDA and support understanding the data. | All analyses are correct choices with clear text descriptions supporting them. The figures fit well with the other elements of the EDA, producing a clear understanding of the data.
EDA figures | Many of the figures are poor plot choices (e.g., using a bar plot to represent a time series where it would be better to use a line plot) or have poor aesthetics (including colormap, data point shape/color, axis labels, titles, annotations, text legibility) or do not aid understanding the data. | Some of the figures are poor plot choices or have poor aesthetics. Some figures do not aid understanding the data. | All figures are correct plot choices. Only one or two have minor questionable aesthetic choices. The figures mostly fit well with the other elements of the EDA and support understanding the data. | All figures are correct plot choices with beautiful aesthetics. The figures fit well with the other elements of the EDA, producing a clear understanding of the data.

Grading Rules

Scoring: Out of 5 points

  • Each Developing => -1 pts
  • Each Unsatisfactory => -2 pts
    • until the score is 0

If students address the detailed feedback in a future checkpoint they will earn these points back

DETAILED FEEDBACK should be left in the data section AND anywhere the student addressed proposal feedback but did not do it to your satisfaction

Your Assigned Grader

Hi group 54, your assigned grader is Matthew Feigelis.

This person will be responsible for reading, grading, and providing feedback on your final project throughout the entire quarter. If you and your teammates have project-specific questions, it is strongly recommended that you reach out to your assigned grader and/or attend their office hours, as they will have the deepest understanding of your project.

Final Project Grade

Score (out of 15 pts)

Score = 14.5

Rubric

Notes: A checked box means that you earned this rubric item. An unchecked box indicates that you did not (fully) earn it. See the Notes for details.

  • Overview (0.5 pt):

    • Write a clear overview of the project including results (0.25pts)
    • Include - Names/Title (0.25)
    • Notes:
  • Research Question (0.5 pt):

    • Include a specific, clear data science question relevant to the scope of the course. (0.25)
    • Variables needed to answer the question are clear (0.25)
    • Notes:
  • Background & Prior Work and Hypothesis (0.5 pts):

    • Cite and explain the work done previously and how you used info from the same. (0.25)
    • Relevant/Cogent hypothesis included and explained clearly (0.25)
    • Notes:
  • Data Description/ Datasets (0.75 pts):

    • Datasets clearly stated, with source, links, number of observations, and nature of observations (0.5)
    • Description of data attributes and dataset provided (0.25)
    • Notes:
  • Data Cleaning/Processing (0.75 pts):

    • The cleaning procedure/requirement needs to be shown or stated; if no cleaning was required, that should be stated too, along with the reason. (0.5)
    • Cleaned data should be demonstrated (0.25)
  • Data Visualizations (3 pts): minimum of 3 viz required; divide points by N visualizations

    • Plots that make sense & give useful information (2)
    • Figure explains itself on axes/legend/caption OR text surrounding explains it (1)
  • Data Analysis and Results (4.5 pts) divide points by N analyses performed

    • Analysis chosen was appropriate to answer research question (1.5 pts)
    • Analysis was performed in a technically correct manner (1.5 pts)
    • Output of analysis interpreted and interpretation included in notebook (1.5 pts)
    • Notes:
  • Privacy/Ethics Considerations (1.5 pts):

    • Thoughtful discussion of ethical concerns. NB: bare minimum examine potential unintended consequences of work and sources of bias (1 pts)
    • Ethical concerns consider the whole data science process (question asked, data collected, data being used, the bias in data, analysis, post-analysis, etc.) (0.25 pts)
    • How your group handled bias/ethical concerns clearly described (0.25 pts)
    • Notes:
  • Conclusion & Discussion (1.5 pts):

    • Clear conclusion (answer to the question being asked) and discussion of results (1 pts)
    • Limitations of analysis discussed (0.5 pts)
    • Does not ramble on beyond providing necessary information
  • Documentation/ Written Communication (1.5 pts):

    • Code errors / code purpose not clear in comments or surrounding text (0.75 pts)
    • Narrative structure of the report: (0.75pts)
      • The report clearly supports the main points of the conclusion
      • Doesn’t go down a bunch of side streets that aren’t important
    • Notes:
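
As a sanity check on the rubric above, the section weights can be summed to confirm they add up to the 15-point total. The sketch below also shows one possible reading of "divide points by N": each visualization or analysis carries an equal share of its section's points. That even-split interpretation and the helper name are assumptions, not something the rubric spells out.

```python
# Section weights copied from the rubric above.
SECTION_POINTS = {
    "Overview": 0.5,
    "Research Question": 0.5,
    "Background & Prior Work and Hypothesis": 0.5,
    "Data Description/Datasets": 0.75,
    "Data Cleaning/Processing": 0.75,
    "Data Visualizations": 3.0,
    "Data Analysis and Results": 4.5,
    "Privacy/Ethics Considerations": 1.5,
    "Conclusion & Discussion": 1.5,
    "Documentation/Written Communication": 1.5,
}


def per_item_points(section, n_items):
    """Split a section's points evenly across N items (assumed reading)."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return SECTION_POINTS[section] / n_items


total_points = sum(SECTION_POINTS.values())
print(total_points)  # 15.0, matching "Score (out of 15 pts)"

# With the minimum of 3 visualizations, each would carry 1 point.
print(per_item_points("Data Visualizations", 3))  # 1.0
```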

Comments

good job overall

Regrade from Previous Checkpoints
