madgrades.com's Introduction

madgrades-extractor

This project reads UW Madison grade distribution and course report PDF files (published by the UW Madison Office of the Registrar) and converts them into CSV or SQL dump files.

You will find published, up-to-date datasets on Kaggle.

https://i.imgur.com/9ZrwRMt.png

Conversion

The conversion process for a single term is as follows:

  1. Open DIR report for the term.

    a. Extract table from PDF (using tabula)

    b. Read each row, adding a new section per row.

    c. Collate section info as necessary (e.g. two instructors for a single section)

    d. Collate courses which appear to be cross-listed (based on similarity between sections offered)

  2. Open grades report for the term.

    a. Extract table from PDF

    b. Read each row and add each section's grade data to the course data added by the DIR report process

Typically all terms are extracted, so this process repeats for each term.
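
Step 1c above can be sketched roughly as follows. This is only an illustration, not the extractor's actual code: the Row class, its columns and the SectionCollator name are all assumptions made for the example. It shows how several rows for the same section (one per instructor) might be merged into a single section entry.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical names throughout; this is not the extractor's actual code.
class SectionCollator {

    /** One row of the extracted DIR table (assumed columns). */
    static final class Row {
        final String course;      // e.g. "COMP SCI 302"
        final String section;     // e.g. "001"
        final String instructor;  // instructor listed on this row

        Row(String course, String section, String instructor) {
            this.course = course;
            this.section = section;
            this.instructor = instructor;
        }
    }

    /**
     * Groups rows by course and section, so a section that appears on
     * several rows (one per instructor) ends up as a single entry that
     * lists all of its instructors.
     */
    static Map<String, Set<String>> collate(List<Row> rows) {
        Map<String, Set<String>> sections = new LinkedHashMap<>();
        for (Row row : rows) {
            String key = row.course + " / " + row.section;
            sections.computeIfAbsent(key, k -> new LinkedHashSet<>()).add(row.instructor);
        }
        return sections;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("COMP SCI 302", "001", "Instructor A"));
        rows.add(new Row("COMP SCI 302", "001", "Instructor B"));
        rows.add(new Row("COMP SCI 302", "002", "Instructor C"));
        collate(rows).forEach((section, instructors) ->
                System.out.println(section + " -> " + instructors));
    }
}
```

Cross-listing detection (step 1d) would need a similarity measure between the sections of two courses, which is not shown here.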

Command Line Usage

Build it yourself with mvn clean install or grab a release from the releases page.

Usage: <main class> [options]
  Options:
    -d, -download
      Download the PDF reports instead of extracting data
      Default: false
    -e, -exclude
      Comma-separated list of term codes to exclude (ex. -e 1082)
    -f, -format
      The output format
      Default: CSV
      Possible Values: [CSV, MYSQL]
    -l, -list
      Output list of terms to extract
      Default: false
    -out, -o
      Output directory path for exported files (ex. -o ../data)
      Default: ./
    -t, -terms
      Comma-separated list of term codes to run (ex. -t 1082,1072)

Examples:

  • java -jar madgrades-final-1.0-SNAPSHOT.jar: will fetch every term and output files to the current directory
  • java -jar madgrades-final-1.0-SNAPSHOT.jar -t 1082: will fetch just term 1082
  • java -jar madgrades-final-1.0-SNAPSHOT.jar -o ../ -t 1082,1072: will fetch terms 1072 and 1082, output to ../
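
As a rough sketch of how the -t and -e values might be combined into the list of terms to run: the TermSelection class below is hypothetical, the term code 1074 is made up for the demo, and the semantics (no -t means every available term, with -e applied afterwards) are an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; the real tool's option handling may differ.
class TermSelection {

    /** Splits a comma-separated option value such as "1082,1072" into term codes. */
    static Set<Integer> parseTermCodes(String csv) {
        Set<Integer> codes = new LinkedHashSet<>();
        if (csv == null || csv.trim().isEmpty()) {
            return codes;
        }
        for (String part : csv.split(",")) {
            codes.add(Integer.parseInt(part.trim()));
        }
        return codes;
    }

    /**
     * Assumed semantics: an empty include list means "every available term";
     * excluded terms are then removed from the result.
     */
    static List<Integer> select(List<Integer> available, String include, String exclude) {
        Set<Integer> included = parseTermCodes(include);
        Set<Integer> excluded = parseTermCodes(exclude);
        List<Integer> selected = new ArrayList<>();
        for (Integer term : available) {
            boolean wanted = included.isEmpty() || included.contains(term);
            if (wanted && !excluded.contains(term)) {
                selected.add(term);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Integer> available = Arrays.asList(1072, 1074, 1082);
        System.out.println(select(available, "1082,1072", null)); // [1072, 1082]
        System.out.println(select(available, null, "1082"));      // [1072, 1074]
    }
}
```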

Relational Diagram

The CSV or SQL dumps consist of a set of relational entities, modeled roughly like this:

diagram

madgrades.com's People

Contributors

dependabot[bot], thekeenant, yashg4509

madgrades.com's Issues

Add instructor searching

Contemplating this still. If you want to know more about a particular instructor, you can't do that right now unless you know all the courses they have taught. It would be nice to have an instructor page which shows:

  • Courses taught
  • Terms taught
  • Charts for the above two

To start with, an instructor search that simply shows the courses they have taught would be enough.

Section / instructor compare

Something that I've always wanted, and a feature that I think would be great for madgrades.com, is a section or instructor compare feature. For instance, if I needed to take CS 200 and there were multiple instructors, a deciding factor might be the average grade students received under each instructor, averaged over the terms that instructor taught the class.

Currently, on the "Explore: Courses" page you can filter by instructor. That being said, filtering by "James Skrentny" shows the same Avg. GPA stat for CS 302 as filtering by "Deb Deppeler". A section / instructor compare feature could show all the instructors who taught the class and rank them by average GPA received.
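
A minimal sketch of the ranking idea, assuming per-instructor grade counts for a course are already available. The InstructorCompare class and the sample counts are hypothetical, and Java is used only to stay consistent with the extractor (the site itself may implement this differently). Grade points follow UW Madison's 4.0 scale (A, AB, B, BC, C, D, F).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; class and method names are not from madgrades.com.
class InstructorCompare {

    /** Grade points on UW Madison's 4.0 scale. */
    static final Map<String, Double> POINTS = new LinkedHashMap<>();
    static {
        POINTS.put("A", 4.0);
        POINTS.put("AB", 3.5);
        POINTS.put("B", 3.0);
        POINTS.put("BC", 2.5);
        POINTS.put("C", 2.0);
        POINTS.put("D", 1.0);
        POINTS.put("F", 0.0);
    }

    /**
     * Average GPA for a map of grade -> student count. Grades without
     * grade points are ignored; returns NaN if no graded students remain.
     */
    static double averageGpa(Map<String, Integer> counts) {
        double total = 0;
        int students = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            Double points = POINTS.get(entry.getKey());
            if (points == null) {
                continue;
            }
            total += points * entry.getValue();
            students += entry.getValue();
        }
        return students == 0 ? Double.NaN : total / students;
    }

    /** Instructor names sorted from highest to lowest average GPA. */
    static List<String> rank(Map<String, Map<String, Integer>> byInstructor) {
        List<String> names = new ArrayList<>(byInstructor.keySet());
        names.sort(Comparator.comparingDouble(
                (String name) -> averageGpa(byInstructor.get(name))).reversed());
        return names;
    }

    public static void main(String[] args) {
        Map<String, Integer> x = new LinkedHashMap<>();
        x.put("A", 10);
        x.put("B", 10);
        Map<String, Integer> y = new LinkedHashMap<>();
        y.put("A", 5);
        y.put("C", 15);
        Map<String, Map<String, Integer>> all = new LinkedHashMap<>();
        all.put("Instructor Y", y);
        all.put("Instructor X", x);
        System.out.println(rank(all)); // [Instructor X, Instructor Y]
    }
}
```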

SEO

The courses page wasn't being rendered by Google's crawler; this was fixed by adding babel-polyfill in 34a9060. But there are more things to do:

  • Change title with page
  • Page headers
  • Body sections describing page info
  • Image alt tags

Structured text?

GradeDistribution Chart Left Margin Too Tight on Saved Image

When I click the Save Image button, the chart in the downloaded image sits extremely tight against the left edge:

madgrades-2018-11-13t06_13_41 344z

I traced the issue to the following line of code. Changing the left margin from -15 to a larger number (e.g. 5) gives a larger left margin, which resolves the problem. The top margin could also be increased.

<BarChart data={data} margin={{ top: 15, right: 5, left: -15, bottom: 20 }}>

madgrades-2018-11-13t06_32_18 024z

Course chart, unable to see percentages

Currently, I don't think it's possible to see the percent of students that received a certain grade on the course bar chart. You can follow the bar over to the y axis, but the number above each bar is the raw number of students and hovering over the bar does nothing. It would be nice to have some way to access the percent of students that got each grade. Thanks.
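
The arithmetic behind such a percentage label is small. A sketch, assuming the chart already has the raw count per grade; GradePercentages is a hypothetical name, and Java is used here only for consistency with the extractor:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the madgrades.com frontend.
class GradePercentages {

    /** Converts grade -> student count into grade -> percent of all students. */
    static Map<String, Double> percentages(Map<String, Integer> counts) {
        int total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // Avoid dividing by zero when a chart has no students at all.
            double percent = total == 0 ? 0.0 : 100.0 * entry.getValue() / total;
            result.put(entry.getKey(), percent);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("A", 30);
        counts.put("B", 50);
        counts.put("C", 20);
        percentages(counts).forEach((grade, percent) ->
                System.out.println(String.format("%s: %.1f%%", grade, percent)));
    }
}
```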

Course "metadata" & other ideas

Information such as the following could potentially be scraped or provided by the community:

  • number of credits
  • Breadth (lit, humanities, social science, etc.)

Course recommendations are popular on the UW Madison subreddit. We could have a moderated comments section on courses with a voting system to rank courses. This would allow for better filtering for students (e.g. I need a 2 credit humanities+lit course that people recommend). Comments would require specifying an instructor and semester, which would also act as a popularity/trending metric of sorts.

GPA calculators

Some simple tools would be handy for students:

  • What is my GPA this semester?
  • How many credits would I need if I average X GPA?
  • What GPA would I need to average if I take N more credits? Or is it not possible?
  • Will I make the dean's list this semester?
  • What GPA would I need to average if I want X GPA after N credits?
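
Two of these calculators can be sketched with simple credit-weighted arithmetic. This is an illustrative sketch only: GpaCalculator and its method names are hypothetical, and a 4.0 maximum is assumed.

```java
// Hypothetical sketch of two of the calculators listed above.
class GpaCalculator {

    static final double MAX_GPA = 4.0; // assumed 4.0 scale

    /** Credit-weighted GPA for one semester. */
    static double semesterGpa(double[] gradePoints, int[] credits) {
        if (gradePoints.length != credits.length || credits.length == 0) {
            throw new IllegalArgumentException("need one credit value per grade");
        }
        double qualityPoints = 0;
        int totalCredits = 0;
        for (int i = 0; i < gradePoints.length; i++) {
            qualityPoints += gradePoints[i] * credits[i];
            totalCredits += credits[i];
        }
        return qualityPoints / totalCredits;
    }

    /**
     * GPA needed over the next extraCredits credits to finish with targetGpa:
     * (target * (current + extra) - currentGpa * current) / extra.
     */
    static double requiredGpa(double currentGpa, int currentCredits,
                              double targetGpa, int extraCredits) {
        if (extraCredits <= 0) {
            throw new IllegalArgumentException("extraCredits must be positive");
        }
        double neededPoints = targetGpa * (currentCredits + extraCredits)
                - currentGpa * currentCredits;
        return neededPoints / extraCredits;
    }

    /** A target is out of reach if it would need more than the maximum GPA. */
    static boolean isReachable(double requiredGpa) {
        return requiredGpa <= MAX_GPA;
    }

    public static void main(String[] args) {
        // An A (4.0) and a B (3.0), three credits each.
        System.out.println(semesterGpa(new double[] {4.0, 3.0}, new int[] {3, 3}));
        // 3.0 GPA over 60 credits, aiming for 3.2 after 30 more credits.
        double needed = requiredGpa(3.0, 60, 3.2, 30);
        System.out.println(String.format("%.2f (reachable: %b)", needed, isReachable(needed)));
    }
}
```

With the sample numbers in main, the required average works out to roughly 3.6, which is reachable. Asking for a 3.5 after 30 more credits from a 2.0 over 90 credits would require an 8.0 average, which is not.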
