YohanJeong_Portfolio

This portfolio contains data-related projects.

Medium Post

  • Scraped the Boxofficemojo website using Scrapy in Python.
  • Checked all the movies released in the US during certain periods of time and extracted useful information about the individual movies.
  • Scraped Domestic Revenues, Worldwide Revenues, Distributor, Opening, Budget, MPAA, Genres, and In Release for each movie.

Medium Post

  • Scraped the Boxofficemojo and Traileraddict websites to get movie information.
  • Explored movie features such as budget, distributors, MPAA, and genres.
  • Examined whether the variation in the promotion period is related to such features.

Medium Post

  • Analyzed the factors related to housing prices in Melbourne and predicted housing prices using several machine learning techniques.
  • Employed Linear Regression, Ridge Regression, K-Nearest Neighbors, and Decision Tree.
  • Found the optimal hyperparameter values for each model using cross-validation and grid search.
  • Compared the results to find the best machine learning model to predict the housing prices in Melbourne.
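
The tuning step can be illustrated with a minimal, standard-library sketch of grid search scored by k-fold cross-validation. It is not the project's code: the data is synthetic, there is only one feature, and the helper names (knn_predict, cv_mse, grid_search) are hypothetical. It tunes only the number of neighbours for a K-Nearest Neighbors regressor.

```python
import random
from statistics import mean

def knn_predict(train, x, k):
    """Predict y for x as the mean target of the k nearest training rows."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return mean(y for _, y in nearest)

def cv_mse(data, k, n_folds=5):
    """Mean squared error of k-NN, averaged over n_folds held-out folds."""
    fold_errors = []
    for fold in range(n_folds):
        # Rows whose index is congruent to `fold` form the validation fold.
        valid = data[fold::n_folds]
        train = [row for i, row in enumerate(data) if i % n_folds != fold]
        fold_errors.append(
            mean((knn_predict(train, x, k) - y) ** 2 for x, y in valid)
        )
    return mean(fold_errors)

def grid_search(data, k_grid, n_folds=5):
    """Score every candidate k and return the one with the lowest CV error."""
    scores = {k: cv_mse(data, k, n_folds) for k in k_grid}
    best_k = min(scores, key=scores.get)
    return best_k, scores

# Synthetic stand-in data (assumption): y = 3x plus Gaussian noise.
rng = random.Random(0)
data = [(x, 3.0 * x + rng.gauss(0, 1.0))
        for x in (rng.uniform(0, 10) for _ in range(100))]

best_k, scores = grid_search(data, k_grid=[1, 3, 5, 9, 15])
print("best k:", best_k)
```

The same loop extends to the other models by swapping the prediction function and the grid, for example the regularization strength for Ridge Regression or the maximum depth for a Decision Tree.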

Medium Post 1 / Medium Post 2

  • Converted the data in a single spreadsheet into a relational database for SQL.
  • Performed several SQL queries using the database.
  • Scraped over 3,000 job postings for 'Data Analyst' from the Glassdoor website using the Selenium library in Python.
  • Cleaned the scraped data using Python.
  • Converted the data into a relational database format so that it can be stored and queried with SQL.
  • Visualized the data using Tableau, showing the salary distributions by state, city, sector, and skills.
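
As a rough sketch of the spreadsheet-to-database conversion described above, the snippet below splits a flat table into two related tables with Python's built-in sqlite3 module and then runs a query across them. The column names, table names, and rows are invented for illustration and do not come from the project's data.

```python
import csv
import io
import sqlite3

# Hypothetical flat export: company details repeat on every row.
flat_csv = """job_title,company,sector,state,salary
Data Analyst,Acme,Retail,CA,70000
Data Analyst,Acme,Retail,NY,75000
Senior Data Analyst,Globex,Finance,NY,95000
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (
        id     INTEGER PRIMARY KEY,
        name   TEXT UNIQUE NOT NULL,
        sector TEXT
    );
    CREATE TABLE posting (
        id         INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        state      TEXT,
        salary     INTEGER,
        company_id INTEGER NOT NULL REFERENCES company(id)
    );
""")

for row in csv.DictReader(io.StringIO(flat_csv)):
    # Store each company once; repeated names are skipped by the UNIQUE rule.
    conn.execute(
        "INSERT OR IGNORE INTO company (name, sector) VALUES (?, ?)",
        (row["company"], row["sector"]),
    )
    (company_id,) = conn.execute(
        "SELECT id FROM company WHERE name = ?", (row["company"],)
    ).fetchone()
    conn.execute(
        "INSERT INTO posting (title, state, salary, company_id) "
        "VALUES (?, ?, ?, ?)",
        (row["job_title"], row["state"], int(row["salary"]), company_id),
    )

# Example query: average salary by state, joined back to the company table.
avg_by_state = conn.execute("""
    SELECT p.state, AVG(p.salary)
    FROM posting AS p
    JOIN company AS c ON c.id = p.company_id
    GROUP BY p.state
    ORDER BY p.state
""").fetchall()
print(avg_by_state)
```

Moving the repeated company columns into their own table is the main design choice here: each company is stored once and postings refer to it by key.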

  • Implemented a cohort analysis using eCommerce data from the UCI Machine Learning Repository.
  • Showed how to create the matrix for cohort analysis from the raw eCommerce data.
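
One way to build such a cohort matrix from raw order rows is sketched below using only the standard library. The order rows are made up, and the sketch assumes a cohort is defined by the month of a customer's first purchase, which is a common convention but may differ from the project's definition.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw rows: (customer_id, order_date).
orders = [
    ("c1", date(2011, 1, 5)), ("c1", date(2011, 2, 9)), ("c1", date(2011, 3, 1)),
    ("c2", date(2011, 1, 20)), ("c2", date(2011, 3, 15)),
    ("c3", date(2011, 2, 2)), ("c3", date(2011, 3, 30)),
]

def month_index(d):
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + d.month

# Step 1: a customer's cohort is the month of their first order.
first_month = {}
for customer, day in orders:
    m = month_index(day)
    first_month[customer] = min(first_month.get(customer, m), m)

# Step 2: collect the distinct customers active at each (cohort, offset).
active = defaultdict(set)
for customer, day in orders:
    cohort = first_month[customer]
    active[(cohort, month_index(day) - cohort)].add(customer)

# Step 3: lay the counts out as rows (cohorts) by columns (months since start).
def cohort_matrix():
    cohorts = sorted(set(first_month.values()))
    max_offset = max(offset for _, offset in active)
    return {
        cohort: [len(active.get((cohort, off), ())) for off in range(max_offset + 1)]
        for cohort in cohorts
    }

matrix = cohort_matrix()
for cohort, row in matrix.items():
    print(cohort, row)
```

Dividing each row by its first entry turns the counts into retention rates.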

Medium Post

  • Used a movie dataset from MovieLens, which contains 9,742 movies.
  • Quantified the movie features using term frequency-inverse document frequency (TF-IDF).
  • Calculated the similarities between movies using cosine similarity.
  • Added the 'Did you mean...?' function to the recommender in order to make the searching process easier.
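
The pipeline above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's implementation: the four-movie catalogue is invented, the TF-IDF weighting is the plain tf × log(N/df) variant (libraries often use a smoothed form), and the 'Did you mean...?' step is approximated with difflib.get_close_matches from the standard library.

```python
import math
from collections import Counter
from difflib import get_close_matches

# Hypothetical catalogue: title -> space-separated feature text (genres here).
movies = {
    "Toy Story": "adventure animation children comedy fantasy",
    "Jumanji": "adventure children fantasy",
    "Heat": "action crime thriller",
    "Casino": "crime drama",
}

def tfidf_vectors(docs):
    """Return {title: {term: tf * log(N / df)}} for every document."""
    tokenized = {title: text.split() for title, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))
    vectors = {}
    for title, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[title] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0

def recommend(query, top_n=2):
    """Return (matched_title, most similar titles); fuzzy-match unknown titles."""
    vectors = tfidf_vectors(movies)
    if query not in vectors:
        # "Did you mean ...?": fall back to the closest known title, if any.
        guess = get_close_matches(query, list(movies), n=1, cutoff=0.6)
        if not guess:
            return None, []
        query = guess[0]
    ranked = sorted(
        (title for title in vectors if title != query),
        key=lambda title: cosine(vectors[query], vectors[title]),
        reverse=True,
    )
    return query, ranked[:top_n]

matched, similar = recommend("Toy Stroy", top_n=1)  # note the typo in the query
print(f"Did you mean {matched!r}? Similar: {similar}")
```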

Medium Post

  • Used a sample rating dataset of 10 movies and 10 users.
  • Found movies similar to a selected movie using NearestNeighbors() from the sklearn library with the cosine similarity metric.
  • Predicted the user's unknown rating for the movie as the weighted average of that user's ratings for the similar movies.
  • Built a movie recommender using the algorithm and applied it to the real movie dataset.
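
The prediction rule described above can be sketched without sklearn. The tiny ratings table is made up, missing ratings are treated as zeros when computing cosine similarity (an assumption), and the helper names are hypothetical. The project itself used NearestNeighbors() for the neighbour search; this version simply sorts by similarity.

```python
import math

# Hypothetical ratings: user -> {movie: rating}; absent entries are unknown.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2},
    "u3": {"A": 1, "B": 2, "C": 5},
    "u4": {"A": 5, "C": 1},  # u4 has not rated movie B
}

def item_vector(movie):
    """Ratings a movie received, keyed by user (unrated users are omitted)."""
    return {user: r[movie] for user, r in ratings.items() if movie in r}

def cosine(a, b):
    """Cosine similarity of two sparse vectors; missing entries count as 0."""
    dot = sum(value * b.get(user, 0) for user, value in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def predict(user, movie, k=2):
    """Similarity-weighted average of the user's ratings for the k nearest movies."""
    target = item_vector(movie)
    rated = ratings[user]
    neighbours = sorted(
        ((cosine(target, item_vector(other)), other)
         for other in rated if other != movie),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return None  # no similar movie rated by this user
    return sum(sim * rated[other] for sim, other in neighbours) / total

prediction = predict("u4", "B")
print(round(prediction, 2))
```

Because the weights are similarities, the prediction always falls between the lowest and highest rating the user gave to the neighbouring movies.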
