Amazon Vine Analysis

Overview

Amazon Vine is an invitation-only program in which reviewers receive free items in exchange for product reviews. Vine reviews carry a disclaimer on the Amazon website for transparency. This project aims to confirm that these paid reviews show no positive bias. The analysis was performed in Google Colaboratory with PySpark and saved as a Jupyter notebook (.ipynb).

Results

How many Vine Reviews and Non-Vine Reviews were there?

The dataset contains Amazon reviews in the Books product category. The schema includes:

  • marketplace
  • customer id
  • review id
  • product id
  • book title
  • star rating
  • helpful votes
  • total votes
  • vine designation
  • review information

To clean the dataset, I first kept only the reviews where total_votes >= 20 and the ratio of helpful votes to total votes was at least 50%.
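The filter can be sketched in plain Python as a stand-in for the PySpark used in the notebook. The rows are made-up sample data, and `helpful_votes` is an assumed column name based on the schema list above; only `total_votes` is named in the original text.

```python
# Hypothetical sample rows; the real dataset has far more rows and columns.
reviews = [
    {"review_id": "R1", "helpful_votes": 18, "total_votes": 20},
    {"review_id": "R2", "helpful_votes": 5, "total_votes": 30},
    {"review_id": "R3", "helpful_votes": 9, "total_votes": 10},
    {"review_id": "R4", "helpful_votes": 25, "total_votes": 50},
]


def passes_vote_filter(row, min_votes=20, min_ratio=0.5):
    """Keep a review only if it has enough votes and enough of them are helpful."""
    if row["total_votes"] < min_votes:
        return False
    return row["helpful_votes"] / row["total_votes"] >= min_ratio


filtered = [row for row in reviews if passes_vote_filter(row)]
print([row["review_id"] for row in filtered])  # ['R1', 'R4']
```

Checking the vote count first also avoids dividing by zero for reviews that have no votes at all.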

Then I split the filtered data into two dataframes based on the vine designation, giving separate datasets for paid (Vine) and unpaid (non-Vine) reviews. Counting the rows of each dataframe gives the number of reviews in each group.

[Figure: Number of Reviews]
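As a rough illustration of the split-and-count step, here is a plain-Python sketch rather than the notebook's PySpark code. It assumes the vine designation is stored in a column named `vine` with values "Y" and "N"; both the name and the values are assumptions, and the rows are invented.

```python
# Hypothetical, already-filtered sample rows.
filtered_reviews = [
    {"review_id": "R1", "vine": "Y"},
    {"review_id": "R2", "vine": "N"},
    {"review_id": "R3", "vine": "N"},
]

# Split on the (assumed) vine flag into paid and unpaid groups.
vine_reviews = [r for r in filtered_reviews if r["vine"] == "Y"]
non_vine_reviews = [r for r in filtered_reviews if r["vine"] == "N"]

# The row count of each group answers "how many reviews were there?".
vine_count = len(vine_reviews)
non_vine_count = len(non_vine_reviews)
print(vine_count, non_vine_count)  # 1 2
```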

How many Vine reviews were 5 stars? How many non-Vine reviews were 5 stars?

Filtering each dataset on the "star_rating" column gives the number of 5-star reviews in each group.

[Figure: Number of 5 Star Reviews]
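A minimal plain-Python sketch of this count, standing in for the PySpark filter in the notebook. The helper name `count_five_star` and the sample ratings are hypothetical; `star_rating` is the column named above.

```python
# Hypothetical sample groups; the real dataframes come from the earlier split.
vine_reviews = [{"star_rating": 5}, {"star_rating": 4}, {"star_rating": 5}]
non_vine_reviews = [{"star_rating": 5}, {"star_rating": 2}]


def count_five_star(rows):
    """Count the rows whose star_rating is exactly 5."""
    return sum(1 for row in rows if row["star_rating"] == 5)


vine_five_star = count_five_star(vine_reviews)
non_vine_five_star = count_five_star(non_vine_reviews)
print(vine_five_star, non_vine_five_star)  # 2 1
```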

What percentage of Vine reviews were 5 stars? What percentage of non-Vine reviews were 5 stars?

Expressing the 5-star reviews as a percentage of all reviews in each group makes the two groups comparable even though their sizes differ.

[Figure: Percentage of 5 Star Reviews]
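The percentage is the 5-star count divided by the group's total count, times 100. The sketch below uses made-up counts, not the project's results, and the function name is hypothetical.

```python
def five_star_percentage(five_star_count, total_count):
    """Return the share of 5-star reviews as a percentage, rounded to 2 places."""
    if total_count == 0:
        return 0.0  # an empty group has no meaningful share
    return round(100 * five_star_count / total_count, 2)


# Hypothetical counts chosen only to illustrate the arithmetic.
small_group = five_star_percentage(3, 8)
large_group = five_star_percentage(45, 100)
print(small_group, large_group)  # 37.5 45.0
```

Because each count is divided by its own group's total, a group of 8 reviews and a group of 100 reviews end up on the same scale.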

Summary

There is no strong positivity bias for reviews in the Vine program in the Books category. 40.52% of Vine reviews were 5 stars, compared with 45.71% of unpaid reviews, so Vine reviews were 5.19 percentage points less likely to receive the top rating. One additional analysis to further confirm this would be to compare the shares of the other star ratings, to see whether the two groups have a similar distribution or whether either is skewed.
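The suggested follow-up could look something like the sketch below, which computes the share of each star rating for both groups. This was not part of the original analysis; the ratings are invented and `rating_shares` is a hypothetical helper written in plain Python rather than PySpark.

```python
from collections import Counter


def rating_shares(star_ratings):
    """Return the percentage of reviews at each star level from 1 to 5."""
    total = len(star_ratings)
    if total == 0:
        return {stars: 0.0 for stars in range(1, 6)}
    counts = Counter(star_ratings)
    return {stars: round(100 * counts[stars] / total, 2) for stars in range(1, 6)}


# Hypothetical star ratings for each group.
vine_ratings = [5, 4, 4, 3, 5]
non_vine_ratings = [5, 5, 1, 4, 2, 5, 3, 5]

vine_shares = rating_shares(vine_ratings)
non_vine_shares = rating_shares(non_vine_ratings)

# Print the two distributions side by side for comparison.
for stars in range(1, 6):
    print(stars, vine_shares[stars], non_vine_shares[stars])
```

Comparing the two distributions rating by rating would show whether the groups differ only at 5 stars or across the whole scale.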

Contributors

alydavis
