Kickstarter is a US-based global crowdfunding platform focused on bringing funding to creative projects. Since the platform's launch in 2009, the site has hosted over 159,000 successfully funded projects with over 15 million unique backers. Kickstarter uses an "all-or-nothing" funding system: funds are disbursed only for projects that meet the original funding goal set by the creator.
Kickstarter earns a 5% commission on projects that are successfully funded. Currently, fewer than 40% of projects on the platform succeed. Our objective is to predict which projects are likely to succeed so that these projects can be highlighted on the site through 'staff picks' or 'featured product' lists.
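The all-or-nothing model described above can be sketched in a few lines. This is an illustrative example only; the function and variable names are our own, not part of Kickstarter's actual system or the project code.

```python
# Illustrative sketch of Kickstarter's all-or-nothing payout model.
COMMISSION_RATE = 0.05  # Kickstarter's 5% fee on successfully funded projects


def disbursement(pledged: float, goal: float) -> float:
    """Return the amount paid out to the creator.

    Funds are disbursed only if the campaign meets its goal;
    Kickstarter keeps a 5% commission on the pledged total.
    """
    if pledged < goal:
        return 0.0  # all-or-nothing: goal not met, nothing is disbursed
    return pledged * (1 - COMMISSION_RATE)


print(disbursement(12_000, 10_000))  # goal met: pledged total minus the 5% fee
print(disbursement(8_000, 10_000))   # goal missed: creator receives nothing
```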
Name | Github Page | Personal Website |
---|---|---|
Nateé Johnson | nateej1 | --- |
Misha Berrien | mishaberrien | www.mishaberrien.com |
- Machine Learning
- Data Visualization
- Predictive Modeling
- Python
- Pandas, Jupyter
In order to increase the number of successful campaigns, we propose two related solutions:
- Predict successful campaigns and promote those with the highest predicted probability of success.
- Contact creators from those campaigns that are just below the “success” margin and give them insights that will help them succeed.
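The first step above (scoring campaigns by predicted probability of success) can be sketched as follows. This is a hedged toy example: the features and synthetic data are placeholders, not the project's actual feature set or model.

```python
# Sketch of the proposed ranking step: train a classifier on labelled
# campaigns, then sort live campaigns by predicted probability of success.
# Feature columns and the toy data below are illustrative placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy training data: [goal in $1000s, campaign length in days, n_reward_tiers]
X_train = rng.normal(loc=[20, 30, 8], scale=[10, 5, 3], size=(200, 3))
y_train = (X_train[:, 0] < 20).astype(int)  # pretend smaller goals succeed

model = LogisticRegression().fit(X_train, y_train)

# Rank some hypothetical live campaigns from most to least likely to succeed.
X_live = np.array([[5, 30, 10], [50, 25, 4], [18, 35, 7]], dtype=float)
p_success = model.predict_proba(X_live)[:, 1]
ranking = np.argsort(p_success)[::-1]
print(ranking, p_success[ranking])
```

Campaigns at the top of the ranking would be candidates for promotion, while those just below the decision threshold are the ones to contact with targeted insights.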
- Clone this repo.
- A sample of the deduplicated dataset can be found in the data_sample folder here.
- To reproduce the results, first open the "results" file located in the results folder here, then change the two file paths at the beginning of the document
from:

```python
kick_deduped = pd.read_csv('../../data/02_intermediate/kick_deduped.csv.zip')
cluster_features_df = pd.read_csv('../../data/03_processed/KNN_cluster_features_.csv')
```
to:

```python
kick_deduped = pd.read_csv('../../data_sample/kick_deduped_sample.csv.zip')
cluster_features_df = pd.read_csv('../../data_sample/KNN_cluster_features_.csv')
```
Then run the results file.
- The data processing/transformation scripts are kept in the src folder here.
- A data dictionary can be found in the references folder here.
This file structure is based on the DSSG machine learning pipeline.