Git Product home page Git Product logo

bitcoin's Introduction

Bitcoin Price Prediction

Demo on Youtube

Introduction

This repository contains code on building a serverless data engineering pipeline and results from SageMaker machine learning models to predict Bitcoin price and cluster community labels.

It consists of two parts:

Part 1: Lambda/Web Scraper

It contains two sets of lambda function, .Price and .Text.

producerPrice: triggered by CloudWatch presumably once a day to scrape the specific website, merge into DynamoDB to ensure uniqueness, and then send to SQS instance "producer".

consumerPrice: triggered by EventWatch inside SQS instance "producer" to receive and rearrange messages into a dataframe and save as csv file into S3 bucket "ids706"

producerText: triggered by CloudWatch presumably every 10 - 20 mins to scrape the specific Bitcoin community, merge into DynamoDB to ensure uniqueness, and then send to SQS instance "sentiment".

consumerPrice: triggered by EventWatch inside SQS instance "sentiment" to receive and rearrange messages into a dataframe, interact with AWS Comprehend, and save the final sentiment analyses as .csv file into S3 bucket "ids706"

Part 2: SageMaker

Import accumulated datasets from a certain period of web scraping and train a linear time series model to predict Bitcoin price and LDA model to cluster community review.

Complete Basic Work Flow

workflow

First, producer lambda is trigger by CloudWatch and will scrape Bitcoin price once a day and community review every 10-20 mins from specific websites and update the corresponding DynamoDB tables and S3 buckets.

Next, producer lambda function extracts data from DynamoDB and sends messages to SQS.

Afterwards, consumer lambda function is triggered by the EventWatch and transforms messages from SQS queues.

Then, interact with AWS Comprehend to gain sentiment analysis.

Lastly, connect with AWS S3 bucket and export data as .csv into it.

Train the SageMaker Machine Learning model with accumulated data and gain prediction and cluster.

Tools and Services involved

The following aws services are included:

  1. CloudWatch
  2. EventWatch
  3. Lambda
  4. DynamoDB
  5. SQS
  6. Comprehend
  7. SageMaker
  8. S3

Note

To determine the rate for CloudWatch and save unnecessary computation power, a careful examination on the community/webiste vitality is required. The main metric is the average time age between two reviews or comments.

Ideally, keep the CloudWatch for producer lambda function enabled to accumulate datasets. However, if not training models with the datasets, keep the EventWatch disabled to save storage and computing power.

Website URL links

Bitcoin price:

https://coincodex.com/crypto/bitcoin/historical-data/

https://coinmarketcap.com/currencies/bitcoin/historical-data/

US Dollar Index:

https://www.marketwatch.com/investing/index/dxy

Standard Poor 500 Index:

https://www.marketwatch.com/investing/index/spx

Ethereum Price:

https://coinmarketcap.com/currencies/ethereum/historical-data/

Bitcoin Community Reviews:

https://www.trustpilot.com/review/bitcoin.com

https://www.investing.com/crypto/bitcoin/chat

bitcoin's People

Contributors

yichenghuang980 avatar

Stargazers

 avatar

Watchers

 avatar

Forkers

parshkov

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.