Tickerrain

TickerRain is an open-source web app that stores and analyzes Reddit posts in a transparent and semi-interactive manner.

Overview

A simple webpage displays the sentiment analysis and entities of the last post processed, followed by DB info and three graphs of the tickers most mentioned on Reddit.

Web server

The graphs are updated every 120 seconds, and refreshing the page displays the analysis of a new post.

Requirements

Python 3 and the following packages:

  • pandas
  • flask
  • redis
  • cairosvg
  • nltk
  • spacy
  • matplotlib
  • asyncpraw
  • cachetools

You also need Cairo. On Ubuntu, for example, run apt-get install libpangocairo-1.0-0.

Running

First, make sure you have a Redis DB running.
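One way to confirm that Redis is reachable before starting the other processes is sketched below. It assumes the default host and port (localhost:6379) and no password, neither of which is stated in this README, and it uses only the standard library to send an inline PING. The function name redis_is_reachable is hypothetical.

```python
import socket


def redis_is_reachable(host="localhost", port=6379, timeout=1.0):
    """Return True if a server at host:port answers an inline PING with +PONG.

    Assumption: Redis listens on the default port and requires no password.
    A server that demands authentication replies with an error instead of
    +PONG, so this function returns False in that case.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            conn.sendall(b"PING\r\n")  # Redis accepts inline commands
            return conn.recv(16).startswith(b"+PONG")
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("Redis reachable:", redis_is_reachable())
```

If the redis package from the requirements is installed, calling ping() on a redis.Redis client performs the same check through the client library.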

In the file substoscrap.txt specify what subreddits to analyze.
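The exact layout of substoscrap.txt is not described here. The sketch below assumes one subreddit name per line, tolerates blank lines and an optional r/ prefix, and drops duplicates. The helper load_subreddits is hypothetical and not part of the project.

```python
from pathlib import Path


def load_subreddits(path="substoscrap.txt"):
    """Read subreddit names from a text file.

    Assumption: one name per line. Blank lines are skipped, a leading
    "r/" is stripped, and duplicates are dropped while keeping order.
    """
    names = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        name = line.strip()
        if name.lower().startswith("r/"):
            name = name[2:]
        if name and name not in names:
            names.append(name)
    return names
```

For example, a file containing wallstreetbets and r/stocks on separate lines would yield the list ["wallstreetbets", "stocks"].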

There are three parts: a process that fetches the submissions and stores them in the Redis DB, a process that analyzes them, and a web server that displays the results.

Getting Submissions

Run python news.py <client_id> <client_secrets>, passing your Reddit API account credentials as arguments; see more here.

This will start fetching posts, comments, and Redditors from Reddit and storing them in the Redis DB.
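As a rough illustration of this step (not the actual contents of news.py), the sketch below reads new submissions with asyncpraw in read-only mode and writes each one to a Redis hash. The key layout submission:<id>, the chosen fields, the user agent string, and the function names are all assumptions made for the example. db is expected to be a redis.Redis client or any object with a compatible hset method.

```python
import asyncio


def submission_to_mapping(submission):
    """Pick the fields to store; the selection is an assumption."""
    return {
        "title": submission.title,
        "selftext": submission.selftext,
        "score": submission.score,
        "created_utc": submission.created_utc,
    }


def store_submission(db, submission):
    """Write one submission to a hash under a hypothetical key layout."""
    key = f"submission:{submission.id}"
    db.hset(key, mapping=submission_to_mapping(submission))
    return key


async def fetch_and_store(db, client_id, client_secret, subreddits, limit=25):
    """Fetch the newest submissions from the given subreddits and store them."""
    # Imported here so the helpers above work without asyncpraw installed.
    import asyncpraw

    reddit = asyncpraw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent="tickerrain-sketch/0.1",  # hypothetical user agent
    )
    try:
        # Joining names with "+" requests several subreddits at once.
        subreddit = await reddit.subreddit("+".join(subreddits))
        async for submission in subreddit.new(limit=limit):
            # Note: a synchronous Redis client blocks the event loop briefly.
            store_submission(db, submission)
    finally:
        await reddit.close()


# Usage (needs real credentials, network access, and a running Redis):
#   import redis
#   asyncio.run(fetch_and_store(redis.Redis(), "<client_id>",
#                               "<client_secrets>", ["wallstreetbets"]))
```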

Processing Posts

Run python -m spacy download en_core_web_lg to get the spaCy model required for processing posts.

Run python process.py. This connects to the DB and recalculates the metrics every 120 seconds. The results are stored in three files named tickers_df_<days>.p.
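The recompute-and-save cycle described above can be sketched with the standard library alone. This is an illustration and not the code in process.py: it assumes the .p extension means a pickle file, and the day windows passed in are placeholders, since the actual values are not listed here. run_periodically and output_path are hypothetical names.

```python
import pickle
import time
from pathlib import Path


def output_path(days, directory="."):
    """Build the result file name for one time window."""
    return Path(directory) / f"tickers_df_{days}.p"


def run_periodically(compute, windows, interval=120.0, max_rounds=None,
                     directory=".", sleep=time.sleep):
    """Call compute(days) for every window, pickle each result, then wait.

    max_rounds=None loops forever; a number stops after that many rounds.
    Returns the number of completed rounds.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        started = time.monotonic()
        for days in windows:
            with open(output_path(days, directory), "wb") as fh:
                pickle.dump(compute(days), fh)
        rounds += 1
        if max_rounds is None or rounds < max_rounds:
            # Subtract the time spent computing so rounds start on schedule.
            sleep(max(0.0, interval - (time.monotonic() - started)))
    return rounds
```

A call such as run_periodically(compute_metrics, windows=(1, 3, 7)) would then refresh three files every 120 seconds, where compute_metrics and the window values stand in for the project's real ones.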

The metrics computed right now are:

  • Mentions -> Detects which ticker is being talked about and counts its total mentions.
  • Score -> Calculates the log score, which takes into account the upvotes and downvotes.
  • Sentiment -> Aggregates the general sentiment about the ticker using spaCy VADER sentiment analysis.
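To make the first two metrics concrete, here is a minimal sketch under stated assumptions. Ticker candidates are taken to be standalone runs of one to five capital letters that appear in a known ticker list, and the log score is taken to be the base-10 logarithm of the net votes with the sign preserved. The project's actual detection rules and formula may differ, and the sentiment metric is left out.

```python
import math
import re
from collections import Counter

# Assumption: a ticker candidate is a standalone run of 1-5 capital letters.
CANDIDATE_RE = re.compile(r"\b[A-Z]{1,5}\b")


def count_mentions(texts, known_tickers):
    """Count every occurrence of a known ticker across all texts."""
    known = set(known_tickers)
    counts = Counter()
    for text in texts:
        counts.update(tok for tok in CANDIDATE_RE.findall(text) if tok in known)
    return counts


def log_score(upvotes, downvotes):
    """Signed base-10 log of the net votes (an assumed formula)."""
    net = upvotes - downvotes
    if net == 0:
        return 0.0
    return math.copysign(math.log10(abs(net)), net)


if __name__ == "__main__":
    posts = ["GME to the moon, GME!", "I like AMD and $GME"]
    print(count_mentions(posts, {"GME", "AMD"}))
    print(log_score(100, 0), log_score(0, 10))
```

Filtering candidates against a known list keeps ordinary capitalized words such as "I" from being counted, at the cost of missing tickers absent from the list.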

Flask Web Server

Run python flask_example.py to start the web server that displays the results, DB info, and the last post processed. Access it by opening a browser and going to 127.0.0.1:5000.

Issues and TODO

Currently, the Pandas-based processing code needs to be optimized to use Pandas more efficiently. Ticker detection also needs improvement: it emits warnings and misses some tickers.

  • Improve ticker detection by combining it with spaCy entities.
  • Optimize Pandas processing.
  • Add more metrics.
  • Improve the design of the Web page.
  • Auto download of tickers.csv from NASDAQ.

Contributors

gonvas
