sustainable-investing

A project that uses open data and Python to help individual investors take a data-driven approach to ESG investing.

how the script works

This Python script takes two inputs:

  • The text you wish to query
  • The sector you wish to filter through

Both inputs are applied to the equities database provided by https://github.com/JerBouma/FinanceDatabase. The text query is matched against the description of each equity in the database, keeping only the equities whose description contains the keyword. The sector input then narrows these results to a single sector. For the full list of sectors, see https://github.com/JerBouma/FinanceDatabase/tree/master/Database/Equities/Sectors. The default country is the United States.
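The filtering step can be sketched in plain Python as follows. This is a minimal illustration, not the code in main.py: the function name `filter_equities`, the field names `summary`, `sector` and `country`, and the sample tickers are all assumptions made for the example.

```python
def filter_equities(equities, keyword, sector, country="United States"):
    """Return the tickers whose description mentions `keyword`.

    `equities` is assumed to map ticker -> dict with "summary", "sector"
    and "country" keys (hypothetical field names for this sketch).
    """
    needle = keyword.lower()
    return sorted(
        ticker
        for ticker, info in equities.items()
        if info.get("country") == country
        and info.get("sector") == sector
        and needle in (info.get("summary") or "").lower()
    )


# Made-up sample records, for illustration only.
sample_equities = {
    "AAA": {"summary": "Operates renewable energy plants.",
            "sector": "Utilities", "country": "United States"},
    "BBB": {"summary": "Explores for oil and gas.",
            "sector": "Energy", "country": "United States"},
    "CCC": {"summary": "Renewable power developer.",
            "sector": "Utilities", "country": "Canada"},
}

matches = filter_equities(sample_equities, "renewable", "Utilities")
```

The match is a case-insensitive substring test, so a query such as "renewable" also matches "Renewable" at the start of a description.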

US stocks are selected because the next part of the script passes the extracted ticker symbols to https://github.com/jadchaar/sec-edgar-downloader to fetch the latest 10-K reports filed by each company.

The script then converts the downloaded .htm files into text and applies tokenization to keep only the most important keywords (removing stop words, unnecessary words, entities, etc.).
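The conversion and tokenization step might look roughly like the sketch below. It uses only the Python standard library as a stand-in for the spaCy pipeline the script relies on, so it does not remove entities. The helper names and the tiny `STOP_WORDS` set are assumptions for illustration.

```python
import re
from html.parser import HTMLParser

# A deliberately small stop-word list (assumption); spaCy ships a fuller one.
STOP_WORDS = {"the", "and", "for", "that", "with", "this", "are", "was",
              "from", "our", "has", "have", "not"}


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Strip markup from an .htm document and normalise whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def tokenize(text, min_length=3):
    """Lower-case, keep alphabetic words, drop short words and stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if len(w) >= min_length and w not in STOP_WORDS]


sample_html = ("<html><head><style>p{color:red}</style></head><body>"
               "<p>The company invests in renewable energy &amp; storage.</p>"
               "</body></html>")
tokens = tokenize(html_to_text(sample_html))
```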

Finally, the script builds a Latent Dirichlet Allocation (LDA) model to perform topic modelling. The default number of topics for this unsupervised model is 10, chosen after a hyperparameter optimisation investigation in a separate notebook. This can be changed in main.py.
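To give a feel for what an LDA model does with tokenized documents, here is a small pure-Python collapsed Gibbs sampler. It is an illustrative stand-in, not the implementation used in main.py, which most likely relies on a dedicated library. The function name `fit_lda` and the `alpha`, `beta` and `iterations` defaults are assumptions.

```python
import random
from collections import defaultdict


def fit_lda(docs, num_topics=10, iterations=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    `docs` is a list of token lists. Returns one list of topic proportions
    per document (each list sums to 1).
    """
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * num_topics for _ in docs]                # topic counts per document
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                              # total words per topic

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        topics = []
        for word in doc:
            k = rng.randrange(num_topics)
            topics.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(topics)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1

    return [
        [(count + alpha) / (len(doc) + alpha * num_topics) for count in counts]
        for doc, counts in zip(docs, doc_topic)
    ]


# Toy corpus for illustration only.
toy_docs = [["solar", "wind", "solar", "grid"], ["oil", "gas", "oil", "drilling"]]
proportions = fit_lda(toy_docs, num_topics=2, iterations=20)
```

Each returned row is a document's estimated topic mixture; the index of its largest value is that document's dominant topic.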

After the model is applied, the dominant topic is assigned to each ticker in an aggregate dataframe, which is then saved as a .csv file in an output directory.
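The final aggregation can be sketched as below, assuming each ticker already has a list of topic weights. The function names, the column names and the sample weights are hypothetical, and the standard-library `csv` module stands in for the dataframe used by the script.

```python
import csv
from pathlib import Path


def dominant_topics(topic_weights):
    """Map each ticker to (index of its highest-weight topic, that weight)."""
    result = {}
    for ticker, weights in topic_weights.items():
        best = max(range(len(weights)), key=weights.__getitem__)
        result[ticker] = (best, weights[best])
    return result


def save_dominant_topics(topic_weights, output_path):
    """Write one row per ticker to a CSV file, creating the directory if needed."""
    output_path = Path(output_path)
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with output_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["ticker", "dominant_topic", "weight"])
        for ticker, (topic, weight) in sorted(dominant_topics(topic_weights).items()):
            writer.writerow([ticker, topic, f"{weight:.4f}"])
    return output_path


# Made-up weights for two hypothetical tickers.
sample_weights = {"AAA": [0.1, 0.7, 0.2], "BBB": [0.6, 0.3, 0.1]}
summary = dominant_topics(sample_weights)
```

Calling `save_dominant_topics(sample_weights, "output/dominant_topics.csv")` would write the header row followed by one line per ticker; the path is only an example.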

installation guide

Create a Python environment and run pip install -r requirements.txt to install the required packages.

To download the required language model, run the following in your terminal: python -m spacy download en_core_web_lg

For more information on the language models provided by spaCy, see https://spacy.io/models/en. You may, for example, wish to use a smaller model; if so, make sure to update main.py accordingly.

Contributors

ltw94
