Financial_News_Headlines_Sentiment_Analysis

This is a data mining project with three goals.

First, I want to test whether security characteristics derived from social media have significant power in explaining the time-series variation in daily returns. The Social Media Factor, the “sixth” factor, is distinct from the traditional five factors introduced by Fama and French.

Second, I predict excess returns using sentiment scores and the Fama-French factors, and identify which features explain the most variance.

Third, I want to explore how the extracted sentiment scores relate to, or affect, fluctuations in stock returns.
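The first goal can be read as a time-series regression of excess returns on six factors. The sketch below is only an illustration under stated assumptions, not the project's actual code: the data is synthetic, the label `SMF` for the social media factor is hypothetical, and plain OLS via NumPy stands in for whatever estimator the project uses. It compares the fit with and without the sixth factor.

```python
import numpy as np

# Synthetic data for illustration only. The first five names follow the
# Fama-French five-factor convention; "SMF" is a hypothetical label for the
# social media factor.
FACTORS = ["Mkt-RF", "SMB", "HML", "RMW", "CMA", "SMF"]


def fit_factor_model(excess_returns, factors):
    """OLS time-series regression of excess returns on factor returns.

    Returns (alpha, betas, r_squared).
    """
    # Prepend a column of ones so the first coefficient is the intercept.
    X = np.column_stack([np.ones(len(factors)), factors])
    coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
    fitted = X @ coef
    ss_res = np.sum((excess_returns - fitted) ** 2)
    ss_tot = np.sum((excess_returns - excess_returns.mean()) ** 2)
    return coef[0], coef[1:], 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
n_days = 500
factors = rng.normal(0.0, 0.01, size=(n_days, len(FACTORS)))
true_betas = np.array([1.0, 0.3, -0.2, 0.1, 0.05, 0.4])  # assumed loadings
noise = rng.normal(0.0, 0.001, size=n_days)
excess = 0.0002 + factors @ true_betas + noise

alpha, betas, r2 = fit_factor_model(excess, factors)
# Refit without the last column to see what the sixth factor adds.
r2_five = fit_factor_model(excess, factors[:, :5])[2]

for name, b in zip(FACTORS, betas):
    print(f"{name:>6}: {b:+.3f}")
print(f"R^2 six factors: {r2:.3f}, five factors: {r2_five:.3f}")
```

With real data, `factors` would hold the daily Fama-French factor returns plus a sentiment-based series, and the gap between the two R² values gives a rough sense of how much the sixth factor explains.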

In this project, I applied several data mining and machine learning techniques:

FinBERT and LSTM NLP models that convert text into sentiment scores
Fama-French model with social media as the sixth factor
PCA dimensionality reduction
Data wrangling, feature engineering, and model engineering
K-means clustering
Regression models: linear, SVR, decision tree, random forest, bagging, voting, gradient boosting, and AdaBoost
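
As one illustration of the PCA step listed above, the sketch below standardizes a feature matrix and projects it onto its leading principal components using NumPy's SVD. The feature matrix is synthetic and the function name `pca` is hypothetical; the project itself may rely on a library implementation instead.

```python
import numpy as np


def pca(features, n_components):
    """Project standardized features onto their top principal components.

    Returns (scores, explained_variance_ratio).
    """
    # Standardize each column so no feature dominates because of its scale.
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    return z @ vt[:n_components].T, ratio[:n_components]


rng = np.random.default_rng(1)
# Synthetic stand-in: six feature columns driven by two latent sources.
latent = rng.normal(size=(300, 2))
loadings = rng.normal(size=(2, 6))
features = latent @ loadings + 0.05 * rng.normal(size=(300, 6))

scores, ratio = pca(features, n_components=2)
print(scores.shape, ratio.round(3))
```

The reduced `scores` matrix could then feed the clustering or regression models in place of the original, correlated features.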

To access the full data source, please visit https://www.kaggle.com/priyapitre/ff-project and scroll to the bottom to download the CSV files and the pre-trained LSTM model.

elee190 / financial_news_headlines_sentiment_analysis (GitHub)