Machine Learning Algorithm for Options Trading

Screen Shot 2021-07-07 at 6 32 48 PM

"In 2018, the Chicago Board Options Exchange reported that over $1 quadrillion worth of options were traded in the US. "

📌 In this project, we assumed the role of a quantitative analyst at a FinTech investing platform. The platform aims to offer investors a sophisticated options trading mechanism. By using machine learning to evaluate our trading algorithm, written in Python, we strive to reduce uncertainty and the human factor and to automate options investment decisions.

Table of contents 📔

An executive summary

"For our second project we decided to take the challenging but most rewarding approach: to work on our passion project."

How does this project relate to fintech and machine learning?

We use Python, machine learning, and neural networks to build an algorithmic trading bot. Specifically, we explore stock options trading in depth.

"Option contracts are a financial derivative that represents the right, but not the obligation, to buy (call) or sell (put) a particular security before expiration date... Because of the huge diversity of historical options data and their relatively few applications, no open source dataset exists."

Software Version Control

  • The repository ML_Algo_Options_Trading was created on GitHub.

  • Our team committed files to the repository frequently.

  • Each commit includes a message with an appropriate level of detail. The Jupyter Notebook also contains well-commented code, with analyses and results explained for users without a technical background.

  • The repository is organized and includes relevant information about the project files.

Data Collection and Preparation

  • We made API calls to multiple sources and built several datasets using Python and the following Python libraries.

Screen Shot 2021-07-15 at 10 57 50 AM

Libraries

  • Pandas is a software library designed for data analytics that makes it easier to work with data from practically any type of file. Pandas supplies powerful tools for working with time data in particular, and time is a key aspect of financial analysis. Analysts typically compare and measure financial assets, from single stocks to large portfolios, across time.
  • With the combination of Pandas and Jupyter Notebook, you can efficiently import, prepare, and analyze data of any type or quantity.
  • We created a Google Colab group to take advantage of the speed and convenience of a collaborative cloud environment.
  • The following libraries were used to analyze the data:

Screen Shot 2021-07-14 at 7 31 39 PM

🆕 We used the following new libraries, which were not covered in the Bootcamp course:

Screen Shot 2021-07-13 at 6 53 33 PM

〰️ yfinance is a popular open source library developed by Ran Aroussi as a means to access the financial data available on Yahoo Finance. Yahoo Finance offers a range of market data on stocks, bonds, etc. It also offers market news, reports, and analysis, as well as options and fundamentals data, which sets it apart from some of its competitors.

〰️ FinTA (Financial Technical Analysis) supports over 80 trading indicators.

〰️ sentiment-investor provides APIs for stock social media data: granular data on what people are saying about stocks and cryptocurrencies on various social platforms.

Machine Learning

  • Jupyter Notebook and Google Colab. We used the group work feature in Google Colab to work on the code simultaneously and make changes in real time.

  • We created 2 machine learning models:

    SVC model (Support Vector Classification)

" Support-vector machines are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis"

Screen Shot 2021-07-15 at 2 22 19 PM

    LR model (Multinomial Logistic Regression), a new machine learning model that the class had not already covered

"Multinomial Logistic Regression is a statistical test used to predict a single categorical variable using one or more other variables. It also is used to determine the numerical relationship between such sets of variables."

Screen Shot 2021-07-15 at 11 45 53 AM
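To make the definition above more concrete, here is a minimal sketch of how a multinomial logistic regression turns feature values into class probabilities: each class gets a linear score, and a softmax converts the scores into probabilities. The labels, weights, and feature values are hand-picked, hypothetical numbers for illustration only; they are not the parameters our notebook learned.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(features, weights, biases, labels):
    """Score each class as w . x + b, then return the most probable label."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


# Hypothetical parameters: one weight row and one bias per class.
labels = ["put", "hold", "call"]
weights = [[-1.0, -0.5], [0.0, 0.0], [1.0, 0.5]]
biases = [0.0, 0.1, 0.0]

label, probs = predict_class([0.8, 0.4], weights, biases, labels)
print(label, [round(p, 3) for p in probs])
```

In practice the weights are learned from the training data rather than set by hand; the sketch only shows the prediction step.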

  • Both models were fit to the training data.

Screen Shot 2021-07-15 at 11 34 42 AM

  • The trained models were evaluated on the testing data. The calculations, metrics, and visualizations needed to evaluate performance are included.

Screen Shot 2021-07-15 at 11 32 24 AM
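As a rough illustration of the kind of metrics used to compare classifiers on a test set, the sketch below computes accuracy, precision, and recall from lists of true and predicted signals. The label lists are made-up toy values, not our results, and the helper names are our own for this example.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


def precision_recall(y_true, y_pred, label):
    """Precision and recall for one class, from (true, predicted) pair counts."""
    counts = Counter(zip(y_true, y_pred))
    true_pos = counts[(label, label)]
    predicted = sum(n for (_, p), n in counts.items() if p == label)
    actual = sum(n for (t, _), n in counts.items() if t == label)
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Toy labels: 1 = buy signal, -1 = sell signal (illustrative only).
y_test = [1, -1, 1, 1, -1, -1, 1, -1]
pred_a = [1, 1, 1, -1, -1, 1, 1, -1]
pred_b = [1, -1, 1, 1, -1, 1, 1, -1]

for name, pred in (("model A", pred_a), ("model B", pred_b)):
    print(name, accuracy(y_test, pred), precision_recall(y_test, pred, 1))
```

Whichever model scores better on held-out data by these measures is the natural candidate for generating live signals.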

  • Predictions were generated on a sample of new data and compared between the two models.

Screen Shot 2021-07-15 at 11 35 17 AM

  • We created a paper trading account on TD Ameritrade and chose the Multinomial Logistic Regression model (lr_predicted_signals), because it performed better, to decide between a call and a put.

Screen Shot 2021-07-14 at 8 09 19 PM

Documentation

  • The code in the Jupyter Notebook is well commented with concise, relevant notes.

  • The GitHub README.md file includes a concise project overview. We followed the technical requirements step by step for the grading team's convenience.

  • The GitHub README.md file includes detailed usage and installation instructions (see How to install).

  • The GitHub README.md file includes examples of the application and the results, with projected revenue.

Presentation

Screen Shot 2021-07-15 at 12 48 03 PM

Project overview

Screen Shot 2021-07-15 at 10 55 06 AM

Work with data

  • Source of data

    The main difficulty we encountered was finding the data. The original source we planned to use, Alpaca, worked for some data, but we needed to find alternative sources for the rest. Some of the ones we tried, like QuantConnect, didn't work. After trial and error we ended up using the following libraries:

Screen Shot 2021-07-15 at 11 50 09 AM

  • Collection, cleanup, and preparation process: Some sources return data in JSON format, which has a different frame structure. Our solution was to create new data frames, change indexes, and concatenate or join some data frames together. The following steps were needed:

  • make an API call

  • change indexes

  • drop null values and unused columns

  • create new columns

  • concatenate

It took a lot of time and some creative solutions, but we were able to do it.

Screen Shot 2021-07-15 at 1 01 55 PM

The following concatenated dataframe includes all indicators, as well as opening and closing prices.

Screen Shot 2021-07-15 at 12 56 41 PM
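The preparation steps listed above can be sketched in pandas roughly as follows. This is a minimal example under stated assumptions: the two record lists are hypothetical stand-ins for JSON API responses, and the column names (open, close, unused, sentiment, daily_change) are invented for the illustration rather than taken from our notebook.

```python
import pandas as pd

# Hypothetical JSON-style payloads standing in for two API responses.
raw_prices = [
    {"date": "2021-07-01", "open": 100.0, "close": 101.5, "unused": "x"},
    {"date": "2021-07-02", "open": 101.5, "close": None, "unused": "y"},
    {"date": "2021-07-06", "open": 102.0, "close": 103.0, "unused": "z"},
]
raw_sentiment = [
    {"date": "2021-07-01", "sentiment": 0.2},
    {"date": "2021-07-06", "sentiment": -0.1},
]


def to_frame(records):
    """Build a DataFrame and change its index to a datetime index."""
    frame = pd.DataFrame(records)
    frame["date"] = pd.to_datetime(frame["date"])
    return frame.set_index("date")


# Drop the unused column and any rows with null values.
prices = to_frame(raw_prices).drop(columns=["unused"]).dropna()

# Create a new column from existing ones.
prices["daily_change"] = prices["close"] - prices["open"]

# Concatenate side by side, keeping only dates present in both sources.
sentiment = to_frame(raw_sentiment)
combined = pd.concat([prices, sentiment], axis=1, join="inner")
print(combined)
```

Using `join="inner"` keeps only the dates both sources share; an outer join followed by a fill strategy is another reasonable choice when the sources cover different days.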

Technical Analysis

  • We used yfinance to get price and volume (ohlcav) data from Yahoo Finance for a period of time. Choosing the right set of indicators and incorporating them into code was another challenge.

"Technical Indicator is essentially a mathematical representation based on data sets to forecast price trends." (Investopedia)

Screen Shot 2021-07-15 at 2 08 13 PM

  • Using the indicators, we added a signal column that indicates when we should buy a call or sell our call.
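A signal column of this kind can be sketched with a simple moving-average crossover. This is an illustration only: the indicator choice, the window lengths, and the function name add_sma_signal are assumptions made for the example, and the project itself computed its indicators with FinTA.

```python
import pandas as pd


def add_sma_signal(frame, fast=2, slow=3):
    """Add fast/slow moving averages and a signal column (1, -1, or 0)."""
    out = frame.copy()
    out["sma_fast"] = out["close"].rolling(window=fast).mean()
    out["sma_slow"] = out["close"].rolling(window=slow).mean()
    out["signal"] = 0  # 0 = no action (also used while the averages warm up)
    out.loc[out["sma_fast"] > out["sma_slow"], "signal"] = 1   # buy a call
    out.loc[out["sma_fast"] < out["sma_slow"], "signal"] = -1  # sell the call
    return out


# Toy closing prices, not market data.
prices = pd.DataFrame({"close": [10.0, 11.0, 12.0, 11.0, 9.0, 8.0]})
signals = add_sma_signal(prices)
print(signals["signal"].tolist())  # [0, 0, 1, 1, -1, -1]
```

The resulting signal column can then serve as the target that the classification models learn to predict.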

Sentiment Analysis

Another challenge was converting the sentiment analysis data into a single data frame and working through that data.

Screen Shot 2021-07-15 at 12 23 45 PM

Combined Strategy

Screen Shot 2021-07-15 at 12 33 44 PM

Machine Learning

Screen Shot 2021-07-15 at 2 47 06 PM

Working on issues

" Discuss any unanticipated insights or problems that arose and how you resolved them."

  1. Finding data.
  2. Unifying the data.
  3. Difficulty in pricing options. It is very hard to put into code a process for forecasting option contract prices. Our solution was to change the strategy: instead, we forecast stock prices. Because the stock price correlates directly with the change in option contract prices, we can use the machine learning strategies we developed to decide whether to buy or sell.

Next steps

  • Potential next steps for the project would be:

    • Add more indicators or try a different set of indicators
    • Try the strategy on different stocks
    • Incorporate a deep learning network
  • If we had more time, we would continue our research in the following areas:

    • Build a bot or some other interface to place orders automatically or execute buy/sell decisions in real time
    • Link real accounts to the algorithm and actually start executing our strategy
    • Explore more complex trading strategies, like the Iron Condor

"An iron condor is an options strategy consisting of two puts (one long and one short) and two calls (one long and one short), and four strike prices, all with the same expiration date. The iron condor earns the maximum profit when the underlying asset closes between the middle strike prices at expiration. In other words, the goal is to profit from low volatility in the underlying asset."
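The payoff described in the quote can be checked with a short calculation. The sketch below computes profit or loss per share at expiration for a short iron condor; the strike prices and the net credit are hypothetical values, and fees, the contract multiplier, and early exercise are ignored.

```python
def call_payoff(price, strike):
    """Value of a call at expiration."""
    return max(price - strike, 0.0)


def put_payoff(price, strike):
    """Value of a put at expiration."""
    return max(strike - price, 0.0)


def iron_condor_pnl(price, long_put, short_put, short_call, long_call, net_credit):
    """Profit or loss per share at expiration for a short iron condor."""
    legs = (
        put_payoff(price, long_put)        # long put, lowest strike
        - put_payoff(price, short_put)     # short put
        - call_payoff(price, short_call)   # short call
        + call_payoff(price, long_call)    # long call, highest strike
    )
    return legs + net_credit


# Hypothetical strikes and premium collected when opening the position.
strikes = dict(long_put=90.0, short_put=95.0, short_call=105.0, long_call=110.0)
for price in (85.0, 100.0, 115.0):
    print(price, iron_condor_pnl(price, net_credit=2.0, **strikes))
```

With these numbers the position keeps the full 2.0 credit when the underlying closes between 95 and 105, and the loss is capped at 3.0 per share beyond the outer strikes.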

How to install

Repository

  • Clone the remote repo from GitHub to your computer (Desktop). In Terminal, type:

    cd desktop

    git clone https://github.com/

    You can now find the repo on your desktop.

  • Open Jupyter Lab. In Terminal, type:

    jupyter lab

  • In Jupyter Lab, open the saved repo folder.
  • Choose the [ .ipynb ] file to see the analysis report.

Alpaca API call

To run the file successfully, you have to generate your own Alpaca key and save it to a .env file. These files are hidden, so on a Mac hold down the Command, Shift, and Period keys to show them and confirm that the .env file is in the same folder as the Jupyter Lab notebook.

 cmd + shift + [.]

To generate an Alpaca key, create your own account and request a new key. Save it in the .env file, and make sure to name the variables ALPACA_API_KEY and ALPACA_SECRET_KEY:


ALPACA_API_KEY = '<your key>'
ALPACA_SECRET_KEY = '<your key>'

These steps are necessary to keep private information secure.
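For reference, here is a minimal sketch of how a notebook could read these two variables. It uses only the standard library so it runs without extra packages; the python-dotenv package (`from dotenv import load_dotenv`) is a common alternative. The helper names load_env_file and get_alpaca_credentials are our own for this example, not part of any Alpaca API.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY = 'value' lines from a .env file into os.environ."""
    env_path = Path(path)
    if not env_path.exists():
        return False
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, value = line.split("=", 1)
        # Strip surrounding whitespace and quotes; keep already-set variables.
        os.environ.setdefault(key.strip(), value.strip().strip("'\""))
    return True


def get_alpaca_credentials():
    """Return (api_key, secret_key) or fail with a clear message."""
    api_key = os.getenv("ALPACA_API_KEY")
    secret_key = os.getenv("ALPACA_SECRET_KEY")
    if not api_key or not secret_key:
        raise RuntimeError("Set ALPACA_API_KEY and ALPACA_SECRET_KEY in .env")
    return api_key, secret_key


# Typical use at the top of a notebook:
#   load_env_file()
#   api_key, secret_key = get_alpaca_credentials()
```

Keeping the keys in .env, and keeping .env out of version control, means the credentials never appear in the notebook or in the repository history.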

Team

📩 Natalia Burrey 📩 Jonah Leggett 📩 Miguel Ortega 📩 Samuel Yang

Screen Shot 2021-07-15 at 10 52 49 AM

License

⭐ MIT LICENSE
