This project expands on project two by adding 'rules of thumb' that aggregate trading signals, applying additional machine learning techniques, and attempting several backtesting approaches.
The code begins with a function that retrieves historical financial data from the Yahoo Finance API using the yfinance Python wrapper. It then constructs the rules of thumb, starting by checking whether each indicator was correct. It uses np.where to mark a correct positive signal, which populates the column with 1s and 0s, and a line of iloc code then sets a 1 wherever the indicator successfully predicted a negative movement. The code takes the rolling sum of that column to count how many times the indicator was correct in a given time period, and uses idxmax to find the name of the indicator with the most correct calls in that period. It then creates a new column holding the value of the indicator that has been correct most often over the period.
The other rules of thumb fire when trading indicators agree with each other. I started by checking when more than half of the indicators agreed, but that almost never happened, so the three rules of thumb are: any two different indicators agree, two momentum indicators agree, or two volume indicators agree. The function returns a new dataframe that contains only the trading indicators and whether the daily return is positive or negative.
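The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`rsi_signal`, `macd_signal`, `obv_signal`, `daily_return`), the +1/-1 signal encoding, the 3-row window, and the choice to compare each signal with the return on the same row are all assumptions made for the example. It also uses a boolean `.loc` mask where the project uses iloc.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): daily returns plus three indicator signals,
# encoded as +1 (predicts up) and -1 (predicts down).
df = pd.DataFrame(
    {
        "daily_return": [0.01, -0.02, 0.03, -0.01, 0.02, 0.01],
        "rsi_signal": [1, -1, 1, 1, 1, -1],
        "macd_signal": [1, 1, -1, -1, 1, 1],
        "obv_signal": [-1, -1, 1, -1, -1, 1],
    }
)
signals = ["rsi_signal", "macd_signal", "obv_signal"]

# Step 1: mark each row where an indicator was correct.
correct = pd.DataFrame(index=df.index)
for col in signals:
    hit_up = (df[col] == 1) & (df["daily_return"] > 0)
    hit_down = (df[col] == -1) & (df["daily_return"] < 0)
    correct[col] = np.where(hit_up, 1, 0)   # correct positive calls
    correct.loc[hit_down, col] = 1          # add correct negative calls

# Step 2: count correct calls over a rolling window (3 rows here).
# Rows without a full window are dropped so idxmax never sees all-NaN rows.
rolling_hits = correct.rolling(window=3).sum().dropna()

# Step 3: name of the indicator with the most correct calls per row.
# Ties go to the first column listed in `signals`.
best = rolling_hits.idxmax(axis=1)
df["best_indicator"] = best

# Step 4: look up the current value of whichever indicator is best.
row_pos = df.index.get_indexer(best.index)
col_pos = pd.Index(signals).get_indexer(best)
values = df[signals].to_numpy()[row_pos, col_pos]
df["best_signal"] = pd.Series(values, index=best.index)

# Agreement rule: emit the shared signal when two momentum indicators agree,
# otherwise 0 (treating RSI and MACD as the momentum pair is an assumption).
df["momentum_agree"] = np.where(
    df["rsi_signal"] == df["macd_signal"], df["rsi_signal"], 0
)

print(df[["best_indicator", "best_signal", "momentum_agree"]])
```

The first rows of `best_indicator` and `best_signal` are empty because the rolling window is not yet full. The volume and "any two indicators" rules would follow the same `np.where` pattern with different column pairs.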
Three functions then pass the data to a random forest model, a random forest model with XGBoost from the XGBoost library, and the linear regression model from scikit-learn.
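As a rough sketch of what passing the data to these models might look like, the example below fits a scikit-learn random forest and a linear regression on the same features. Everything here is assumed for illustration: the feature names, the random toy data, the 80/20 chronological split, the helper `fit_and_score`, and the 0.5 cutoff used to turn the regression output into an up/down call. An XGBoost model such as `xgboost.XGBClassifier` exposes the same fit/predict interface and could be passed to the same helper, though which XGBoost class the project uses is not stated.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression

# Toy feature table (hypothetical): rule-of-thumb signals and an up/down label.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame(
    {
        "best_signal": rng.choice([-1, 1], size=n),
        "momentum_agree": rng.choice([-1, 0, 1], size=n),
        "volume_agree": rng.choice([-1, 0, 1], size=n),
    }
)
y = pd.Series(rng.choice([0, 1], size=n), name="return_positive")

# Chronological split: train on the first 80% of rows, test on the rest,
# so the model is never evaluated on dates earlier than its training data.
split = int(n * 0.8)
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = y.iloc[:split], y.iloc[split:]


def fit_and_score(model, threshold=None):
    """Fit a model and return its accuracy on the held-out rows.

    If `threshold` is given, continuous predictions (for example from a
    linear regression) are converted to 0/1 before scoring.
    """
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    if threshold is not None:
        pred = (pred > threshold).astype(int)
    return float((pred == y_test.to_numpy()).mean())


rf_acc = fit_and_score(RandomForestClassifier(n_estimators=100, random_state=0))
lr_acc = fit_and_score(LinearRegression(), threshold=0.5)

print(f"random forest accuracy:     {rf_acc:.2f}")
print(f"linear regression accuracy: {lr_acc:.2f}")
```

Because the toy labels are random, the accuracies here carry no meaning; the point is only the shape of the fit, predict, and score steps.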
Lastly, there are several sections of unsuccessful code. There is an attempt to use the cross-validation predictor from scikit-learn, followed by three attempts to backtest the strategies. The first uses a package called backtesting, which successfully evaluated the random forest model but not the others. (The example code uses a different ML strategy.) I then tried to feed the trading signals I had already generated into the backtesting package. I could not get the package to read the data with an additional column, so I renamed the trading signals to Volume, a column the package expects but that the strategy would not otherwise use. The package then read the data, but raised a new error that I was unable to track down. These attempts are included as additional notebooks. Finally, I attempted to use the backtesting code from the previous project, but the way the functions returned the data created problems. I will keep trying to get backtesting to work; if that doesn't work, I can simply have the functions return the simpler backtest from project two.
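The Volume workaround described above might look something like the following. This is only a sketch of the data preparation step, using pandas alone: the helper name `with_signal_as_volume`, the toy prices, and the 1/0/-1 signal encoding are assumptions, and the code does not call the backtesting package or reproduce the error mentioned above.

```python
import pandas as pd

# Toy OHLCV prices (hypothetical values) on business days.
prices = pd.DataFrame(
    {
        "Open": [100.0, 101.0, 102.0, 101.5, 103.0],
        "High": [101.5, 102.5, 103.0, 103.5, 104.0],
        "Low": [99.5, 100.5, 101.0, 101.0, 102.5],
        "Close": [101.0, 102.0, 101.5, 103.0, 103.5],
        "Volume": [1000, 1200, 900, 1100, 1300],
    },
    index=pd.date_range("2023-01-02", periods=5, freq="B"),
)

# Precomputed trading signal (assumed encoding: 1 buy, -1 sell, 0 hold).
signal = pd.Series([1, 0, -1, 1, 0], index=prices.index, name="signal")


def with_signal_as_volume(ohlcv, signal):
    """Return an OHLC frame whose Volume column carries the trading signal.

    The real volume is discarded, so this only makes sense when the
    strategy never needs it.
    """
    out = ohlcv[["Open", "High", "Low", "Close"]].copy()
    # Align on dates; dates with no signal are treated as "hold" (0).
    out["Volume"] = signal.reindex(out.index).fillna(0)
    return out


bt_data = with_signal_as_volume(prices, signal)
print(bt_data)

# Inside a backtesting Strategy, the latest signal would then be read
# from the Volume column, e.g. self.data.Volume[-1] in next().
```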