- Import dataset ✔️
- Data preprocessing ✔️
- Create `X` and `y` ✔️
- Replace datetimes with strings ✔️
- Replace ranges with mid point ✔️
- Handle missing values ✔️
- Encode categorical variables ✔️
- LabelEncoder ✔️
- OneHotEncoder ✔️
- Avoid the dummy variable trap ✔️
- Feature scaling ✔️
- Split into `training` and `test` sets ✔️
- Create
- Build Artificial Neural Network ✔️
- Import Keras ✔️
- Initialise the ANN ✔️
- Add the input layer ✔️
- Add the first hidden layer ✔️
- Add the second hidden layer ✔️
- Add the output layer ✔️
- Compile the ANN ✔️
- Fit the ANN to the training set ✔️
- Evaluate with k-Fold Cross Validation ✔️
- k-fold cross validation ✔️
- Present results ✔️
- Predict unseen data ✔️
- Predict the test results ✔️
- Make the confusion matrix ✔️
- Present the confusion matrix ✔️
- Improve model ✔️
- Improve code ✔️
- Tasks as functions ✔️
- Comments and notes ✔️
- Strategy explained ✔️
- Run on Colab ✔️
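
Two of the preprocessing steps in the checklist, replacing ranges with their mid point and avoiding the dummy variable trap, can be sketched in plain Python. This is only an illustration: the function names and sample values are assumptions made for the example, and the project itself may well rely on pandas and scikit-learn's `LabelEncoder` and `OneHotEncoder` for these steps.

```python
def range_to_midpoint(value):
    """Convert a range string such as '30-39' to its mid point (34.5)."""
    low, high = value.split("-")
    return (float(low) + float(high)) / 2


def one_hot_drop_first(values):
    """One-hot encode a list of categories, dropping the first category
    (in sorted order) so the remaining columns are not collinear."""
    categories = sorted(set(values))
    kept = categories[1:]  # dropping one column avoids the dummy variable trap
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


# Hypothetical sample values, not taken from the spreadsheet.
ages = ["30-39", "40-49", "50-59"]
midpoints = [range_to_midpoint(a) for a in ages]

quadrants = ["left_low", "right_up", "central", "left_low"]
columns, encoded = one_hot_drop_first(quadrants)
```

A row whose category was dropped (here `central`) is encoded as all zeros, so no information is lost by removing that column.
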
- Using Python, create the best-performing neural network algorithm you can to predict recurrence rates of breast cancer based upon the variables provided in the attached breast cancer spreadsheet.
- Document:
- the settings you tested (and rationale for the strategy you took) along the way to optimal performance.
- a screenshot of you using the trained algorithm to make a prediction on an unseen piece of data.
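
When a prediction on unseen data is presented, the network's output has to be turned into a class label, and test-set predictions are usually summarised in a confusion matrix. The sketch below shows both steps in plain Python. It assumes a single sigmoid output, a 0.5 threshold, and the label `1` meaning recurrence. The probabilities are hypothetical placeholders for what a trained model would return, not real results.

```python
def to_label(probability, threshold=0.5):
    """Turn a predicted probability into a class label (1 = recurrence)."""
    return 1 if probability > threshold else 0


def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels 0 and 1."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1  # rows: actual, columns: predicted
    return matrix


# Hypothetical probabilities standing in for a trained model's output.
probabilities = [0.91, 0.12, 0.47, 0.66]
y_true = [1, 0, 1, 0]

y_pred = [to_label(p) for p in probabilities]
cm = confusion_matrix(y_true, y_pred)
accuracy = (cm[0][0] + cm[1][1]) / len(y_true)
```

The row and column layout follows the same convention as scikit-learn's `confusion_matrix`, so the two can be compared directly.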