Built a CNN-based model to identify metastatic cancer in small image patches of tissue. The model can be used for early detection of cancerous cells in histopathology images, which can make treating cancer more efficient and easier.

The database can be found at: https://www.kaggle.com/c/histopathologic-cancer-detection

The database is extremely large (200k+ images) and the computational load is very high, so I suggest running the code on a GPU and first doing test runs on a relatively smaller database, for example by deleting some images, while you resolve errors. Once all the errors have been sorted out and the code works on the smaller database, try running it on the original database.

The code contains specific functions for removing outlier images and for dividing the images into train, test and val folders. I did this as a personal project rather than as a competition entry, and the test set provided by Kaggle is only useful when submitting to the competition. I therefore created the validation and test sets from the original training set and used them to measure accuracy.

I used transfer learning (VGG16) to get a high accuracy.

#ML Project
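The train/val/test division and the "smaller database first" advice can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `split_dataset`, the 80/10/10 ratios, the `limit` parameter and the `*.tif` pattern are all assumptions made for the example. It copies a random subset instead of deleting images, so the original database stays intact, and it ignores class labels for brevity.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src_dir, dst_dir, ratios=(0.8, 0.1, 0.1), seed=42,
                  limit=None, pattern="*.tif"):
    """Copy images from src_dir into dst_dir/train, dst_dir/val, dst_dir/test.

    Hypothetical helper: names, ratios and file pattern are assumptions.
    `limit` keeps only that many images, for quick test runs.
    """
    files = sorted(Path(src_dir).glob(pattern))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(files)
    if limit is not None:
        files = files[:limit]

    n_train = int(len(files) * ratios[0])
    n_val = int(len(files) * ratios[1])
    splits = {
        "train": files[:n_train],
        "val": files[n_train:n_train + n_val],
        "test": files[n_train + n_val:],  # whatever remains
    }

    for name, items in splits.items():
        out = Path(dst_dir) / name
        out.mkdir(parents=True, exist_ok=True)
        for f in items:
            shutil.copy2(f, out / f.name)

    # Return the number of images placed in each split.
    return {name: len(items) for name, items in splits.items()}
```

A first debugging pass might call `split_dataset("train", "data_small", limit=2000)`, and the full run would drop the `limit` argument. The directory names here are placeholders.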
aniketvashishtha / cancer_det_cnn
Created a cancer classification CNN model that classifies histopathological cancer tissue using transfer learning.
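For readers unfamiliar with transfer learning on VGG16, a binary classifier of this kind might look roughly like the sketch below, written with the Keras API in TensorFlow. It is not the repository's actual model: the function name `build_model`, the 96x96 RGB input shape, the size of the dense head, the dropout rate and the optimizer are all assumptions chosen for illustration.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16


def build_model(input_shape=(96, 96, 3), weights="imagenet"):
    """Hypothetical VGG16 transfer-learning classifier (cancer / no cancer)."""
    # Convolutional base pre-trained on ImageNet, without its classifier head.
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    # Freeze the base so only the new head is trained at first.
    base.trainable = False

    model = models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(256, activation="relu"),   # assumed head size
        layers.Dropout(0.5),                    # assumed dropout rate
        layers.Dense(1, activation="sigmoid"),  # probability of the positive class
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Freezing the base keeps the pre-trained features intact while the small head is fitted. A common follow-up, not shown here, is to unfreeze the last few convolutional layers and continue training with a low learning rate.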