nyukat / gmic

An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization

Home Page: https://doi.org/10.1016/j.media.2020.101908

License: GNU Affero General Public License v3.0

Shell 0.55% Python 39.51% Jupyter Notebook 59.95%
breast-cancer medical-imaging deep-learning pytorch breast-cancer-diagnosis breast-cancer-screening

gmic's Introduction

An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization

Introduction

This is an implementation of the Globally-Aware Multiple Instance Classifier (GMIC) model as described in our paper. The architecture of the proposed model is shown below.

Highlights of GMIC:

  • High Accuracy: GMIC outperformed ResNet-34 and Faster R-CNN.
  • High Efficiency: Compared to ResNet-34, GMIC has 28.8% fewer parameters, uses 78.43% less GPU memory and is 4.1x faster during inference and 5.6x faster during training.
  • Weakly Supervised Lesion Localization: Despite being trained with only image-level labels indicating the presence of any benign or malignant lesion, GMIC is able to generate pixel-level saliency maps (shown below) that provide additional interpretability.

The implementation allows users to obtain breast cancer predictions and visualization of saliency maps by applying one of our pretrained models. We provide weights for 5 GMIC-ResNet-18 models. The model is implemented in PyTorch.

  • Input: A mammography image cropped to 2944 x 1920 and saved as a 16-bit png file. As a part of this repository, we provide 4 sample exams (in the sample_data/images directory, with the exam list stored in sample_data/exam_list_before_cropping.pkl), each of which includes 2 CC view images and 2 MLO view images. These exams contain original mammography images and therefore need to be preprocessed (see the Preprocessing section).

  • Output: The GMIC model generates one prediction for each image: the probabilities of benign and malignant findings. All predictions are saved into a csv file $OUTPUT_PATH/predictions.csv that contains the following columns: image_index, benign_pred, malignant_pred, benign_label, malignant_label (see the example snippet after this list). In addition, each input image is associated with a visualization file saved under $OUTPUT_PATH/visualization. An example visualization file is illustrated below. The images (from left to right) represent:

    • input mammography with ground truth annotation (green=benign, red=malignant),
    • patch map that illustrates the locations of ROI proposal patches (blue squares),
    • saliency map for benign class,
    • saliency map for malignant class,
    • 6 ROI proposal patches with the associated attention score on top.
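The snippet below is a minimal illustration, not part of this repository, of loading the predictions csv described above with pandas (a listed prerequisite). The file path assumes the default output location, sample_output/predictions.csv, and the threshold is arbitrary and only for demonstration.

import pandas as pd

# Load the predictions produced by run.sh (default $OUTPUT_PATH is sample_output).
predictions = pd.read_csv("sample_output/predictions.csv")

# Columns: image_index, benign_pred, malignant_pred, benign_label, malignant_label.
# List images whose predicted malignant probability exceeds an arbitrary threshold.
flagged = predictions[predictions["malignant_pred"] > 0.5]
print(flagged[["image_index", "malignant_pred", "malignant_label"]])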


Update (2021/03/08): Updated the documentation

Update (2020/12/15): Added the preprocessing pipeline.

Update (2020/12/16): Added the example notebook.

Prerequisites

  • Python (3.6)
  • PyTorch (1.1.0)
  • torchvision (0.2.2)
  • NumPy (1.14.3)
  • SciPy (1.0.0)
  • H5py (2.7.1)
  • imageio (2.4.1)
  • pandas (0.22.0)
  • opencv-python (3.4.2)
  • tqdm (4.19.8)
  • matplotlib (3.0.2)

License

This repository is licensed under the terms of the GNU AGPLv3 license.

How to run the code

You need to have conda installed in your environment. Before running the code, install the dependencies with pip install -r requirements.txt. Once you have installed all the dependencies, run.sh will automatically run the entire pipeline and save the prediction results in a csv file. Note that you need to first cd into the project directory and then execute . ./run.sh. When running the individual Python scripts, please include the path to this repository in your PYTHONPATH.

We recommend running the code with a GPU. To run the code with CPU only, please change DEVICE_TYPE in run.sh to 'cpu'.

The following variables defined in run.sh can be modified as needed:

  • MODEL_PATH: The path where the model weights are saved.
  • CROPPED_IMAGE_PATH: The directory where cropped mammograms are saved.
  • SEG_PATH: The directory where ground truth segmentations are saved.
  • EXAM_LIST_PATH: The path where the exam list is stored.
  • OUTPUT_PATH: The path where visualization files and predictions will be saved.
  • DEVICE_TYPE: Device type to use in heatmap generation and classifiers, either 'cpu' or 'gpu'.
  • GPU_NUMBER: Which GPU to use when multiple GPUs are available.
  • MODEL_INDEX: Which one of the five models to use. Valid values include {'1', '2', '3', '4', '5', 'ensemble'}.
  • visualization-flag: Whether to generate visualization.

You should obtain the following outputs for the sample exams provided in the repository (found in sample_output/predictions.csv by default).

image_index benign_pred malignant_pred benign_label malignant_label
0_L-CC 0.1356 0.0081 0 0
0_R-CC 0.8929 0.3259 1 0
0_L-MLO 0.2368 0.0335 0 0
0_R-MLO 0.9509 0.1812 1 0
1_L-CC 0.0546 0.0168 0 0
1_R-CC 0.5986 0.9910 0 1
1_L-MLO 0.0414 0.0139 0 0
1_R-MLO 0.5383 0.9308 0 1
2_L-CC 0.0678 0.0227 0 0
2_R-CC 0.1917 0.0603 1 0
2_L-MLO 0.1210 0.0093 0 0
2_R-MLO 0.2440 0.0231 1 0
3_L-CC 0.6295 0.9326 0 1
3_R-CC 0.2291 0.1603 0 0
3_L-MLO 0.6304 0.7496 0 1
3_R-MLO 0.0622 0.0507 0 0

Data

sample_data/images contains 4 exams, each of which includes the 4 original mammography images (L-CC, L-MLO, R-CC, R-MLO). All mammography images are saved in png format. The original 12-bit mammograms are saved as rescaled 16-bit images to preserve the granularity of the pixel intensities, while still being correctly displayed in image viewers.
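
As an illustration of such a rescaling, the sketch below linearly maps a 12-bit image (values in [0, 4095]) to the full 16-bit range using imageio and NumPy, both listed prerequisites. The exact rescaling used to produce the sample images is not specified here, and the file names are hypothetical.

import imageio
import numpy as np

# Hypothetical example: linearly rescale 12-bit pixel values to 16-bit
# so that standard image viewers display the mammogram correctly.
image_12bit = imageio.imread("raw_mammogram.png").astype(np.float64)
image_16bit = np.round(image_12bit / 4095.0 * 65535.0).astype(np.uint16)
imageio.imwrite("rescaled_mammogram.png", image_16bit)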

sample_data/segmentation contains the binary pixel-level segmentation labels for some exams. All segmentations are saved as png images.

sample_data/exam_list_before_cropping.pkl contains a list of exam information. Each exam is represented as a dictionary with the following format:

{'horizontal_flip': 'NO',
  'L-CC': ['0_L-CC'],
  'L-MLO': ['0_L-MLO'],
  'R-MLO': ['0_R-MLO'],
  'R-CC': ['0_R-CC'],
  'best_center': {'R-CC': [(1136.0, 158.0)],
   'R-MLO': [(1539.0, 252.0)],
   'L-MLO': [(1530.0, 307.0)],
   'L-CC': [(1156.0, 262.0)]},
  'cancer_label': {'benign': 1,
   'right_benign': 0,
   'malignant': 0,
   'left_benign': 1,
   'unknown': 0,
   'right_malignant': 0,
   'left_malignant': 0},
  'L-CC_benign_seg': ['0_L-CC_benign'],
  'L-CC_malignant_seg': ['0_L-CC_malignant'],
  'L-MLO_benign_seg': ['0_L-MLO_benign'],
  'L-MLO_malignant_seg': ['0_L-MLO_malignant'],
  'R-MLO_benign_seg': ['0_R-MLO_benign'],
  'R-MLO_malignant_seg': ['0_R-MLO_malignant'],
  'R-CC_benign_seg': ['0_R-CC_benign'],
  'R-CC_malignant_seg': ['0_R-CC_malignant']}
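
For reference, the exam list can be loaded and inspected with Python's pickle module, as in the minimal sketch below (not part of the repository's scripts); the fields accessed are the ones shown in the dictionary above.

import pickle

# Load the exam list shipped with the sample data.
with open("sample_data/exam_list_before_cropping.pkl", "rb") as f:
    exam_list = pickle.load(f)

first_exam = exam_list[0]
print(first_exam["horizontal_flip"])             # 'NO' or 'YES'
print(first_exam["L-CC"])                        # e.g. ['0_L-CC']
print(first_exam["cancer_label"]["malignant"])   # 0 or 1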

In their original formats, images from the L-CC and L-MLO views face right, and images from the R-CC and R-MLO views face left. We horizontally flipped the R-CC and R-MLO images so that all four views face right. The values for L-CC, R-CC, L-MLO, and R-MLO are lists of image filenames without extensions or directory names.

Preprocessing

Run the following commands to crop mammograms and calculate information about augmentation windows.

Crop mammograms

python3 src/cropping/crop_mammogram.py \
    --input-data-folder $DATA_FOLDER \
    --output-data-folder $CROPPED_IMAGE_PATH \
    --exam-list-path $INITIAL_EXAM_LIST_PATH  \
    --cropped-exam-list-path $CROPPED_EXAM_LIST_PATH  \
    --num-processes $NUM_PROCESSES

src/cropping/crop_mammogram.py crops the mammogram around the breast and discards the background in order to reduce image loading time and the time needed to run the segmentation algorithm, and saves each cropped image to $CROPPED_IMAGE_PATH/short_file_path.png. In addition, it computes additional information for each image and writes a new exam list to $CROPPED_EXAM_LIST_PATH, discarding images that it fails to crop. The optional --verbose argument prints out information about each image. The additional information includes the following:

  • window_location: location of the cropping window w.r.t. the original DICOM image, so that the segmentation map can be cropped in the same way for training (see the sketch after this list).
  • rightmost_points: rightmost nonzero pixels after correctly being flipped.
  • bottommost_points: bottommost nonzero pixels after correctly being flipped.
  • distance_from_starting_side: records whether a zero-value gap is found between the edge of the image and the breast on the side where the breast starts to appear, where there should be no gap. Depending on the dataset, this value can be used to detect an incorrect horizontal_flip value.
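
As mentioned for window_location above, the ground truth segmentation can be cropped with the same window as the image. The sketch below is illustrative only: the (top, bottom, left, right) layout of window_location and the example values are assumptions, not necessarily the exact convention used by crop_mammogram.py.

import imageio

# Example window; in practice this comes from the cropped exam list.
window_location = (100, 3044, 0, 1920)   # assumed (top, bottom, left, right)
top, bottom, left, right = window_location

# Crop the segmentation map the same way the mammogram was cropped.
segmentation = imageio.imread("sample_data/segmentation/0_L-CC_benign.png")
cropped_segmentation = segmentation[top:bottom, left:right]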

Calculate optimal centers

python3 src/optimal_centers/get_optimal_centers.py \
    --cropped-exam-list-path $CROPPED_EXAM_LIST_PATH \
    --data-prefix $CROPPED_IMAGE_PATH \
    --output-exam-list-path $EXAM_LIST_PATH \
    --num-processes $NUM_PROCESSES

src/optimal_centers/get_optimal_centers.py outputs a new exam list with additional metadata to $EXAM_LIST_PATH. The additional information includes the following:

  • best_center: optimal center point of the window for each image. An augmentation window drawn with best_center as its exact center point could go outside the boundary of the image. This usually happens when the cropped image is smaller than the window size. In this case, we pad the image and shift the window to be inside the padded image during augmentation (see the sketch below). Refer to the data report for more details.
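
The sketch below illustrates the padding-and-shifting idea in NumPy. It is not the repository's augmentation code: the window size, the (row, column) interpretation of best_center, and the padding mode are all assumptions made only for demonstration.

import numpy as np

def window_around_center(image, center, window_size=(2944, 1920)):
    # Illustrative only: extract a window of the given size roughly centered at
    # `center`, padding the image first if it is smaller than the window and
    # shifting the window so it stays inside the (padded) image.
    win_h, win_w = window_size
    pad_h = max(0, win_h - image.shape[0])
    pad_w = max(0, win_w - image.shape[1])
    padded = np.pad(image, ((0, pad_h), (0, pad_w)), mode="constant")
    row, col = int(center[0]), int(center[1])    # assumed (row, column) order
    top = min(max(row - win_h // 2, 0), padded.shape[0] - win_h)
    left = min(max(col - win_w // 2, 0), padded.shape[1] - win_w)
    return padded[top:top + win_h, left:left + win_w]

# e.g. patch = window_around_center(cropped_image, center=(1156.0, 262.0))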

Outcomes of preprocessing

After the preprocessing step, you should have the following files in the $OUTPUT_PATH directory (default is sample_output):

  • cropped_images: a folder that contains the cropped images corresponding to all images in sample_data/images.
  • data.pkl: the pickle file of a data list that includes the preprocessing metadata for each image and exam.

Reference

If you found this code useful, please cite our paper:

An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization
Yiqiu Shen, Nan Wu, Jason Phang, Jungkyu Park, Kangning Liu, Sudarshini Tyagi, Laura Heacock, S. Gene Kim, Linda Moy, Kyunghyun Cho and Krzysztof J. Geras
Medical Image Analysis 2020

@article{shen2020interpretable, 
title={An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization},
author={Shen, Yiqiu and Wu, Nan and Phang, Jason and Park, Jungkyu and Liu, Kangning and Tyagi, Sudarshini and Heacock, Laura and Kim, S Gene and Moy, Linda and Cho, Kyunghyun and others},
journal={Medical Image Analysis},
pages={101908},
year={2020},
publisher={Elsevier}

}

Reference to previous GMIC version:

Globally-Aware Multiple Instance Classifier for Breast Cancer Screening
Yiqiu Shen, Nan Wu, Jason Phang, Jungkyu Park, S. Gene Kim, Linda Moy, Kyunghyun Cho and Krzysztof J. Geras
Machine Learning in Medical Imaging - 10th International Workshop, MLMI 2019, Held in Conjunction with MICCAI 2019, Proceedings. Springer, 2019, p. 18-26 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11861 LNCS).

@inproceedings{shen2019globally, 
title={Globally-Aware Multiple Instance Classifier for Breast Cancer Screening},
    author={Shen, Yiqiu and Wu, Nan and Phang, Jason and Park, Jungkyu and Kim, Gene and Moy, Linda and Cho, Kyunghyun and Geras, Krzysztof J},
    booktitle={Machine Learning in Medical Imaging: 10th International Workshop, MLMI 2019, Held in Conjunction with MICCAI 2019, Shenzhen, China, October 13, 2019, Proceedings},
    volume={11861},
    pages={18-26},
    year={2019},
    organization={Springer Nature}}


gmic's Issues

Potential issue in patch map display

First congratulations for the great project!
I tried to run run.sh with the --visualization-flag on; the resulting patch maps were always aligned to the left border of the image and did not correspond to the activated regions in the heat maps.

I haven't figured out whether this is a problem only in the visualization or whether it also affects the prediction accuracy.
Thank you very much!

Discrepancies between readme and example files

Hi!
I was trying to reproduce the results for the shared example images, but it seems that the directory "sample_data" wasn't updated properly.
The readme says:
"As a part of this repository, we provide 4 sample exams (in sample_data/cropped_images directory and exam list stored in sample_data/data.pkl), each of which includes 2 CC view images and 2 MLO view images.",
but there's no cropped_images folder in sample_data.
Moreover, it seems to me that the images stored in "sample_data/images" are not the "original" ones but the cropped ones, and the exam_list_before_cropping.pkl file doesn't include the "best center" coordinates. I tried using them anyway as if they were the originals, because the cropping and best-center steps just use the "breast region" of the image, but I couldn't reproduce your results. Maybe applying the erosion and dilation steps to the already cropped image is preventing me from getting the rightmost and bottommost pixel positions correctly, so my extraction of the best center differs from your computation.
If you could please upload the complete versions of the images in sample_data/images, or the exam_list.pkl file with the best_centers coordinates included, I would really appreciate it.
Thank you in advance!

Tool for running the code

I want to reproduce the results of your paper. I have used Google Colab but there are lots of issues. Can you please tell me which tool you used for the implementation?

Training

Hi there,

Thanks for sharing the code. I have a question in terms of the training part.
Given that GMIC can be trained end-to-end, f_g and f_l would be updated simultaneously during training. But at the early stage, the saliency maps could make mistakes and cause the retrieve_roi function to extract incorrect patches (i.e., background); would that affect the convergence of the local module?

Cheers

RetrieveROIModule on left border and mask not updated for some images

Hey,

See the visualization for 0_R-MLO.png.

This occurs for only some ROIs, e.g. 0_L-MLO.png.

I'm exploring around here but any pointers would be much appreciated.

Also, the max-value approach will be confounded by markers, and will probably perform poorly for women with very dense breasts.

In my case, all patches can't get past x=0 (stuck on the left border).

Is this just me?

Fine Tuning

Hi again!
In order to fine-tune your model on my own images, I would like to know the exact hyperparameter combination that was used in each of the five models whose weights you are sharing.
In your paper, you present the range of values used in the random search and state that you chose the 5 best-performing models. Could you share those hyperparameters?
Thanks!

Does this model need pixel-level segmentation masks of malignant and benign lesions?

Hi, according to your paper, it seems that training and inference of this model only require image-level labels, with no need for annotations of malignant and benign lesions. However, from the code in this repo, segmentation paths of malignant and benign lesions are required to run the code. I just wonder: if I don't have segmentation masks of malignant and benign lesions, how can I train and test your model on my own images? I look forward to and appreciate your response.

Batch Size

Hi, first of all, congratulations for your work!
While reading the code and the paper, I couldn't find which batch size you used during training. Could you share that information?
Moreover, the code you are sharing here appears to be just an inference version; I would like to fine-tune the model on images from a private dataset, so it would be very nice if you could share the training version you used. Furthermore, it would be nice if this repo were complete by itself: to run inference with your own data you have to preprocess (crop and reshape) the images separately using the 'breast_cancer_classifier' repo's code.
Thank you!

Missing parameter?

When loading models I get:

_IncompatibleKeys(missing_keys=[], unexpected_keys=['shared_rep_filter.weight'])

It looks like a 256x256x4x4 (4x4 conv?) weight that is not implemented in the repo. Any chance this can be added?

Fine tuning

Hi Yiqiu,

I am considering fine-tuning the models in this repo on a couple of mammograms with image-level labels (only normal and malignant labels). I have the following questions and would appreciate your kind advice.

  1. To fine-tune the pretrained model, what loss function should I use? Should I use the binary cross-entropy loss between the predicted output (line 183 of run_model.py) and the ground-truth image label, or should I use the loss function given in Eq. (13) of your paper to re-train the model?

  2. Is it reasonable to use the predicted benign probability, generated at line 183 of run_model.py, as the probability of normal cases for fine-tuning the model?

  3. For image pre-processing, which method should be applied: (a) for each single training image, subtract its mean value and divide by its std; (b) for each mini-batch of training images, subtract the mean value and divide by the std (mean and std computed over all images in the mini-batch); or (c) for the whole training set, subtract the mean value and divide by the std (mean and std computed over all images in the training set)?

Thanks for your time and wait for your feedback.

How to run GMIC on CBIS-DDSM?

Hi. I want to use GMIC to predict the benign/malignant label of CBIS-DDSM images. This repository provides five models pretrained on NYUBCS, and a run.sh script. To predict, I simply replaced the four NYUBCS samples (16 images) in sample_data/images/ by four CBIS-DDSM samples, and then executed run.sh. However, the sample_output/predictions.csv gave small probability values, mostly < 0.1. Therefore it is hard to tell benign/malignant. Which part went wrong?

I read the paper, which mentions "To preprocess mammography images in CBIS-DDSM, we first found the largest connected component containing only non-zero pixels to locate the breast. We then applied erosion and dilation to refine the breast margin. Lastly, we re-oriented all mammography images so that the breasts are always on the left side of the image. All images are resized to 2944 × 1920 pixels and pixels values were normalized to the range [0,1]." Obviously there were some preprocessing steps. Are these preprocessing steps included in run.sh? If not, where to obtain the code or how to implement these steps?

Thank you.

without segmentation path

hi,

How can I run the model without a segmentation path and without the segmentation folder inside sample_data?
I have cropped images and I want visualization only.
Is it possible?

Request for help regarding how to implement the article titled: GMIC

Greetings and regards,

My name is Mohsen Rostami, a final-semester computer engineering student majoring in artificial intelligence.

Sorry for taking up your time; here is a brief explanation.

Because of my great interest in artificial intelligence, I have been spending time studying and reviewing your paper, but since my level is still preliminary, I have not yet managed to find the dataset used in your article or to figure out how to implement it. I am therefore asking you to guide me on how to obtain and download the datasets introduced in the article, how the images are labeled, and how to link this dataset to the main program code and load it. I would be grateful for any guidance you can send.

Thank you.

Potential issue with dataloading flips

Hi,

Thanks a lot for the nice codebase. I'm trying to run inference on my own dataset and I'm seeing poor performance. I see that in the run_model script, horizontal_flip is always set to False (a boolean), but in flip_image it is checked against a string ('YES' or 'NO') -- is this the intended behaviour?

SystemExit: 2

Thanks for sharing your helpful code. I installed all dependencies from requirements.txt and I'm in the project directory when running bash run.sh, but I get:
usage: ipykernel_launcher.py [-h] --model-path MODEL_PATH --data-path
DATA_PATH --image-path IMAGE_PATH
--segmentation-path SEGMENTATION_PATH
--output-path OUTPUT_PATH
[--device-type {gpu,cpu}]
[--gpu-number GPU_NUMBER]
[--model-index MODEL_INDEX]
[--visualization-flag]
ipykernel_launcher.py: error: the following arguments are required: --model-path, --data-path, --image-path, --segmentation-path, --output-path
An exception has occurred, use %tb to see the full traceback.

SystemExit: 2

Please help me!

ROI

Hi, I am wondering whether there is any gradient flow in the retrieve_roi process?

GMIC- Segmentation

Hello, I'm testing the code on new images at a federal Brazilian hospital. When I download the samples from the repository, they already come cropped in the sample images folder, and it also looks like their histograms have been modified. Is this an error? Shouldn't the images come uncropped? The sample images in that folder come with a segmentation folder before running the code, but when I replace them with new patient images and run the code again, no segmentation folder appears for my own sample images.
Thanks for your attention,

Andressa.

The reason for setting the classification as a 2-class multi-label problem

Hi, thank you very much for sharing.
I am new to mammography classification.
I have read several papers, and some of them treat this as a 2-class classification problem.
May I know why this paper uses a multi-label formulation to deal with this problem?

Thank you very much.

modules.py error

After running breast_cancer_classifier, I copied the cropped images, data.pkl, and cropped_exam_list.pkl into GMIC/sample_data and then ran bash run.sh. I get the error below:

Traceback (most recent call last):
File "src/scripts/run_model.py", line 297, in
main()
File "src/scripts/run_model.py", line 293, in main
turn_on_visualization=args.visualization_flag,
File "src/scripts/run_model.py", line 253, in start_experiment
output_df = run_single_model(single_model_path, data_path, parameters, turn_on_visualization)
File "src/scripts/run_model.py", line 218, in run_single_model
output_df = run_model(model, exam_list, parameters, turn_on_visualization)
File "src/scripts/run_model.py", line 181, in run_model
output = model(tensor_batch)
File "/home/username/lib/python3.5/site-packages/torch/nn/modules/module.py", line 532, in call
result = self.forward(*input, **kwargs)
File "/home/username/GMIC/src/modeling/gmic.py", line 123, in forward
small_x_locations = self.retrieve_roi_crops.forward(x_original, self.cam_size, self.saliency_map)
File "/home/username/GMIC/src/modeling/modules.py", line 339, in forward
assert h_h == h, "h_h!=h"
AssertionError: h_h!=h

How to interpret GMIC's prediction result on CBIS-DDSM ?

Hi. We tried to reproduce the result described in the GMIC paper on CBIS-DDSM, but did not seem to make it. Below are what we did:

  • We downloaded CBIS-DDSM from https://wiki.cancerimagingarchive.net/display/Public/CBIS-DDSM, along with the four csv files (Mass/Calc, Training/Test)
  • The paper mentions that GMIC was evaluated on only a subset of CBIS-DDSM, which contains 188 exams defined by Shen et al. We identified and extracted this subset.
  • The sample_data/images directory contains 4 exams, each of which includes the 4 original mammography images (L-CC, L-MLO, R-CC, R-MLO). Specifically, 0_R-CC, 0_R-MLO, 2_R-CC, 2_R-MLO have a benign_label of 1; 1_R-CC, 1_R-MLO, 3_L-CC, 3_L-MLO have a malignant_label of 1. To satisfy this configuration, we selected four exams from the 188-exam subset with the same configuration. As a result, the selected four exams were P_02409, P_00146, P_01678, P_01669. The images were in DICOM format.
  • We used the python code snippet described in the metarepository's README to convert DICOM to PNG. The bitdepth parameter was set to 16. https://github.com/nyukat/mammography_metarepository#images
  • After DICOM-to-PNG conversion, we replaced the corresponding png files in sample_data/images by the converted png files of the four selected exams from CBIS-DDSM:
    • 0_L-CC: Unaltered
    • 0_L-MLO: Unaltered
    • 0_R-CC: Replaced by CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Calc-Training_P_02409_RIGHT_CC/08-07-2016-DDSM-41108/1.000000-full mammogram images-67359/1-1.png
    • 0_R-MLO: Replaced by CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Calc-Training_P_02409_RIGHT_MLO/08-07-2016-DDSM-46691/1.000000-full mammogram images-54510/1-1.png
    • 1_L-CC: Unaltered
    • 1_L-MLO: Unaltered
    • 1_R-CC: Replaced by P_00146 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Mass-Training_P_00146_RIGHT_CC/07-20-2016-DDSM-61365/1.000000-full mammogram images-07790/1-1.png
    • 1_R-MLO: Replaced by P_00146 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Mass-Training_P_00146_RIGHT_MLO/07-20-2016-DDSM-90212/1.000000-full mammogram images-33341/1-1.png
    • 2_L-CC: Unaltered
    • 2_L-MLO: Unaltered
    • 2_R-CC: Replaced by P_01678 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Calc-Training_P_01678_RIGHT_CC/08-07-2016-DDSM-63063/1.000000-full mammogram images-39590/1-1.png
    • 2_R-MLO: Replaced by P_01678 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Calc-Training_P_01678_RIGHT_MLO/08-07-2016-DDSM-33342/1.000000-full mammogram images-59283/1-1.png
    • 3_L-CC: Replaced by P_01669 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Mass-Training_P_01669_LEFT_CC/07-20-2016-DDSM-68732/1.000000-full mammogram images-80465/1-1.png
    • 3_L-MLO: Replaced by P_01669 CBIS-DDSM-All-doiJNLP-zzWs5zfZ/CBIS-DDSM/Mass-Training_P_01669_LEFT_MLO/07-20-2016-DDSM-14752/1.000000-full mammogram images-57568/1-1.png
    • 3_R-CC: Unaltered
    • 3_R-MLO: Unaltered.
      Note that eight files remained unaltered because their benign_label and malignant_label are both 0, and CBIS-DDSM has no normal images to substitute. Here is a snapshot of the 16 input images: https://freeimage.host/i/irlidu
  • We executed run.sh, and then got the output predictions.csv:
image_index benign_pred malignant_pred benign_label malignant_label
0_L-CC 0.1356 0.0081 0 0
0_R-CC 0.1747 0.0323 1 0
0_L-MLO 0.2368 0.0335 0 0
0_R-MLO 0.0696 0.0104 1 0
1_L-CC 0.0508 0.0144 0 0
1_R-CC 0.0515 0.0087 0 1
1_L-MLO 0.0545 0.0154 0 0
1_R-MLO 0.1115 0.0149 0 1
2_L-CC 0.0746 0.0160 0 0
2_R-CC 0.0809 0.0228 1 0
2_L-MLO 0.0953 0.0086 0 0
2_R-MLO 0.1155 0.0168 1 0
3_L-CC 0.2134 0.0407 0 1
3_R-CC 0.2945 0.2116 0 0
3_L-MLO 0.1639 0.0165 0 1
3_R-MLO 0.0722 0.0303 0 0
  • We were confused by the above result. The eight CBIS-DDSM-substituted images had very low probability values for both benign_pred and malignant_pred. For instance,
    • 0_R-CC and 0_R-MLO have a benign_label of 1, but their benign_pred values are just 0.1747 and 0.0696.
    • 3_L-CC and 3_L-MLO have a malignant_label of 1, but their malignant_pred values are just 0.0407 and 0.0165.

We wonder which part went wrong.

The five pretrained models provided in the models directory were trained on the NYUCBS dataset, which is proprietary and thus unavailable to us. Do we have to retrain GMIC on CBIS-DDSM in order to get good results on CBIS-DDSM? If so, how do we perform re-training? Where can we find the code to retrain?

Thank you.

train from scratch

Hi,

Just wondering if the code for training can be released? Could you please provide some hints about how to train this model from scratch?

Thanks
