
Computer Vision with EOL v3 Images

Testing different computer vision methods (object detection, image classification) to do customized, large-scale image processing for Encyclopedia of Life v3 database images (square, centered crops; image content tags; etc.). Runs in Tensorflow 2 and Python 3.
Last updated 1 March 2023

Images a-c are hosted by Encyclopedia of Life (a. Choeronycteris mexicana, licensed under CC BY 2.0; b. Hippotion celerio, licensed under CC BY-NC-SA 3.0; c. Cuculus solitarius (left) and Cossypha caffra (right), licensed under CC BY-SA 2.0).

The Encyclopedia of Life (EOL) is an online biodiversity resource that seeks to provide information about all ~1.9 million species known to science. A goal for the latest version of EOL (v3) is to better leverage its older, less structured image content. To improve the discoverability and display of EOL images, automated image-processing pipelines that use computer vision are being developed and tested, with the goal of scaling to millions of diverse images.

Project Structure

Object detection for image cropping

Three object detection frameworks (Faster-RCNN Resnet 50 and SSD, R-FCN, or Faster-RCNN Inception v2 [1] via the Tensorflow Object Detection API, and YOLO v3 [2] via Darkflow) were used to perform square cropping for EOL images of different groups of animals (birds, bats, butterflies & moths, beetles, frogs, carnivores, snakes & lizards) using transfer learning and/or fine-tuning.

Frameworks differ in speed and accuracy: YOLO is the fastest but least accurate, while Faster RCNN is the slowest but most accurate; MobileNet SSD and R-FCN fall somewhere in between [2][3][4]. The model with the best speed-accuracy trade-off for each group was selected to generate final cropping data for EOL images.

After detection, bounding boxes of detected animals are converted to square, centered cropping coordinates in order to standardize heterogeneous image gallery displays, as sketched below.
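A minimal sketch of this conversion (the clamping behavior here is an illustrative assumption, not the pipelines' exact code):

def box_to_square_crop(xmin, ymin, xmax, ymax, im_w, im_h):
    """Convert a detection box (pixels) to square, centered crop coordinates,
    clamped to the image bounds."""
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2   # center of the detection box
    side = max(xmax - xmin, ymax - ymin)            # square over the longer side
    side = min(side, im_w, im_h)                    # square cannot exceed the image
    left = min(max(cx - side / 2, 0), im_w - side)  # shift crop to stay in bounds
    top = min(max(cy - side / 2, 0), im_h - side)
    return int(left), int(top), int(left + side), int(top + side)

# Example: a tall 100x300 px box in a 640x480 image becomes a 300x300 crop
print(box_to_square_crop(200, 50, 300, 350, 640, 480))  # (100, 50, 400, 350)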

  • For birds, pre-trained object detection models were used to detect birds.
  • For bats and butterflies & moths, object detection models were custom-trained to detect one class (either bats or butterflies & moths) using EOL user-generated cropping data (square coordinates around animal(s) of interest within each photo).
  • For beetles, frogs, carnivores and snakes & lizards, object detection models were custom-trained to detect all classes simultaneously using EOL user-generated cropping data.

➡️ 🌱 Click here to get started.

Demo video: Run your own images through the pre-trained EOL object detector in under 2 minutes.

Object detection results using the trained multitaxa detector model, displayed in a Google Colab notebook. Image is hosted by Encyclopedia of Life (Lampropeltis californiae, licensed under CC BY-NC 4.0).
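For orientation, here is a minimal sketch of running a single image through a pre-trained TF2 detector (the TF Hub model handle, file name, and score threshold are illustrative assumptions, not the exact EOL model):

import tensorflow as tf
import tensorflow_hub as hub

# Load a generic pre-trained detector from TF Hub (example handle)
detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")

# Read one image and add a batch dimension (uint8, shape [1, h, w, 3])
image = tf.io.decode_jpeg(tf.io.read_file("example.jpg"), channels=3)
result = detector(tf.expand_dims(image, 0))

boxes = result["detection_boxes"][0].numpy()    # [N, 4], normalized coordinates
scores = result["detection_scores"][0].numpy()  # [N] confidence scores
print(boxes[scores > 0.5])                      # boxes above an example threshold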

Classification for image tagging

Two classification frameworks (MobileNetSSD v2 [11], Inception v3 [5]) were used to perform image tagging for different classes of EOL images (flowers, maps/labels/illustrations, image ratings) using transfer learning and/or fine-tuning.

Frameworks differ in speed and accuracy: MobileNetSSD v2 is faster, smaller, and less accurate, while Inception v3 is slower, larger, and more accurate [5][6]. The model with the best speed-accuracy trade-off for each group was selected to generate final tagging data for EOL images.

While object detection includes both classification and localization of the object of interest, image classification includes only the former step [7]. Classification is used to identify images with flowers present, to identify images of maps/collection labels/illustrations, and to generate image quality ratings. These tags will allow users to search for features not already included in image metadata.

  • For the flower classifier, models were trained to classify images into flower, fruit, entire, branch, stem, or leaf using the PlantCLEF 2016 image dataset as training data [8].
  • For the flower/fruit classifier, models were trained to classify images as flower/fruit or not flower/fruit using manually sorted EOL images as training data (a transfer-learning sketch follows this list).
  • For the image type classifier, models were trained to classify images as map, herbarium sheet, phylogeny, illustration, or none using Wikimedia Commons, Flickr BHL, and EOL images as training data.
  • For the image rating classifier, models were trained to classify images into quality rating classes 1-5 (worst to best) using EOL user-generated training data.
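As a hedged illustration of the transfer-learning setup described above (assuming a Keras MobileNetV2 backbone and a two-class flower/fruit head; the EOL notebooks may differ in details):

import tensorflow as tf

# Freeze an ImageNet-pre-trained backbone and train only a small new head
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # transfer learning: keep pre-trained weights fixed

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),  # flower/fruit vs. not
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)  # datasets assumed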

➡️ 🌱 Click here to get started.

Image classification results using the trained flower/fruit classification model, displayed in a Google Colab notebook. Image is hosted by Encyclopedia of Life (Leucopogon tenuicaulis, licensed under CC BY 3.0).

Object detection for image tagging

Three object detection frameworks (YOLO v3 in darknet [9], MobileNetSSD v2 [11], and YOLO v4 [10]) were used to perform image tagging for different classes of EOL images (flowers, insects, mammals/amphibians/reptiles/birds).

Frameworks differ in speed and accuracy: YOLO v4 is the fastest with intermediate accuracy, MobileNetSSD v2 has intermediate speed and accuracy, and YOLO v3 falls somewhere in between [10][6]. The model with the best speed-accuracy trade-off for each group was selected to generate final tagging data for EOL images.

For tagging, only the classes of detected objects are kept; their locations are discarded. Object detection is used to identify plant-pollinator co-occurrence, insect life stage, the presence of mammal, amphibian, reptile, or bird scat and/or footprints, and when a human (or body part, like a hand) is present. These tags will allow users to search for features not already included in image metadata (a post-processing sketch follows the list below).

  • For plant-pollinator co-occurrence, a model pre-trained on Google OpenImages [12] was used. EOL images are run through the model, and predictions for 'Butterfly', 'Insect', 'Beetle', 'Ant', 'Bat (Animal)', 'Bird', 'Bee', or 'Invertebrate' are kept and converted to "pollinator present" during post-processing.
  • For insect life stages, a model pre-trained on Google OpenImages [12] was used. EOL images are run through the model, and predictions for 'Ant', 'Bee', 'Beetle', 'Butterfly', 'Dragonfly', 'Insect', 'Invertebrate', or 'Moths and butterflies' are kept and converted to "adult" during post-processing; predictions for 'Caterpillar', 'Centipede', or 'Worm' are converted to "juvenile".
  • For scat/footprint present, models were custom-trained to detect scat or footprints in EOL images, but never learned despite adjusting augmentation and model hyperparameters over many training sessions. The pipelines and datasets should be revisited in the future with different approaches.
  • For human present, a model pre-trained on Google OpenImages [12] was used. EOL images are run through the model, and predictions for 'Person' or any string containing 'Human' ('Body', 'Eye', 'Head', 'Hand', 'Foot', 'Face', 'Arm', 'Leg', 'Ear', 'Nose', 'Beard') are kept and converted to "human present" during post-processing.
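A minimal post-processing sketch (class lists are taken from the descriptions above; combining the mappings in one function is for illustration only, not the pipelines' exact code):

POLLINATOR_CLASSES = {'Butterfly', 'Insect', 'Beetle', 'Ant', 'Bat (Animal)',
                      'Bird', 'Bee', 'Invertebrate'}
ADULT_CLASSES = {'Ant', 'Bee', 'Beetle', 'Butterfly', 'Dragonfly', 'Insect',
                 'Invertebrate', 'Moths and butterflies'}
JUVENILE_CLASSES = {'Caterpillar', 'Centipede', 'Worm'}

def classes_to_tags(detected_classes):
    """Map raw detection class names to search-friendly EOL image tags."""
    tags = set()
    for name in detected_classes:
        if name in POLLINATOR_CLASSES:
            tags.add('pollinator present')
        if name in ADULT_CLASSES:
            tags.add('adult')
        if name in JUVENILE_CLASSES:
            tags.add('juvenile')
        if name == 'Person' or 'Human' in name:
            tags.add('human present')
    return sorted(tags)

print(classes_to_tags(['Bee', 'Caterpillar']))  # ['adult', 'juvenile', 'pollinator present']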

➡️ 🌱 Click here to get started.

Object detection for image tagging results using the pre-trained plant-pollinator co-occurrence model, displayed in a Google Colab notebook. Image is hosted by Flickr (another flower - insect photo! by thart2009, licensed under CC BY 2.0).

Utils
This folder contains Colab Notebooks and Google Chrome developer console scripts with useful functions for building on existing EOL computer vision pipelines or for developing your own from scratch.

Getting Started

All files in this repository run in Google Colab*. The repository is set up so that each notebook can be run as a standalone script; it is not necessary to clone the entire repository. Instead, you can navigate to the project sections (i.e., GitHub folders) that interest you and try the notebooks directly. All needed files and directories are set up within each notebook.

For additional details on steps below, see the project wiki.

New to Google Colab?
*Google Colaboratory is "a free cloud service, based on Jupyter Notebooks for machine-learning education and research." Notebooks run entirely on VMs in the cloud and link to your Google Drive for accessing files. This means no local software or library installs are required. If running locally with a GPU, several software packages (~10 GB) must be installed first, and a few workarounds are required on Windows. Working in the cloud eliminates these problems and makes it easier to collaborate when users are on different operating systems. If you prefer to use your local machine for object detection, refer to the Tensorflow Object Detection API Tutorial.

References

[1] Ren et al. 2017. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence.
[2] Hui 2018. Object detection: speed and accuracy comparison (Faster R-CNN, R-FCN, SSD, FPN, RetinaNet and YOLOv3). Medium. 27 March 2018.
[3] Redmon and Farhadi 2018. YOLOv3: An Incremental Improvement. arXiv.
[4] Lin et al. 2015. Microsoft COCO: Common Objects in Context. arXiv.
[5] Sandler et al. 2018. MobileNetV2: Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation. arXiv.
[6] Szegedy et al. 2015. Rethinking the Inception Architecture for Computer Vision. arXiv.
[7] Sharma 2019. Image Classification vs. Object Detection vs. Image Segmentation. Medium. 23 Feb 2020.
[8] Goeau et al. 2016. Plant identification in an open-world (LifeCLEF 2016). CEUR Workshop Proceedings.
[9] AlexeyAB 2020. Darknet. GitHub.
[10] Bochkovskiy et al. 2020. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv.
[11] Liu et al. 2016. SSD: Single Shot MultiBox Detector. Lecture Notes in Computer Science.
[12] Krasin et al. 2017. Open Images: A public dataset for large-scale multi-label and multi-class image classification. GitHub.

License

Code
Code in this repository is released under the MIT license. More information is available at the Open Source Initiative.
Images
All images used in this repository and notebooks contained therein are licensed under Creative Commons. EOL content is freely available to the public. More information about re-use of content hosted by EOL is available at EOL Terms of Use and EOL API Terms of Use. Specific attribution information for EOL images used for training and testing models is available in bundle URLs containing "breakdown_download" found within notebooks.


Issues

Objdet for image cropping with EOL saved model doesn't work after TF update


Contact Details

[email protected]

What happened?

When I use an EOL saved model with the latest cloned version of the Tensorflow Object Detection API, training of object detection for image cropping models doesn't work. The code still runs properly with the Tensorflow Object Detection API repository cloned in May 2021 using TF 2.8.

How did you try to fix it?

The training session does not output any error messages, but I confirmed that the frozen inference graph is loading correctly with the proper layers using the code shown below, modified from Frozen Graph Tensorflow.

import tensorflow as tf  # added for completeness; uses TF2 with compat.v1 APIs

# Load the frozen inference graph from the fine-tuned model directory
PATH_TO_CKPT = 'tf_models/train_demo/rcnn/finetuned_model' + '/frozen_inference_graph.pb'
print("\nLoading trained model from: \n", PATH_TO_CKPT)
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)

def _imports_graph_def():
    tf.compat.v1.import_graph_def(od_graph_def, name='')

# Wrap the graph import so the TF1-style graph is usable from TF2
wrapped_import = tf.compat.v1.wrap_function(_imports_graph_def, [])
import_graph = wrapped_import.graph

# Print layers of saved graph
print("-" * 50)
print("Frozen model layers: ")
layers = [op.name for op in import_graph.get_operations()]
for layer in layers:
    print(layer)
print("-" * 50)

After creating a concrete function from the frozen graph following Tensorflow's TF2 Migration Guide, no errors are thrown, but the model outputs lose their original attribute names and do not appear to contain values.

import time  # added for completeness

def run_detector_tf(image_url):
    # url_to_image() is a helper defined earlier in the notebook
    image_np, im_h, im_w = url_to_image(image_url)
    # Look up the input and output tensors by name in the imported graph
    image_tensor = import_graph.get_tensor_by_name('image_tensor:0')
    detection_boxes = import_graph.get_tensor_by_name('detection_boxes:0')
    detection_scores = import_graph.get_tensor_by_name('detection_scores:0')
    detection_classes = import_graph.get_tensor_by_name('detection_classes:0')
    num_detections = import_graph.get_tensor_by_name('num_detections:0')

    # Create a concrete function by pruning the wrap_function (similar to sess.run);
    # no tf.compat.v1.Session is needed since the pruned function is called directly
    sess_run = wrapped_import.prune(feeds=image_tensor,
                                    fetches=[detection_boxes, detection_scores,
                                             detection_classes, num_detections])
    # Actual detection
    start_time = time.time()
    result = sess_run(tf.constant(image_np))
    end_time = time.time()
    result = {"detection_boxes": result[0], "detection_scores": result[1],
              "detection_classes": result[2], "num_detections": result[3]}
    return result
Which type of our computer vision pipelines were you using?

Object detection for image cropping

Which specific task from our computer vision pipelines were you using?

Chiroptera

Filename

chiroptera_generate_crops_tf2.ipynb

Version

Version 2.8

What browsers are you seeing the problem on?

Chrome

Relevant log output

Example output for result["detection_boxes"]:
Tensor("StatefulPartitionedCall:0", dtype=float32)

Code of Conduct

  • I agree to follow this project's Code of Conduct

Training using model_main_tf2.py no longer works in TF2.8+

Contact Details

[email protected]

What happened?

When running the code to train the model using model_main_tf2.py, the model does not train. This used to work in Tensorflow 2.8, but the error started after the backend upgrade to Tensorflow 2.9. After installing tf_slim, another error about tf.python.keras comes up. Even after downgrading to TF 2.8 the error persists, so I believe a dependency version is causing it.

#@title Train the model
# Note: You can change the number of epochs in code block above, then re-run to train longer
# Modified from https://github.com/RomRoc/objdet_train_tensorflow_colab/blob/master/objdet_custom_tf_colab.ipynb
matplotlib.use('Agg')
%cd $cwd

!python tf_models/models/research/object_detection/model_main_tf2.py \
    --alsologtostderr \
    --num_train_steps=$num_train_steps \
    --num_eval_steps=$num_eval_steps \
    --pipeline_config_path=$pipeline_config_path \
    --model_dir=$model_dir 

How did you try to fix it?

I tried the fixes here, here, and here, but none of them worked. I also tried downgrading to TF 2.8, manually installing tf_slim, and manually editing resnet_v1.py (tf.python.keras.applications -> tf.keras.applications). Every time I upgrade or downgrade the script that causes the problem, another traceback error appears.

Which type of our computer vision pipelines were you using?

Object detection for image cropping

Which specific task from our computer vision pipelines were you using?

Chiroptera

Filename

chiroptera_train_tf2_ssd_rcnn.ipynb

Version

Version 2.8 and Version 2.9

What browsers are you seeing the problem on?

Chrome

Relevant log output

/content/drive/MyDrive/train/tf2/nov22test
Traceback (most recent call last):
  File "tf_models/models/research/object_detection/model_main_tf2.py", line 35, in <module>
    from object_detection import model_lib_v2
  File "tf_models/models/research/object_detection/model_lib_v2.py", line 29, in <module>
    from object_detection import eval_util
  File "tf_models/models/research/object_detection/eval_util.py", line 29, in <module>
    import tf_slim as slim
ModuleNotFoundError: No module named 'tf_slim'

Code of Conduct

  • I agree to follow this project's Code of Conduct

[Request]: Auto-restart runtime to update pip installed versions

Contact Details

[email protected]

Is your feature request related to a problem? Please describe.

I'm always frustrated when I have to wait for the Installs & Import code blocks to run, then have to restart the runtime to use the most recently installed package versions and wait for the same code blocks to run a second time before proceeding with the rest of the notebook.

Describe the solution you would like.

I would like the runtime to automatically restart after the first round of package installs. This way, I can click "run" and walk away for a few minutes before running cells the second time and proceeding with the rest of the notebook.

Describe alternatives you have considered.

I tried to automatically restart the Colab runtime using a try/except solution proposed here, shown in the excerpt below.

# Install requirements.txt
try:
    !pip3 -q install -r requirements.txt
# Note: ResolutionImpossible is pip-internal and never imported here, and
# "!pip3" runs in a subshell, so a failed install never raises a Python exception
except (ResolutionImpossible, ImportError, KeyError, ModuleNotFoundError):
    print('\n\n\n~~~\nStopping RUNTIME for a manual Colab restart. Run all code blocks again to use newly installed package versions.')
    os.kill(os.getpid(), 9)

However, the error thrown by pip is not caught by the try/except block and still prints out, as shown below.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. 
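A possible workaround (a hedged sketch, not tested in the EOL notebooks): run pip through subprocess so its exit code and error output can be checked directly, then kill the runtime only when the install actually failed.

import os
import subprocess
import sys

# Run pip via subprocess so failures are visible as a return code / stderr text,
# unlike "!pip", whose errors never reach Python's exception machinery
proc = subprocess.run(
    [sys.executable, "-m", "pip", "install", "-q", "-r", "requirements.txt"],
    capture_output=True, text=True)

if proc.returncode != 0 or "ERROR:" in proc.stderr:
    print("Restarting runtime so newly installed package versions take effect. "
          "Run all code blocks again after the runtime reconnects.")
    os.kill(os.getpid(), 9)  # Colab reconnects with a fresh kernel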

Which type of our computer vision pipelines are you using?

Object detection for image tagging

Which specific task from our computer vision pipelines are you using?

Plant pollinator, Insect life stages, Human present, Flower fruit

Filename

plant_poll_generate_tags_yolov3.ipynb

Version

YOLO v3 in Darknet

What browsers are you using?

Chrome

Any log output or other content that may be useful for developing the requested feature.

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
