
Computer Vision with EOL v3 Images

Testing different computer vision methods (object detection, image classification) to do customized, large-scale image processing for Encyclopedia of Life v3 database images (square, centered crops; image content tags; etc.). Runs in Tensorflow 2 and Python 3.
Last updated 1 March 2023

Images a-c are hosted by Encyclopedia of Life (a. Choeronycteris mexicana, licensed under CC BY 2.0; b. Hippotion celerio, licensed under CC BY-NC-SA 3.0; c. Cuculus solitarius (left) and Cossypha caffra (right), licensed under CC BY-SA 2.0).

The Encyclopedia of Life (EOL) is an online biodiversity resource that seeks to provide information about all ~1.9 million species known to science. A goal for the latest version of EOL (v3) is to better leverage its older, less structured image content. To improve the discoverability and display of EOL images, automated image-processing pipelines that use computer vision are being developed and tested, with the goal of scaling to millions of diverse images.

Project Structure

Object detection for image cropping

Three object detection frameworks (Faster-RCNN Resnet 50 and SSD, R-FCN, or Faster-RCNN Inception v2 [1] via the Tensorflow Object Detection API, and YOLO v3 [2] via Darkflow) were used to perform square cropping for EOL images of different groups of animals (birds, bats, butterflies & moths, beetles, frogs, carnivores, snakes & lizards) using transfer learning and/or fine-tuning.

Frameworks differ in speed and accuracy: YOLO is the fastest but least accurate, while Faster RCNN is the slowest but most accurate; MobileNet SSD and R-FCN fall somewhere in between [2][3][4]. The model with the best speed-accuracy trade-off for each group was selected to generate final cropping data for EOL images.

After detection, bounding boxes of detected animals are converted to square, centered cropping coordinates in order to standardize heterogeneous image gallery displays, as sketched below.
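A minimal sketch of this conversion (the clamping behavior here is an illustrative assumption, not the pipelines' exact code):

def box_to_square_crop(xmin, ymin, xmax, ymax, im_w, im_h):
    """Convert a detection box (pixels) to square, centered crop coordinates,
    clamped to the image bounds."""
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2   # center of the detection box
    side = max(xmax - xmin, ymax - ymin)            # square over the longer side
    side = min(side, im_w, im_h)                    # square cannot exceed the image
    left = min(max(cx - side / 2, 0), im_w - side)  # shift crop to stay in bounds
    top = min(max(cy - side / 2, 0), im_h - side)
    return int(left), int(top), int(left + side), int(top + side)

# Example: a tall 100x300 px box in a 640x480 image becomes a 300x300 crop
print(box_to_square_crop(200, 50, 300, 350, 640, 480))  # (100, 50, 400, 350)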

  • For birds, pre-trained object detection models were used to detect birds.
  • For bats and butterflies & moths, object detection models were custom-trained to detect one class (either bats or butterflies & moths) using EOL user-generated cropping data (square coordinates around animal(s) of interest within each photo).
  • For beetles, frogs, carnivores and snakes & lizards, object detection models were custom-trained to detect all classes simultaneously using EOL user-generated cropping data.

➡️ 🌱 Click here to get started.

Demo video: Run your own images through the pre-trained EOL object detector in under 2 minutes.

Object detection results using the trained multitaxa detector model, displayed in a Google Colab notebook. Image is hosted by Encyclopedia of Life (Lampropeltis californiae, licensed under CC BY-NC 4.0).
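For orientation, here is a minimal sketch of running a single image through a pre-trained TF2 detector (the TF Hub model handle, file name, and score threshold are illustrative assumptions, not the exact EOL model):

import tensorflow as tf
import tensorflow_hub as hub

# Load a generic pre-trained detector from TF Hub (example handle)
detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")

# Read one image and add a batch dimension (uint8, shape [1, h, w, 3])
image = tf.io.decode_jpeg(tf.io.read_file("example.jpg"), channels=3)
result = detector(tf.expand_dims(image, 0))

boxes = result["detection_boxes"][0].numpy()    # [N, 4], normalized coordinates
scores = result["detection_scores"][0].numpy()  # [N] confidence scores
print(boxes[scores > 0.5])                      # boxes above an example threshold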

Classification for image tagging

Two classification frameworks (MobileNetSSD v2 [11], Inception v3 [5]) were used to perform image tagging for different classes of EOL images (flowers, maps/labels/illustrations, image ratings) using transfer learning and/or fine-tuning.

Frameworks differ in speed and accuracy: MobileNetSSD v2 is faster, smaller, and less accurate, while Inception v3 is slower, larger, and more accurate [5][6]. The model with the best speed-accuracy trade-off for each group was selected to generate final tagging data for EOL images.

While object detection includes both classification and localization of the object of interest, image classification includes only the former step [7]. Classification is used to identify images with flowers present, to identify images of maps/collection labels/illustrations, and to generate image quality ratings. These tags will allow users to search for features not already included in image metadata.

  • For the flower classifier, models were trained to classify images into flower, fruit, entire, branch, stem, or leaf using the PlantCLEF 2016 image dataset as training data [8].
  • For the flower/fruit classifier, models were trained to classify images as flower/fruit or not flower/fruit using manually sorted EOL images as training data (a transfer-learning sketch follows this list).
  • For the image type classifier, models were trained to classify images as map, herbarium sheet, phylogeny, illustration, or none using Wikimedia Commons, Flickr BHL, and EOL images as training data.
  • For the image rating classifier, models were trained to classify images into quality rating classes 1-5 (worst to best) using EOL user-generated training data.
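As a hedged illustration of the transfer-learning setup described above (assuming a Keras MobileNetV2 backbone and a two-class flower/fruit head; the EOL notebooks may differ in details):

import tensorflow as tf

# Freeze an ImageNet-pre-trained backbone and train only a small new head
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # transfer learning: keep pre-trained weights fixed

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),  # flower/fruit vs. not
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)  # datasets assumed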

➡️ 🌱 Click here to get started.

Image classification results using the trained flower/fruit classification model, displayed in a Google Colab notebook. Image is hosted by Encyclopedia of Life (Leucopogon tenuicaulis, licensed under CC BY 3.0).

Object detection for image tagging

Three object detection frameworks (YOLO v3 in darknet [9], MobileNetSSD v2 [11], and YOLO v4 [10]) were used to perform image tagging for different classes of EOL images (flowers, insects, mammals/amphibians/reptiles/birds).

Frameworks differ in speed and accuracy: YOLO v4 is the fastest with intermediate accuracy, MobileNetSSD v2 has intermediate speed and accuracy, and YOLO v3 falls somewhere in between [10][6]. The model with the best speed-accuracy trade-off for each group was selected to generate final tagging data for EOL images.

For tagging, only the classes of detected objects are kept; their locations are discarded. Object detection is used to identify plant-pollinator co-occurrence, insect life stage, the presence of mammal, amphibian, reptile, or bird scat and/or footprints, and when a human (or body part, like a hand) is present. These tags will allow users to search for features not already included in image metadata (a post-processing sketch follows the list below).

  • For plant-pollinator co-occurrence, a model pre-trained on Google OpenImages [12] was used. EOL images are run through the model, and predictions for 'Butterfly', 'Insect', 'Beetle', 'Ant', 'Bat (Animal)', 'Bird', 'Bee', or 'Invertebrate' are kept and converted to "pollinator present" during post-processing.
  • For insect life stages, a model pre-trained on Google OpenImages [12] was used. EOL images are run through the model, and predictions for 'Ant', 'Bee', 'Beetle', 'Butterfly', 'Dragonfly', 'Insect', 'Invertebrate', or 'Moths and butterflies' are kept and converted to "adult" during post-processing; predictions for 'Caterpillar', 'Centipede', or 'Worm' are converted to "juvenile".
  • For scat/footprint present, models were custom-trained to detect scat or footprints in EOL images, but never learned despite adjusting augmentation and model hyperparameters over many training sessions. The pipelines and datasets should be revisited in the future with different approaches.
  • For human present, a model pre-trained on Google OpenImages [12] was used. EOL images are run through the model, and predictions for 'Person' or any string containing 'Human' ('Body', 'Eye', 'Head', 'Hand', 'Foot', 'Face', 'Arm', 'Leg', 'Ear', 'Nose', 'Beard') are kept and converted to "human present" during post-processing.
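A minimal post-processing sketch (class lists are taken from the descriptions above; combining the mappings in one function is for illustration only, not the pipelines' exact code):

POLLINATOR_CLASSES = {'Butterfly', 'Insect', 'Beetle', 'Ant', 'Bat (Animal)',
                      'Bird', 'Bee', 'Invertebrate'}
ADULT_CLASSES = {'Ant', 'Bee', 'Beetle', 'Butterfly', 'Dragonfly', 'Insect',
                 'Invertebrate', 'Moths and butterflies'}
JUVENILE_CLASSES = {'Caterpillar', 'Centipede', 'Worm'}

def classes_to_tags(detected_classes):
    """Map raw detection class names to search-friendly EOL image tags."""
    tags = set()
    for name in detected_classes:
        if name in POLLINATOR_CLASSES:
            tags.add('pollinator present')
        if name in ADULT_CLASSES:
            tags.add('adult')
        if name in JUVENILE_CLASSES:
            tags.add('juvenile')
        if name == 'Person' or 'Human' in name:
            tags.add('human present')
    return sorted(tags)

print(classes_to_tags(['Bee', 'Caterpillar']))  # ['adult', 'juvenile', 'pollinator present']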

➡️ 🌱 Click here to get started.

Object detection for image tagging results using the pre-trained plant-pollinator co-occurrence model, displayed in a Google Colab notebook. Image is hosted by Flickr (another flower - insect photo! by thart2009, licensed under CC BY 2.0).

Utils
This folder contains Colab Notebooks and Google Chrome developer console scripts with useful functions for building on existing EOL computer vision pipelines or for developing your own from scratch.

Getting Started

All files in this repository run in Google Colab*. The repository is set up so that each notebook can be run as a standalone script; it is not necessary to clone the entire repository. Instead, you can navigate to the project sections (i.e., GitHub folders) that interest you and try the notebooks directly. All needed files and directories are set up within each notebook.

For additional details on steps below, see the project wiki.

New to Google Colab?
*Google Colaboratory is "a free cloud service, based on Jupyter Notebooks for machine-learning education and research." Notebooks run entirely on VMs in the cloud and link to your Google Drive for accessing files. This means no local software or library installs are required. If running locally with a GPU, several software packages (~10 GB) must be installed first, and a few workarounds are required on Windows. Working in the cloud eliminates these problems and makes it easier to collaborate when users are on different operating systems. If you prefer to use your local machine for object detection, refer to the Tensorflow Object Detection API Tutorial.

References

[1] Ren et al. 2017. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence.
[2] Hui 2018. Object detection: speed and accuracy comparison (Faster R-CNN, R-FCN, SSD, FPN, RetinaNet and YOLOv3). Medium. 27 March 2018.
[3] Redmon and Farhadi 2018. YOLOv3: An Incremental Improvement. arXiv.
[4] Lin et al. 2015. Microsoft COCO: Common Objects in Context. arXiv.
[5] Sandler et al. 2018. MobileNetV2: Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation. arXiv.
[6] Szegedy et al. 2015. Rethinking the Inception Architecture for Computer Vision. arXiv.
[7] Sharma 2019. Image Classification vs. Object Detection vs. Image Segmentation. Medium. 23 Feb 2020.
[8] Goeau et al. 2016. Plant identification in an open-world (LifeCLEF 2016). CEUR Workshop Proceedings.
[9] AlexeyAB 2020. Darknet. GitHub.
[10] Bochkovskiy et al. 2020. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv.
[11] Liu et al. 2016. SSD: Single Shot MultiBox Detector. Lecture Notes in Computer Science.
[12] Krasin et al. 2017. Open Images: A public dataset for large-scale multi-label and multi-class image classification. GitHub.

License

Code
Code in this repository is released under the MIT license. More information is available at the Open Source Initiative.
Images
All images used in this repository and notebooks contained therein are licensed under Creative Commons. EOL content is freely available to the public. More information about re-use of content hosted by EOL is available at EOL Terms of Use and EOL API Terms of Use. Specific attribution information for EOL images used for training and testing models is available in bundle URLs containing "breakdown_download" found within notebooks.


Issues

Objdet for image cropping with EOL saved model doesn't work after TF update


Contact Details

[email protected]

What happened?

When I use an EOL saved model with the latest cloned version of the Tensorflow Object Detection API, training of object detection for image cropping models doesn't work. The code still runs properly with the Tensorflow Object Detection API repository cloned in May 2021 using TF 2.8.

How did you try to fix it?

The training session does not output any error messages, but I confirmed that the frozen inference graph is loading correctly with the proper layers using the code shown below, modified from Frozen Graph Tensorflow.

import tensorflow as tf  # added for completeness; uses TF2 with compat.v1 APIs

# Load the frozen inference graph from the fine-tuned model directory
PATH_TO_CKPT = 'tf_models/train_demo/rcnn/finetuned_model' + '/frozen_inference_graph.pb'
print("\nLoading trained model from: \n", PATH_TO_CKPT)
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)

def _imports_graph_def():
    tf.compat.v1.import_graph_def(od_graph_def, name='')

# Wrap the graph import so the TF1-style graph is usable from TF2
wrapped_import = tf.compat.v1.wrap_function(_imports_graph_def, [])
import_graph = wrapped_import.graph

# Print layers of saved graph
print("-" * 50)
print("Frozen model layers: ")
layers = [op.name for op in import_graph.get_operations()]
for layer in layers:
    print(layer)
print("-" * 50)

After creating a concrete function from the frozen graph following Tensorflow's TF2 Migration Guide, no errors are thrown, but the model outputs lose their original attribute names and do not appear to contain values.

import time  # added for completeness

def run_detector_tf(image_url):
    # url_to_image() is a helper defined earlier in the notebook
    image_np, im_h, im_w = url_to_image(image_url)
    # Look up the input and output tensors by name in the imported graph
    image_tensor = import_graph.get_tensor_by_name('image_tensor:0')
    detection_boxes = import_graph.get_tensor_by_name('detection_boxes:0')
    detection_scores = import_graph.get_tensor_by_name('detection_scores:0')
    detection_classes = import_graph.get_tensor_by_name('detection_classes:0')
    num_detections = import_graph.get_tensor_by_name('num_detections:0')

    # Create a concrete function by pruning the wrap_function (similar to sess.run);
    # no tf.compat.v1.Session is needed since the pruned function is called directly
    sess_run = wrapped_import.prune(feeds=image_tensor,
                                    fetches=[detection_boxes, detection_scores,
                                             detection_classes, num_detections])
    # Actual detection
    start_time = time.time()
    result = sess_run(tf.constant(image_np))
    end_time = time.time()
    result = {"detection_boxes": result[0], "detection_scores": result[1],
              "detection_classes": result[2], "num_detections": result[3]}
    return result
Which type of our computer vision pipelines were you using?

Object detection for image cropping

Which specific task from our computer vision pipelines were you using?

Chiroptera

Filename

chiroptera_generate_crops_tf2.ipynb

Version

Version 2.8

What browsers are you seeing the problem on?

Chrome

Relevant log output

Example output for result["detection_boxes"]:
Tensor("StatefulPartitionedCall:0", dtype=float32)

Code of Conduct

  • I agree to follow this project's Code of Conduct

Training using model_main_tf2.py no longer works in TF2.8+

Contact Details

[email protected]

What happened?

When running the code to train the model using model_main_tf2.py, the model does not train. This used to work in Tensorflow 2.8, but the error started after the backend upgrade to Tensorflow 2.9. After installing tf_slim, another error about tf.python.keras comes up. Even after downgrading to TF 2.8 the error persists, so I believe a dependency version is causing it.

#@title Train the model
# Note: You can change the number of epochs in code block above, then re-run to train longer
# Modified from https://github.com/RomRoc/objdet_train_tensorflow_colab/blob/master/objdet_custom_tf_colab.ipynb
matplotlib.use('Agg')
%cd $cwd

!python tf_models/models/research/object_detection/model_main_tf2.py \
    --alsologtostderr \
    --num_train_steps=$num_train_steps \
    --num_eval_steps=$num_eval_steps \
    --pipeline_config_path=$pipeline_config_path \
    --model_dir=$model_dir 

How did you try to fix it?

I tried the fixes here, here, and here, but none of them worked. I also tried downgrading to TF 2.8, manually installing tf_slim, and manually editing resnet_v1.py (tf.python.keras.applications -> tf.keras.applications). Every time I upgrade or downgrade the script that causes the problem, another traceback error appears.

Which type of our computer vision pipelines were you using?

Object detection for image cropping

Which specific task from our computer vision pipelines were you using?

Chiroptera

Filename

chiroptera_train_tf2_ssd_rcnn.ipynb

Version

Version 2.8 and Version 2.9

What browsers are you seeing the problem on?

Chrome

Relevant log output

/content/drive/MyDrive/train/tf2/nov22test
Traceback (most recent call last):
  File "tf_models/models/research/object_detection/model_main_tf2.py", line 35, in <module>
    from object_detection import model_lib_v2
  File "tf_models/models/research/object_detection/model_lib_v2.py", line 29, in <module>
    from object_detection import eval_util
  File "tf_models/models/research/object_detection/eval_util.py", line 29, in <module>
    import tf_slim as slim
ModuleNotFoundError: No module named 'tf_slim'

Code of Conduct

  • I agree to follow this project's Code of Conduct

[Request]: Auto-restart runtime to update pip installed versions

Contact Details

[email protected]

Is your feature request related to a problem? Please describe.

I'm always frustrated when I have to wait for the Installs & Import code blocks to run, then have to restart the runtime to use the most recently installed package versions and wait for the same code blocks to run a second time before proceeding with the rest of the notebook.

Describe the solution you would like.

I would like the runtime to automatically restart after the first round of package installs. This way, I can click "run" and walk away for a few minutes before running cells the second time and proceeding with the rest of the notebook.

Describe alternatives you have considered.

I tried to automatically restart the Colab runtime using a try/except solution proposed here, shown in the excerpt below.

# Install requirements.txt
try:
    !pip3 -q install -r requirements.txt
# Note: ResolutionImpossible is pip-internal and never imported here, and
# "!pip3" runs in a subshell, so a failed install never raises a Python exception
except (ResolutionImpossible, ImportError, KeyError, ModuleNotFoundError):
    print('\n\n\n~~~\nStopping RUNTIME for a manual Colab restart. Run all code blocks again to use newly installed package versions.')
    os.kill(os.getpid(), 9)

However, the error thrown by pip is not caught by the try/except block and still prints out, as shown below.

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. 
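A possible workaround (a hedged sketch, not tested in the EOL notebooks): run pip through subprocess so its exit code and error output can be checked directly, then kill the runtime only when the install actually failed.

import os
import subprocess
import sys

# Run pip via subprocess so failures are visible as a return code / stderr text,
# unlike "!pip", whose errors never reach Python's exception machinery
proc = subprocess.run(
    [sys.executable, "-m", "pip", "install", "-q", "-r", "requirements.txt"],
    capture_output=True, text=True)

if proc.returncode != 0 or "ERROR:" in proc.stderr:
    print("Restarting runtime so newly installed package versions take effect. "
          "Run all code blocks again after the runtime reconnects.")
    os.kill(os.getpid(), 9)  # Colab reconnects with a fresh kernel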

Which type of our computer vision pipelines are you using?

Object detection for image tagging

Which specific task from our computer vision pipelines are you using?

Plant pollinator, Insect life stages, Human present, Flower fruit

Filename

plant_poll_generate_tags_yolov3.ipynb

Version

YOLO v3 in Darknet

What browsers are you using?

Chrome

Any log output or other content that may be useful for developing the requested feature.

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
