Computational pipeline for Culturomics by Automated Microbiome Imaging and Isolation (CAMII)

This repo is based on the original CAMII pipeline and is where future development will take place. As of version 1.1, no major functionality has been added, but speed is improved more than 10x through significant code refactoring.

Future plans include incorporation of hyperspectral images for diversity-optimized picking, image-to-taxonomy models, and end-to-end segmentation and classification.

Dependencies

The program runs on Python 3 and is tested with 3.10. The third-party packages are:

  • NumPy, SciPy, pandas, Polars, scikit-learn, scikit-image, scikit-bio, Matplotlib, seaborn
  • opencv-python, Pillow
  • python-tsp
  • PyYAML, tqdm
  • rich

All packages can be installed with pip:

pip3 install numpy scipy pandas polars scikit-learn scikit-image scikit-bio matplotlib seaborn opencv-python pillow python-tsp pyyaml tqdm

Pipeline description

Step 0: Prepare your data

Images

In the current setup, the camera in our robot system takes images of rectangular culture plates under two light conditions: red light from the bottom and white light along the upper edge of the plate. Images are output in .bmp format. Put these image pairs in one directory.

Make sure that:

  • File names end with .bmp and the string before the first _ is the plate name or barcode.
  • There are exactly two images for each plate, and the image under red light comes before the image under white light when sorted by file name. This is likely already the case, since this is the order in which the robot takes pictures.

To proceed, convert the .bmp images into .png format with

./data_transform.py process_bmp -i <input_dir> -o <output_dir>

Images in the output directory will come in groups of 3:

  • <barcode>_gs_red.png, the picture taken with red light in grayscale.
  • <barcode>_rgb_red.png, the picture taken with red light.
  • <barcode>_rgb_white.png, the picture taken with white light.

Only the red light images are converted to grayscale, and these grayscale images are what we use for colony detection.
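The conversion step can be sketched roughly as follows with Pillow (the function name and argument layout here are illustrative, not the actual signature in data_transform.py):

```python
from PIL import Image

def convert_pair(red_bmp: str, white_bmp: str, barcode: str, out_dir: str) -> None:
    """Convert a red/white BMP pair into the three PNGs the pipeline expects."""
    red = Image.open(red_bmp)
    white = Image.open(white_bmp)
    red.save(f"{out_dir}/{barcode}_rgb_red.png")
    white.save(f"{out_dir}/{barcode}_rgb_white.png")
    # Only the red-light image also gets a grayscale ("L" mode) copy,
    # which is what colony detection runs on.
    red.convert("L").save(f"{out_dir}/{barcode}_gs_red.png")
```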

Plate metadata

Prepare a csv file with these columns for each plate:

barcode group num_picks_group num_picks_plate

This is useful for picking a given number of colonies from a group of plates while limiting the number of colonies picked from each individual plate.
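A minimal example of such a metadata file (barcodes, group names, and counts here are made up for illustration):

```csv
barcode,group,num_picks_group,num_picks_plate
PLATE001,soil,96,40
PLATE002,soil,96,40
PLATE003,gut,48,30
```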

Config file

A .yaml file specifying arguments for the pipeline.

Calibration parameter

Based on a reference panel of CAMII pictures, we calculate calibration parameters to account for non-uniform illumination and other artifacts. In the current implementation, we simply divide the average value of each pixel by the average over the entire image. Input images in the subsequent steps are then divided by these calibration parameters, so that pixels that typically have extreme values are brought closer to the mean, i.e., the background is removed.
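The calculation described above can be sketched in a few lines of NumPy (function names here are illustrative, not the script's actual internals):

```python
import numpy as np

def calc_calibration(images: np.ndarray) -> np.ndarray:
    """Per-pixel calibration map from a stack of reference images.

    images: shape (n, height, width). The map is the mean image divided
    by its overall mean, so values > 1 mark pixels that are typically
    brighter than average.
    """
    mean_image = images.mean(axis=0)
    return mean_image / mean_image.mean()

def apply_calibration(image: np.ndarray, calib: np.ndarray) -> np.ndarray:
    # Dividing by the calibration map flattens systematic illumination,
    # pulling typically-extreme pixels back toward the global mean.
    return image / calib
```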

./calc_calib_params.py -i <input_dir_with_reference_bmp_pairs> -o <output_dir> -c <config_file>

Pre-computed calibration parameters are provided in this repo at ./test_data/parameters/calib_parameter.npz.

Picking coordinate correction parameter

The robot might not pick exactly where we want, due to lens distortion and other systematic errors. We can fit a model (linear in its parameters) to correct for this. In the current implementation the corrected picking coordinates (x' and y') are given by:

  • x' = x - (ax2 * x^2 + ax1 * x + axy * y + bx)
  • y' = y - (ay2 * y^2 + ay1 * y + ayx * x + by)

where x and y on the right-hand side are the picking coordinates output by the previous step, and ax2, ax1, axy, bx, ay2, ay1, ayx, by are fitted parameters.

Fitting such a model would require actually experimenting with the robot, but a .json file with pre-computed model parameters is provided in ./test_data/parameters/correction_params.json.
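Applying the fitted correction to a pair of picking coordinates can be sketched as follows (the parameter key names are assumptions; match them to the keys actually used in correction_params.json):

```python
def correct_coords(x: float, y: float, p: dict) -> tuple:
    """Apply the quadratic picking-coordinate correction.

    p holds the fitted parameters ax2, ax1, axy, bx, ay2, ay1, ayx, by.
    """
    # Subtract the fitted systematic error from each raw coordinate.
    x_corr = x - (p["ax2"] * x**2 + p["ax1"] * x + p["axy"] * y + p["bx"])
    y_corr = y - (p["ay2"] * y**2 + p["ay1"] * y + p["ayx"] * x + p["by"])
    return x_corr, y_corr
```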

Step 1: Colony detection

Microbial colonies are detected by the canonical pipeline of image processing: background subtraction, thresholding, contour detection, and contour filtering.

./detect_colonies.py -i <input_dir_with_png_pairs> -o <output_dir> -b <calibration_parameter_npz> -c <config_file>

When the input path is a .png image, colony detection is performed for this single image, and these outputs will be generated in the output directory:

  • <barcode>_annot.json, colony segmentation in COCO format.
  • <barcode>_image_contour.jpg, segmentation contours overlaid on the white light image.
  • <barcode>_image_gray_contour.jpg, segmentation contours overlaid on the red light image in grayscale.
  • <barcode>_metadata.csv, metadata for each contour (i.e., putative colony).

When the input path is a directory, colony detection is performed for all .png images in the directory, and the same set of output files is generated for each image.

Step 2: Colony selection (initial stage)

A subset of all detected colonies will be selected for picking, under the constraints set in the plate metadata. We start by selecting num_picks_group colonies (from the metadata) from each group using a farthest point algorithm. This algorithm randomly chooses num_picks_group colonies and iteratively refines this set until convergence, replacing a colony in the current set with the colony farthest away from the set. During replacement, the number of colonies selected from each plate is tracked so that it does not exceed the num_picks_plate limit (from the metadata) for each plate.
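The core idea can be sketched as a greedy farthest-point sampler over colony feature vectors (this simplified version omits the per-plate num_picks_plate bookkeeping and the iterative refinement):

```python
import numpy as np

def farthest_point_selection(points: np.ndarray, k: int, rng=None) -> list:
    """Greedy farthest-point sampling: seed with a random point, then
    repeatedly add the point whose distance to the selected set is largest."""
    rng = np.random.default_rng(rng)
    selected = [int(rng.integers(len(points)))]
    # dist[i] tracks point i's distance to its nearest selected point.
    dist = np.linalg.norm(points - points[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(dist.argmax())
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return selected
```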

./select_colonies.py init -p <directory_with_png_images> -i <directory_with_segmentations> -o <output_dir> -m <path_to_metadata> -c <path_to_config>

In the output directory, these outputs will be generated for each plate:

  • <barcode>_annot_init.json, colony segmentation (after initial selection) in COCO format.
  • <barcode>_metadata_init.json, metadata for each selected contour (i.e., putative colony).
  • <barcode>_gray_contour_init.jpg, segmentation contours (after initial selection) overlaid on the red light image in grayscale.

Step 3: Manual inspection

In this step you manually remove unwanted colonies selected in the previous step. I suggest the quick online tool makesense.ai or Darwin V7 Lab, since both let you import and export COCO annotations.

After manual tweaking, export the segmentation in COCO format to the same output directory and name it <barcode>_annot_init_post.json.

Step 4: Colony selection (post stage)

After a few colonies are labeled as bad, the constraints set in the metadata are no longer satisfied. In this step we run a simpler farthest point algorithm to replace the removed colonies.

./select_colonies.py post -p <directory_with_png_images> -i <input_dir> -m <path_to_metadata> -s init

In the output directory, these outputs will be generated for each plate:

  • <barcode>_annot_final.json, colony segmentation (after post selection) in COCO format.
  • <barcode>_metadata_final.json, metadata for each selected contour (i.e., putative colony).
  • <barcode>_gray_contour_final.jpg, segmentation contours (after post selection) overlaid on the red light image in grayscale.

Step 5: Go back and forth

If needed, you can return to the annotation tool to exclude more bad colonies. If you do, store the modified segmentation annotation as <barcode>_annot_final_post.json in the output directory and run the post selection step again (but make sure to specify -s final; the output from the last step will be overwritten).

Step 6: Finalize colony selection

Once you are satisfied with the colony selection on each plate, finalize the selection by generating a few visualizations and solving a Travelling Salesman Problem (TSP) to find the pick order that minimizes robot movement.

./select_colonies.py final -p <directory_with_png_images> -i <input_dir_with_results_from_last_step> -o <output_dir> -m <path_to_metadata> -t [heuristic|exact]

In the output directory, these outputs will be generated for each plate:

  • <barcode>_gray_contour.jpg, segmentation contours overlaid on the red light image in grayscale.
  • <barcode>_metadata.json, metadata for each selected contour (i.e., putative colony).
  • <barcode>_picking.json, picking coordinates of selected colonies in CSV format; the first column is the x coordinate and the second column is the y coordinate.
  • <barcode>_rgb_red_contour.jpg, segmentation contours overlaid on the red light image.
  • <barcode>_rgb_white_contour.jpg, segmentation contours overlaid on the white light image.

Just note that exact TSP optimization might take prohibitively long if a plate has hundreds of colonies.

Step 7 (final step, well done): Coordinate correction

./correct_coords.py -i <input_dir_with_picking_json_from_last_step> -p <correction_parameter_json>

In the directory, we correct the coordinates in all *_picking.json files and store the corrected coordinates as <barcode>_Coordinates.csv, named this way to meet the requirements of the colony picking robot.

Contributors

whatever60
