
Droplet Tracking

Repository for the 2022 Data Science Lab Project. The Prerequisites section describes how to set up the environment; the Basic Usage section describes how to use the software.

Read the report here.

Prerequisites

Software

To run this project, a Python installation with certain packages is required. To that end, we suggest using a conda virtual execution environment generated from the environment.yml file in this directory.

The general process for generating the execution environment from this environment.yml file is to first install conda, then open a command line in this directory and execute

   conda env create -f environment.yml

IMPORTANT: At the moment, the environment.yml file creates an environment named dsl. If you wish for a different name, change it in the environment.yml file. Also, the installation directory, specified by prefix in environment.yml, assumes conda was installed on macOS via Homebrew. If you are on a different operating system, or these words are unfamiliar, delete the prefix field in environment.yml before generating the environment. Depending on the OS and the conda installation, further adjustments may be needed to make the environment generation work.
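For orientation, an environment.yml of the shape described above might look like the following sketch (the package list is illustrative, not the project's actual dependency list; use the real environment.yml in this directory):

```yaml
name: dsl            # change this to pick a different environment name
channels:
  - conda-forge
dependencies:
  - python=3.9       # illustrative; see the real environment.yml for versions
  - numpy
  - pandas
# prefix: /opt/homebrew/Caskroom/miniconda/base/envs/dsl  # example macOS/Homebrew path;
#                                                         # delete this field on other systems
```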

Things needed for using Learned Features

  1. Analysing regular images

    • One should begin by downloading the model weights from https://drive.google.com/drive/folders/17FxnvlEciArXhHGG7Ws3jAbBjqVEsJc4?usp=sharing and storing them in the appropriate folder. The path experiments/0003/000/checkpoint_165000_.pt is the default, but any other location can be chosen as long as it matches the checkpoint parameter in experiments/0003.toml.
    • In this scenario, no training is needed; simply running the code via python main.py is enough. If embeddings have already been created (i.e. a file called embeddings_{image_name}.csv is already present in data/03_features), one can pass the flag -g to skip re-generating the embeddings and speed up the overall process. Running without embeddings is also possible, by passing the flag -ne.
  2. Using on another dataset

    • Here two distinct options are possible. If a training dataset is already present (composed of several droplet images), alter the file experiments/model.toml and place the training and validation dataset paths in train_dataset and val_dataset, respectively. If not, the model will be trained on the generated droplet images. While training, a new checkpoint will be created under experiments/{image_name}. After the code finishes running, to avoid retraining the model when analysing newer images, replace the path in the checkpoint line of experiments/0003.toml with the latest checkpoint. Suppose, for instance, that the image name is smallmovement1. Then there should be a folder experiments/smallmovement1/000/ containing a _____.pt file, and one should replace checkpoint={previous value} with checkpoint=experiments/smallmovement1/000/_____.pt.
    • In order to train a new model, pass the flag -t to python main.py.
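Putting the configuration keys mentioned above together, the relevant lines might look like the following sketch (only the key names checkpoint, train_dataset and val_dataset and the default checkpoint path come from the text above; the dataset paths are illustrative):

```toml
# experiments/model.toml -- training configuration
train_dataset = "data/my_training_droplets"    # illustrative path
val_dataset   = "data/my_validation_droplets"  # illustrative path

# experiments/0003.toml -- which weights to load when analysing images
checkpoint = "experiments/0003/000/checkpoint_165000_.pt"
```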

Entering the execution environment

To enter the conda execution environment generated from the environment.yml file, type

   conda activate <envname>

where <envname> is dsl by default, if you haven't changed it.

Basic Usage

After entering the execution environment, navigate the command line to this directory.

Then, using the file explorer, place the raw <imagename>.nd2 image into data/01_raw/. (This only needs to be done once.)

Next, call

   python3 main.py "<imagename>.nd2"

This will execute the default tracking algorithm. To see the options one can pass to the tracking algorithm, call

   python3 main.py -h

For example,

   python3 main.py "<imagename>.nd2" -ne

will execute the default tracking algorithm, but without deep embedding features.

The tracking results are dropped in data/05_results/results_<imagename>.csv.
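The exact schema of results_<imagename>.csv is not documented here; assuming hypothetical columns droplet_id, frame, x and y, the results could be inspected with Python's standard csv module along these lines (the sample data and column names are made up for illustration):

```python
import csv
import io
from collections import defaultdict

# Hypothetical results-file contents; the real column names may differ.
sample = """droplet_id,frame,x,y
0,0,101.5,200.2
0,1,102.0,201.0
1,0,340.7,88.9
1,1,341.1,89.4
"""

def trajectories(csv_text):
    """Group rows into one (frame, x, y) list per droplet id."""
    tracks = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        tracks[int(row["droplet_id"])].append(
            (int(row["frame"]), float(row["x"]), float(row["y"]))
        )
    return dict(tracks)

tracks = trajectories(sample)
print(len(tracks))     # → 2 tracked droplets
print(tracks[0][0])    # → (0, 101.5, 200.2)
```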

To visualize the results, call

   python3 visualizer.py "<imagename>.nd2" "results_<imagename>.csv"

More information on the visualizer is given below.

Relevant Notes

  1. The droplet tracking algorithm works only with images whose statistics are similar to those of the images supplied to the group. In particular, the images must be reasonably focused and the resolution high enough for details within the droplets to be visible. Additionally, the images must be provided in the .nd2 image format and must only contain data about the different channels across the different frames (just like the images provided to the group).

  2. To account for slight changes in the experiments which cause the droplets to have different diameters, one can adjust the minimum and maximum radius of the droplets to be detected (in pixels) via the options --radius_min and --radius_max. The defaults are 12 and 25 respectively. These bounds should not be very tight, as noise causes fluctuations in the measured radii; we suggest picking bounds with at least 2 to 3 pixels of slack. The default settings work fine for the images given to the group.

  3. There is a practical limit on image size and droplet count. We do not suggest using very large images or images with hundreds of thousands of droplets: these consume excessive memory and disk space and may simply crash the program. Images of the size provided to the group (ca. 4k x 4k pixels and 8k-10k droplets) work, taking roughly 10-20 minutes end to end and a bit more than 8 GB of RAM, but anything bigger becomes problematic, mainly due to memory. Very large images are also impractical for another reason: the output of the algorithm needs to be checked by a human anyway, and nobody can analyze the amount of data produced in those cases (100k trajectories and so on). With small movement, a human is still needed to filter out the roughly 1% of trajectories that are wrong; with large movement, a human must identify which regions of the image contain useful and robust trackings. For images with 100k droplets, this is simply not doable. The group suggests using images with approximately 2k droplets and dimensions of 2k x 2k pixels (about 2 x 2 patches taken by the microscope camera).
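The slack suggestion for the radius bounds in note 2 can be sketched as a tiny helper (hypothetical, not part of the project's code) that turns radii measured by eye into --radius_min/--radius_max values:

```python
def suggest_radius_bounds(observed_radii, slack=3):
    """Add `slack` pixels around the observed radius range, per note 2 above."""
    lo = max(1, min(observed_radii) - slack)
    hi = max(observed_radii) + slack
    return lo, hi

# Droplet radii measured by eye on a few frames (example numbers).
radii = [14, 16, 15, 18, 17]
rmin, rmax = suggest_radius_bounds(radii)
print(f"python3 main.py image.nd2 --radius_min {rmin} --radius_max {rmax}")
# → python3 main.py image.nd2 --radius_min 11 --radius_max 21
```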

Visualization

  1. The visualization tool can be executed by calling visualizer.py (example given below).

  2. The visualizer has some neat features. After executing the visualizer, a small window should pop up, like this:

If you click on the magnifying glass on the bottom left, you can select a region of the image via left click-and-drag and focus on the selected part. The other controls are:

  • Home button: return to the top-most view of the image.
  • Left- and right-arrow buttons (bottom left): go back and forth between previously selected "focus levels".
  • Floppy-disk icon: save a screenshot of the currently displayed view.
  • Four-arrow symbol: pan over the image.
  • f: toggle between fullscreen and normal mode.
  • q: exit the visualizer.
  • 0 to 9: overlay the brightfield image of the corresponding frame. (Note that the first frame is frame 0, not frame 1.)

  3. The visualizer also has some tools with which one can repair faulty trajectories and store a specific selection of droplets. The command line from which the visualizer was executed also shows information on how to use the tools. There are 3 main tools:

  4. Selection tool: If one left-clicks and draws a path/region with the mouse, the visualizer computes which droplets lie inside the drawn region (the first occurrence of each droplet is used) and marks them as droplets to "keep" (more on that later); they will be shown in orange. You can keep adding droplets with this region-selection tool at any time. When you are done selecting droplets, press 'c' to take all the droplets marked as "keep" (the orange ones) and create a new csv file containing those exclusively; all other droplets will be absent from it. The command line shows where this file is stored and what its name is. Typically this file is stored under /data/05_results/results_<imagename>_<date>.csv. Here is an example of how selected droplets appear in orange:
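The "which droplets are inside the drawn region" step can be illustrated with a standard ray-casting point-in-polygon test. This is a sketch of the idea, not the visualizer's actual implementation, and the droplet ids and positions are made up:

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how often a horizontal ray from (x, y) crosses edges."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical drawn region and droplet first-frame positions.
region = [(0, 0), (10, 0), (10, 10), (0, 10)]
droplets = {7: (5, 5), 8: (20, 3)}  # droplet id -> (x, y)
keep = [d for d, (x, y) in droplets.items() if point_in_polygon(x, y, region)]
print(keep)  # → [7]
```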

  5. Swap tool: Sometimes the automatic tracking algorithm gets confused between two nearby droplets and messes up the tracking in one specific frame, while the tracking between all other frames is perfectly fine. Here is an example:

It is easy for a person to see that the tracking algorithm messed up in one single frame: the movement of the droplets is simply a very slow horizontal drift, and the jump of the two trajectories is clearly wrong and should not happen. To allow manual correction of these easy-to-see errors, the visualizer has the "swap tool". It works like this: first press a to activate the tool (a also de-activates it again; check the command line for the current status of the tool). Then select two edges (which must represent droplet movements between the same two frames, e.g. frames 1 and 2) by clicking on them with the mouse. The two selected edges are marked in green. If you are happy with the selection, press 'enter' to confirm the swap, which exchanges the selected edges. After the swap is confirmed, the tool is deactivated automatically and must be re-activated by pressing a again. Here is an example:

Select the edges:

Press enter to confirm swap and repair the trajectories:

Of course, swapping is only allowed if the selected edges "happen at the same time", i.e. represent movement between the same two frames. If one clicks on more than 2 edges, the program simply considers the last two clicked as the selected edges. If one clicks on two edges that are not "at the same time", the program assumes the last selected edge has the correct time and adjusts the other edge to be at the same time.
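Conceptually, confirming a swap exchanges the trajectories' continuations after the selected edges. A minimal sketch, assuming a hypothetical representation of a trajectory as a list of per-frame (x, y) positions (this is not the visualizer's code):

```python
def swap_edges(track_a, track_b, frame):
    """Model the swap tool: exchange the two trajectories from frame + 1 onward,
    so each trajectory continues along the other's tail."""
    cut = frame + 1
    return track_a[:cut] + track_b[cut:], track_b[:cut] + track_a[cut:]

# Hypothetical positions; the wrong "jump" happens on the edge frame 1 -> frame 2.
a = [(0, 0), (1, 0), (9, 9), (10, 9)]
b = [(8, 9), (9, 9.5), (2, 0), (3, 0)]
a_fixed, b_fixed = swap_edges(a, b, frame=1)
print(a_fixed)  # → [(0, 0), (1, 0), (2, 0), (3, 0)]  -- slow horizontal drift restored
```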

  6. Cut tool: Sometimes a tracking is simply wrong, but perhaps only between frames 0 and 1, while for all other frames the tracking is perfect. In such cases it makes sense to cut the link between frames 0 and 1 for that single droplet, while keeping the links between all other frames intact. This way one can still make use of the correct tracking for the remaining frames without having to discard everything. An example of such a case is:

where the long line in the center is not possible due to other droplets being in the way. This is where the cut tool comes into play. The cut tool is activated by pressing w. One can then select an edge by left-clicking, which highlights the edge in red.

By pressing enter one can then confirm the cut, which splits the trajectory at the selected edge into two new, disjoint trajectories.

Afterwards one can use the selection tool described before to select the good trajectory and store it in a table.
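Conceptually, the cut turns one trajectory into two disjoint ones. A minimal sketch, assuming a hypothetical representation of a trajectory as a list of per-frame (x, y) positions (again, not the visualizer's code):

```python
def cut_edge(track, frame):
    """Model the cut tool: split a trajectory at the edge frame -> frame + 1."""
    return track[:frame + 1], track[frame + 1:]

# Hypothetical trajectory whose link between frames 0 and 1 is wrong.
track = [(50, 50), (200, 10), (201, 11), (202, 12)]
head, tail = cut_edge(track, frame=0)
print(head)  # → [(50, 50)]
print(tail)  # → [(200, 10), (201, 11), (202, 12)]
```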

  7. Example of how to execute the visualizer: python3 visualizer.py "smallMovement1.nd2" "results_smallmovement1.csv". Explanation: python3 visualizer.py tells the computer to execute the program visualizer.py with the Python 3 interpreter. "smallMovement1.nd2" tells the program which image to display in the overlay. "results_smallmovement1.csv" tells the program in which file the trackings computed by the algorithm are located. In particular, this means the visualizer can only be executed after one has run main.py and obtained the results from the algorithm. Sometimes the visualizer may glitch and mouse clicks stop registering; in that case, simply click outside of the visualizer window (so that the program goes "out of focus") and then click on it again. It should work after that.

Contributors

francescodadalt, antoinebasseto, filipe-m-cunha, samyakjain2512
