palaimon / ipyannotator

the infinitely hackable annotation framework

Home Page: https://palaimon.github.io/ipyannotator

License: Apache License 2.0

Languages: Dockerfile 0.54%, Makefile 0.06%, Python 28.11%, Jupyter Notebook 70.87%, Shell 0.30%, HTML 0.03%, Batchfile 0.09%
Topics: annotation-tool, annotations, hacktoberfest, ipycanvas, ipywidgets, labeling, labeling-tool, nbdev, voila

ipyannotator's People

Contributors

alexjoz, dependabot[bot], ibayer, itepifanio


ipyannotator's Issues

[JOSS Review] Add automated tests in CI

For the JOSS documentation check of openjournals/joss-reviews#4480

Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?

There are no automated tests (there are GitHub Actions workflows for building and deploying the docs, but not for running the tests across all supported versions of Python). It seems that CI with some basic coverage of the API would greatly improve things.

For v0.8.3 there are no instructions in the README on how to run the tests (they were added for a future version in PR #34). It is also unclear what test coverage running the tests in the notebooks provides, which would be important to know.

Improve image labeling for large number of classes

Motivation

ipyannotator currently supports image labeling. However, for data sets with a very large number of classes it is very difficult to quickly match the image to the right class.

Showing a visual representation of all possible classes and their textual descriptions right next to the image could considerably improve the process. Currently only a textual or a visual representation can be displayed.

Explore the current difficulties

  • run the notebook nbs/01b_tutorial_image_classification.ipynb with the data set dataset = 'oxford_flowers'


possible improvements:

  • make it easy to show the class name instead of the number (requires a mapping from class id to class name); see the sketch below
  • show both the visual and the textual description
  • if the data set is already annotated, provide an option to take the visual example directly from the data set
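A minimal sketch of how such a combined visual plus textual option could look with plain ipywidgets; the class-name mapping, file layout, and widget composition below are illustrative assumptions, not existing ipyannotator code:

from pathlib import Path
import ipywidgets as widgets

# hypothetical mapping from numeric class id to a human-readable name
CLASS_NAMES = {0: 'pink primrose', 1: 'hard-leaved pocket orchid', 2: 'canterbury bells'}

def class_option(class_id, example_dir):
    # combine a visual example and the textual class name in one widget
    image_path = Path(example_dir) / f'{class_id}.jpg'  # hypothetical file layout
    image = widgets.Image(value=image_path.read_bytes(), width=60, height=60)
    label = widgets.Label(value=CLASS_NAMES.get(class_id, str(class_id)))
    return widgets.HBox([image, label])

# options = widgets.VBox([class_option(i, 'class_examples') for i in CLASS_NAMES])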

Add support for multiple bbox per image

Motivation

ipyannotator currently supports single object annotation with bounding boxes (bbox). However, an image often contains more than one object of interest that should be annotated.

Develop and Integrate Multi BBox Widget

Suggested steps:

  • explore 01c_tutorial_bbox.ipynb to see the current bbox implementation in action
  • extend 01_bbox_canvas.ipynb with a new class MultiBBoxCanvas which supports displaying and drawing of multiple bboxes (see the sketch after this list)
  • duplicate 04_bbox_annotator.ipynb as 04b_multi_bbox_annotator.ipynb and replace the BBoxCanvas with MultiBBoxCanvas
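A rough sketch, not existing ipyannotator code, of how the state of such a MultiBBoxCanvas could be held, assuming the same ipycanvas/traitlets stack that BBoxCanvas uses:

from ipycanvas import Canvas
from ipywidgets import HBox
from traitlets import HasTraits, List

class MultiBBoxCanvas(HBox, HasTraits):
    # list of (x, y, width, height) tuples instead of a single box
    bboxes = List()

    def __init__(self, width=400, height=300):
        self._canvas = Canvas(width=width, height=height)
        super().__init__([self._canvas])

    def redraw(self):
        # clear and stroke every stored box
        self._canvas.clear()
        for x, y, w, h in self.bboxes:
            self._canvas.stroke_rect(x, y, w, h)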

Cleanlab and Ipyannotator integration

Cleanlab is a tool that can detect label errors in datasets. The label error tutorial shows how the tool can be used to detect errors in Cifar-10 (and other datasets).

Create a new notebook that runs Cleanlab on the Cifar-10 dataset and uses Ipyannotator to visualize all label errors detected by Cleanlab.

This Ipyannotator tutorial may help you with the Cifar-10 integration.
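A rough sketch of what the detection step could look like, assuming the cleanlab 2.x API and that out-of-sample predicted probabilities for Cifar-10 have already been computed and saved (the .npy file names are placeholders). The resulting indices could then drive which images Ipyannotator shows for re-labeling.

import numpy as np
from cleanlab.filter import find_label_issues

labels = np.load('cifar10_labels.npy')          # given labels, shape (n,)
pred_probs = np.load('cifar10_pred_probs.npy')  # out-of-sample predicted probabilities, shape (n, 10)

# indices of samples whose given label is likely wrong, worst first
issue_idx = find_label_issues(labels=labels, pred_probs=pred_probs,
                              return_indices_ranked_by='self_confidence')
print(f'{len(issue_idx)} potential label errors to review in Ipyannotator')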

Improve Ipyannotator API error message

Ipyannotator has an API to use its previously defined annotators. The API uses an input/output pair, and when this pair is not correctly configured the API should throw a friendly exception for the user.

Right now, when a pair is not correctly configured, the API prints a friendly message (Pair (Annotator Input type: CustomInput, Annotator Output type: NoOutput) is not supported!) but then also throws an unrelated exception, AttributeError: 'NoneType' object has no attribute 'get_annotator'. This behavior can be reproduced with the following code:

from ipyannotator.mltypes import Input, Output
from ipyannotator.annotator import Annotator

class CustomInput(Input):
    pass

custom_input = CustomInput()
annotator = Annotator(custom_input)
annotator.explore()

The expected behavior is:

  • Ipyannotator throws a friendly custom exception (e.g. PairUnsupported); a sketch is shown after this list
  • the AttributeError is not thrown
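One possible shape for the fix; the class and function names below are suggestions, not Ipyannotator's actual internals. The idea is to look up the pair first and raise a dedicated exception instead of calling get_annotator on None:

class PairUnsupported(Exception):
    def __init__(self, input_, output_):
        super().__init__(
            f'Pair (Annotator Input type: {type(input_).__name__}, '
            f'Annotator Output type: {type(output_).__name__}) is not supported!')

def resolve_factory(registry, input_, output_):
    # registry maps (input type, output type) pairs to annotator factories
    factory = registry.get((type(input_), type(output_)))
    if factory is None:
        raise PairUnsupported(input_, output_)
    return factory.get_annotator()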

Settings API usage

Ipyannotator uses an API that contains a pair of Input/Output and a Settings class. The Settings contains parameters that are used for all annotators, but some of them are redundant.

This task tries to reduce the Settings class to three parameters:

from pathlib import Path
from typing import Optional

class Settings:
    project_path: Path = Path('user_project')
    project_file: Optional[Path] = None
    result_dir: Optional[str] = None

The remaining parameters should be stored in the Input/Output classes. This will avoid the following redundant code (from 01b_tutorial_image_classification.ipynb):

settings_ = get_settings(dataset)
settings_.project_file, settings_.image_dir
input_ = InputImage(image_dir=settings_.image_dir,
                    image_width=settings_.im_width,
                    image_height=settings_.im_height)

output_ = OutputImageLabel(label_dir=settings_.label_dir,
                           label_width=settings_.label_width,
                           label_height=settings_.label_height)

To do this, the get_settings function needs to be refactored. A suggestion is to rename get_settings to get_api and return a tuple of input, output, and settings.
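A sketch of what the proposed get_api could return; the import path and the concrete directory names and sizes are assumptions based on the tutorial snippet above, not a finished design:

from pathlib import Path
from typing import Tuple

from ipyannotator.mltypes import InputImage, OutputImageLabel  # assumed import path

def get_api(dataset: str) -> Tuple[InputImage, OutputImageLabel, 'Settings']:
    # Input/Output now carry the image and label parameters themselves;
    # Settings is the reduced three-parameter class proposed above
    input_ = InputImage(image_dir='images', image_width=50, image_height=50)
    output_ = OutputImageLabel(label_dir='class_images', label_width=30, label_height=30)
    settings = Settings()
    settings.project_path = Path(dataset)
    return input_, output_, settings

# input_, output_, settings_ = get_api('oxford_flowers')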

[JOSS Review] Python version support and clarification of framework vs. application

For the JOSS functionality check of openjournals/joss-reviews#4480

Installation: Does installation proceed as outlined in the documentation?

Yes, but not in a satisfactory manner. The instructions are to run

$ pip install ipyannotator

however, the unversioned documentation (so one needs to use the README) and setup.py via settings.ini claim that ipyannotator supports Python 3.7+

min_python = 3.7

but that contradicts the current metadata on PyPI, which states that Python 3.8+ is required. If you use a Python 3.7 runtime to install ipyannotator you get the old v0.4.0 release, which is not a version that is part of the review.

Additionally, because of the way that Poetry locks things down, the dependencies that are listed in the documentation are severely out of date/inaccurate. Instead of the 4 listed dependencies, ipyannotator actually incurs 15 dependencies. The additional constraints that are imposed with Poetry's syntax mean that the claim that Python 3.8+ is supported is technically true, but there are multiple instances in which wheels are not available for modern CPython. For example, for Python 3.10 the Poetry constraint on scikit-image of

scikit-image = "^0.18.3"

translates to scikit-image<0.19.0,>=0.18.3 and as can be seen from PyPI and

$ python -m pip index versions scikit-image
WARNING: pip index is currently an experimental command. It may be removed/changed in a future release without prior warning.
scikit-image (0.19.3)
Available versions: 0.19.3, 0.19.2, 0.19.1, 0.19.0, 0.18.3, 0.18.2, 0.18.1, 0.18.0, 0.17.2, 0.17.1, 0.16.2, 0.15.0, 0.14.5, 0.14.3, 0.14.2, 0.14.1, 0.14.0, 0.13.1, 0.13.0, 0.12.3, 0.12.2, 0.12.1, 0.12.0, 0.11.3, 0.11.2, 0.10.1, 0.10.0, 0.9.3, 0.9.1, 0.9.0, 0.8.2, 0.8.1, 0.8.0, 0.7.2
   INSTALLED: 0.18.3
   LATEST:    0.19.3

that leaves exactly one viable version, scikit-image v0.18.3, for which there is no Python 3.10 wheel available, so the wheel has to be built from the sdist.

So while it is possible to install ipyannotator on Python 3.10, it takes quite some time and results in a large collection of dependencies

$ python -m pip freeze | wc -l
105

that are extremely constrained. This is of course technically fine, but the lines between what is a "framework" and what is an "application" start to blur quite a bit: if ipyannotator is meant to be used alongside other software in an environment, it is far too restricting.

The documentation around the installation should try to clarify this, or make it clear that the versions of Python the authors intend the software to be used with are more constrained than what is currently shown.

Create a widget that supports drawing of polygons

Motivation

ipyannotator currently supports single object annotation with bounding boxes (bbox). However, a bounding box often provides only a very coarse localization of an instance. For tasks such as instance segmentation, a sequence of (x, y) points is used to define a polygon per object.

Polygon Annotation Widget

As a step towards supporting polygon annotations we need a widget that can display and accept user input for polygons. Since polygon annotation can be seen as a generalization of bbox annotation, a natural first step would be to use the bbox canvas notebook 01_bbox_canvas.ipynb as a blueprint for a polygon widget.

Suggested steps:

  • Write code draw_polygon(...) to render a predefined polygon on an image (see the sketch after this list).
  • Create a class PolygoneCanvas(HBox, traitlets.HasTraits) that can capture user input to define a polygon.
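A possible starting point for draw_polygon, using only the standard ipycanvas path API; the signature is a suggestion, not existing ipyannotator code:

from ipycanvas import Canvas

def draw_polygon(canvas, points, color='red', line_width=2):
    # render a closed polygon given as a list of (x, y) points
    if len(points) < 2:
        return
    canvas.stroke_style = color
    canvas.line_width = line_width
    canvas.begin_path()
    canvas.move_to(*points[0])
    for x, y in points[1:]:
        canvas.line_to(x, y)
    canvas.close_path()
    canvas.stroke()

# canvas = Canvas(width=300, height=200)
# draw_polygon(canvas, [(50, 40), (220, 60), (180, 160), (70, 140)])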

[JOSS Review] Suggested revisions on code quality

For suggested revisions for openjournals/joss-reviews#4480:

In the code there are multiple instances of things like

from nbdev import *
$ git grep 'import \*'
dev_notes.md:from nbdev.export import *
nbs/00d_doc_utils.ipynb:    "from nbdev import *"
nbs/01_bbox_canvas.ipynb:    "from nbdev import *"
nbs/01_helpers.ipynb:    "from nbdev import *\n",
nbs/01a_datasets.ipynb:    "from nbdev import *"
nbs/01b_dataset_video.ipynb:    "from nbdev import *"
nbs/01b_tutorial_image_classification.ipynb:    "from nbdev import *"
nbs/01c_tutorial_bbox.ipynb:    "from nbdev import *"
nbs/01d_tutorial_video_annotator.ipynb:    "from nbdev import *"
nbs/02_navi_widget.ipynb:    "from nbdev import *\n",
nbs/02a_right_menu_widget.ipynb:    "from nbdev import *"
nbs/02b_grid_menu.ipynb:    "from nbdev import *"
nbs/03_storage.ipynb:    "from nbdev import *\n",
nbs/04_bbox_annotator.ipynb:    "from nbdev import *"
nbs/05_image_button.ipynb:    "from nbdev import *"
nbs/06_capture_annotator.ipynb:    "from nbdev import *"
nbs/07_im2im_annotator.ipynb:    "from nbdev import *"
nbs/13_datasets_legacy.ipynb:    "from nbdev import *\n",
nbs/15_coordinates_input.ipynb:    "from nbdev import *"
nbs/16_custom_buttons.ipynb:    "from nbdev import *"
nbs/18_bbox_trajectory.ipynb:    "from nbdev import *\n",
nbs/19_bbox_video_annotator.ipynb:    "from nbdev import *"
scripts/check_lint.sh:# F403 -> 'from module import *' used; unable to detect undefined names
scripts/check_lint.sh:# Cause : "from nbdev import *"

The use of import * should be avoided at all costs in Python code unless there is an extremely good reason to use it. For anything close to a library I do not think there are any.
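For example, a star import in a notebook could be replaced with an explicit import of the names actually used (assuming, for illustration, that only show_doc is needed in that notebook):

# import only the names the notebook actually uses, rather than 'from nbdev import *'
from nbdev.showdoc import show_doc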

Use point set to shape mapping to improve polygon annotation widget

Motivation

The polygon annotation widget nbs/01d_polygon_canvas.ipynb #5 (not yet included in the public GitHub release) supports only sequential polygon annotation. Adding additional boundary points after creating an initial polygon depends on the creation order of the boundary points, which is not very intuitive and also slow to execute.

Defining the polygon by a set of points instead of a list would make adding and removing of points much simpler.

Alpha-Concave Hull

In computational geometry, an alpha shape, or α-shape, is a family of piecewise linear simple curves in the Euclidean plane associated with the shape of a finite set of points. They were first defined by Edelsbrunner, Kirkpatrick & Seidel (1983). The alpha-shape associated with a set of points is a generalization of the concept of the convex hull, i.e. every convex hull is an alpha-shape but not every alpha shape is a convex hull.
source: https://en.wikipedia.org/wiki/Alpha_shape

Alpha-Concave Hull [0] is one such algorithm that could be of interest to us.

Suggested steps:

  • literature review to find a simple algorithm to associate a shape with a set of points (a minimal convex hull baseline is sketched after this list)
  • proof-of-concept implementation of the algorithm in 01d_polygon_canvas.ipynb
  • create a minimal sequence diagram (https://plantuml.com/sequence-diagram) to specify the expected user interaction for creating and deleting points
  • refactor the current polygon annotation widget to support the new algorithm
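As a minimal baseline for the "set of points to shape" mapping, the convex hull (the simplest alpha-shape) already gives an order-independent boundary. A sketch using scipy, with the alpha-concave hull to be swapped in later:

import numpy as np
from scipy.spatial import ConvexHull

def points_to_polygon(points):
    # order an unordered set of (x, y) points into a polygon boundary
    pts = np.asarray(points)
    hull = ConvexHull(pts)
    return pts[hull.vertices]  # hull vertices are returned in counter-clockwise order

# the interior point (1, 1) is dropped, the remaining points are ordered into a boundary
polygon = points_to_polygon([(0, 0), (4, 1), (2, 3), (1, 1), (3, 4)])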

[0] Alpha-Concave Hull, a Generalization of Convex Hull

Note: This is an internal issue until the polygon annotation widget #5 is released.

[JOSS Review] Suggested revisions on paper quality

For suggested revisions for openjournals/joss-reviews#4480:

  • The paper is too long and JOSS papers are not meant to describe the API. Much of the section "A simple but flexible API to define annotation tasks" should be moved to the documentation, where it would be quite useful and welcome, though a simplified summary of the explore, create, improve workflow can remain in the paper. "Key Design Decisions" should also be cut and moved to a section of the documentation as well. It would probably be helpful to reread the Submitting a paper to JOSS instructions on what the paper should cover.

  • In the Acknowledgements section you mention

The authors acknowledge the financial support by the Federal Ministry for Digital and Transport of Germany under the program mFUND for the project OS-VAT (project number 19F2160A).

Congratulations on the funding! If possible, can you also include a reference or hyperlink to the project funding page? I'm not sure if there is another page with your actual proposal or if https://www.bmvi.de/SharedDocs/DE/Artikel/DG/mfund-projekte/os-vat.html is the actual government page for it.

Add support for class labeling of bbox

Motivation

ipyannotator currently supports single object localization with a bounding box (bbox). However, annotating the type/class of the object is currently not supported.

Develop and Integrate Class Labeled BBox Widget

Suggested steps:

  • explore 01c_tutorial_bbox.ipynb to see the current bbox implementation in action
  • extend 01_bbox_canvas.ipynb with a new class LabeledBBoxCanvas which supports displaying and drawing of a class-labeled bbox (see the sketch after this list)
  • duplicate 04_bbox_annotator.ipynb as 04b_class_bbox_annotator.ipynb and replace the BBoxCanvas with LabeledBBoxCanvas
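A sketch of the extra state a LabeledBBoxCanvas would need compared to BBoxCanvas, namely the box coordinates plus a class label chosen from a fixed set; all names below are proposals, not existing ipyannotator code:

from dataclasses import dataclass
from ipywidgets import Dropdown

@dataclass
class LabeledBBox:
    x: int
    y: int
    width: int
    height: int
    label: str = ''

# a dropdown of the known classes could feed the label field
class_picker = Dropdown(options=['car', 'pedestrian', 'bicycle'])
box = LabeledBBox(10, 20, 120, 80, label=class_picker.value)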

Make instructions on how to run tests explicit

This issue was created in response to this comment as part of the JOSS paper review.

#31 tries to improve the documentation of the tests.

Unfortunately the instructions remain ambiguous. Instructions on how to run tests should be as unambiguous and explicit as possible. I will explain what I mean by quoting from the improved instructions:


When installing the repository using poetry, all dev dependencies are installed by default.

Instructions on how to install via poetry only come later and include the --no-dev option – meaning that test dependencies are in fact not installed unless one infers that this option must be dropped.

When using pip for installation make sure to install the two dev dependencies pytest and ipytest, with the versions listed in pyproject.toml, manually:

pip install pytest
pip install ipytest

The instructions and the shown commands contradict each other since the latter will always install the latest version of the respective packages. Further, a version for pytest is not listed in the pyproject.toml file.


It is ok to be opinionated about how tests are supposed to be run for your package. If creating the correct environment is easier with poetry, then simply make that the recommended path; you can always provide instructions for alternative paths, which can then be slightly less explicit. Here is my take on the instructions:

To run tests:

  1. Install poetry
  2. Create the test environment with $ poetry install
  3. Run tests by executing $ nbdev_test_nbs
