
oarriaga / paz


Hierarchical perception library in Python for pose estimation, object detection, instance segmentation, keypoint estimation, face recognition, etc.

Home Page: https://oarriaga.github.io/paz/

License: MIT License

Language: Python 100.00%

Topics: pose-estimation, object-detection, keypoint-estimation, emotion-recognition, instance-segmentation, face-recognition, semantic-segmentation

paz's Introduction

(PAZ) Perception for Autonomous Systems


Hierarchical perception library in Python.

Selected examples:

PAZ is used in the following examples (links to real-time demos and training scripts):

  • Probabilistic 2D keypoints
  • 6D head-pose estimation
  • Object detection
  • Emotion classifier
  • 2D keypoint estimation
  • Mask-RCNN (in progress)
  • Semantic segmentation
  • Hand pose estimation
  • 2D Human pose estimation
  • 3D keypoint discovery
  • Hand closure detection
  • 6D pose estimation
  • Implicit orientation
  • Attention (STNs)
  • Haar Cascade detector
  • Eigenfaces
  • Prototypical Networks
  • 3D Human pose estimation
  • MAML

All models can be re-trained with your own data (except Mask-RCNN; we are working on it here).


Installation

PAZ has only three dependencies: TensorFlow 2.0, OpenCV and NumPy.

To install PAZ from PyPI run:

pip install pypaz --user
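
Alternatively, for an editable install from source (a standard pip workflow; assuming the project's GitHub repository):

git clone https://github.com/oarriaga/paz.git
cd paz
pip install -e . --user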

Documentation

Full documentation can be found at https://oarriaga.github.io/paz/.

Hierarchical APIs

PAZ can be used at three different API levels, so you can pick the abstraction that best fits your application.

High-level

Easy out-of-the-box prediction. For example, to detect objects we can call the following pipeline:

from paz.applications import SSD512COCO
from paz.backend.image import load_image

detect = SSD512COCO()

# load an image as a numpy array and apply the pipeline directly to it
image = load_image('my_image.png')
inferences = detect(image)
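
The pipelines return dictionaries. A minimal sketch of inspecting the result, assuming SSD512COCO returns an 'image' key holding the input with drawn detections and a 'boxes2D' key with the predicted boxes (as in the PAZ demos):

from paz.backend.image import show_image

# display the detections and list the predicted classes with their scores
show_image(inferences['image'])
for box2D in inferences['boxes2D']:
    print(box2D.class_name, box2D.score)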

There are multiple high-level functions, a.k.a. pipelines, already implemented in PAZ here. These functions are built using our mid-level API, described below.

Mid-level

While the high-level API is useful for quick applications, it might not be flexible enough for your specific purpose. Therefore, PAZ lets you build high-level functions using its mid-level API.

Mid-level: Sequential

If your function is sequential you can construct a sequential function using SequentialProcessor. In the example below we create a data-augmentation pipeline:

from paz.abstract import SequentialProcessor
from paz import processors as pr

augment = SequentialProcessor()
augment.add(pr.RandomContrast())
augment.add(pr.RandomBrightness())
augment.add(pr.RandomSaturation())
augment.add(pr.RandomHue())

# you can now use this as a normal function
image = augment(image)

You can also add any function, not only those found in processors. For example, we can pass a NumPy function to our original data-augmentation pipeline:

import numpy as np

augment.add(np.mean)

There are multiple functions a.k.a. Processors already implemented in PAZ here.

Using these processors we can build more complex pipelines, e.g. data augmentation for object detection: pr.AugmentDetection.
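
Sequential pipelines can also be packaged as reusable classes by subclassing SequentialProcessor and calling self.add in the constructor; a minimal sketch wrapping the augmentation above (the class name AugmentImage is illustrative):

from paz.abstract import SequentialProcessor
from paz import processors as pr

class AugmentImage(SequentialProcessor):
    def __init__(self):
        super(AugmentImage, self).__init__()
        # processors run in the order they are added
        self.add(pr.RandomContrast())
        self.add(pr.RandomBrightness())
        self.add(pr.RandomSaturation())
        self.add(pr.RandomHue())

augment = AugmentImage()  # behaves like the pipeline built above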

Mid-level: Explicit

Non-sequential pipelines can also be built by subclassing Processor. In the example below we build an emotion classifier from scratch using our high-level and mid-level functions.

from paz.applications import HaarCascadeFrontalFace, MiniXceptionFER
import paz.processors as pr

class EmotionDetector(pr.Processor):
    def __init__(self):
        super(EmotionDetector, self).__init__()
        self.detect = HaarCascadeFrontalFace(draw=False)
        self.crop = pr.CropBoxes2D()
        self.classify = MiniXceptionFER()
        self.draw = pr.DrawBoxes2D(self.classify.class_names)

    def call(self, image):
        boxes2D = self.detect(image)['boxes2D']
        cropped_images = self.crop(image, boxes2D)
        for cropped_image, box2D in zip(cropped_images, boxes2D):
            box2D.class_name = self.classify(cropped_image)['class_name']
        return self.draw(image, boxes2D)
        
detect = EmotionDetector()
# you can now apply it to an image (numpy array)
predictions = detect(image)

Processors allow us to easily compose, compress and abstract away parameters of functions. However, most processors are built using our low-level API (backend), shown next.
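
Because processors are plain callables, pipelines also nest: a SequentialProcessor can be added to another one like any other function. A small sketch under that assumption:

from paz.abstract import SequentialProcessor
from paz import processors as pr

# a small color-augmentation pipeline
color_augment = SequentialProcessor()
color_augment.add(pr.RandomContrast())
color_augment.add(pr.RandomBrightness())

# nest it inside a larger pipeline
pipeline = SequentialProcessor()
pipeline.add(color_augment)
pipeline.add(pr.RandomHue())
# image = pipeline(image)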

Low-level

Mid-level processors are mostly built from small backend functions found in: boxes, camera, image, keypoints and quaternion.

These functions can be found in paz.backend:

from paz.backend import boxes, camera, image, keypoints, quaternion

For example, you can use them in your scripts to load or show images:

from paz.backend.image import load_image, show_image

image = load_image('my_image.png')
show_image(image)
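
Backend functions operate on plain numpy arrays, so they mix freely with your own code; a small sketch (the file name is a placeholder):

import numpy as np
from paz.backend.image import load_image, show_image

image = load_image('my_image.png')
# flip the image horizontally with plain numpy and display it
flipped = np.ascontiguousarray(image[:, ::-1])
show_image(flipped)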

Additional functionality

Models

The following models are implemented in PAZ and they can be trained with your own data:

Task (link to implementation)   | Model (link to paper)
--------------------------------|-------------------------------------
Object detection                | SSD-300
Object detection                | SSD-512
Probabilistic keypoint est.     | Gaussian Mixture CNN
Detection and Segmentation      | MaskRCNN (in progress)
Keypoint estimation             | HRNet
Semantic segmentation           | U-NET
6D Pose estimation              | Pix2Pose
Implicit orientation            | AutoEncoder
Emotion classification          | MiniXception
Discovery of Keypoints          | KeypointNet
Keypoint estimation             | KeypointNet2D
Attention                       | Spatial Transformers
Object detection                | HaarCascades
2D Human pose estimation        | HigherHRNet
3D Human pose estimation        | Simple Baseline
Hand pose estimation            | DetNet
Hand closure classification     | IKNet
Hand detection                  | SSD512
Few-shot classification         | Prototypical Networks
Few-shot classification         | Model Agnostic Meta Learning (MAML)

Motivation

Even though there are multiple high-level computer vision libraries in different deep learning frameworks, I felt there was not a consolidated deep learning library for robot-perception in my framework of choice (Keras).

As a final remark, I would like to mention that I feel we tend to forget the great effort and emotional investment behind every (open-source) project. It is easy to blur a company name with the individuals behind the work, and to forget that there is someone feeling our criticism and our praise. Therefore, whatever good code you find here is dedicated to the software engineers and contributors of open-source projects like PyTorch, TensorFlow and Keras. You put your craft out there for all of us to use and appreciate, and we ought first to give you our thankful consideration.

Why the name PAZ?

  • The name PAZ satisfies its theoretical definition by being an acronym for Perception for Autonomous Systems, where the letter S is replaced by Z to indicate that by "System" we mean almost anything, Z being a classical algebraic variable for an unknown element.

Tests and coverage

Continuous integration is managed through GitHub Actions using pytest. You can run the tests with:

pytest tests

Test coverage can be checked using coverage. You can install coverage by calling: pip install coverage --user. You can then check the test coverage by running:

coverage run -m pytest tests/
coverage report -m

Citation

If you use PAZ please consider citing it. You can also find our paper at https://arxiv.org/abs/2010.14541.

@misc{arriaga2020perception,
      title={Perception for Autonomous Systems (PAZ)}, 
      author={Octavio Arriaga and Matias Valdenegro-Toro and Mohandass Muthuraja and Sushma Devaramani and Frank Kirchner},
      year={2020},
      eprint={2010.14541},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Funding

PAZ is currently developed in the Robotics Group of the University of Bremen, together with the Robotics Innovation Center of the German Research Center for Artificial Intelligence (DFKI) in Bremen. PAZ has been funded by the German Federal Ministry for Economic Affairs and Energy and the German Aerospace Center (DLR). PAZ has been used and/or developed in the projects TransFIT and KiMMI-SF.

paz's People

Contributors

alexanderfabisch, amine789, annaborn, cedric-cfk, deepanchakravarthipadmanabhan, dema-software-solutions, jaswanthbjk, manojkumarmuru, nodece, norbert-overpass, oarriaga, oscar-lima, planthaber, poornima2605, praxidike97, proneetsharma, shin-ka, sushmadg


paz's Issues

Opencv gui issue

The call cv2.namedWindow causes an issue:

QObject::moveToThread: Current thread (0xca9110) is not the object's thread (0x1256140).
Cannot move to target thread (0xca9110)

The problem comes with the installation of opencv-python, which is required by paz and defined in setup.py. The current solution is to uninstall opencv-python and build and install the latest version of OpenCV manually, as sketched below.

The same problem was described in the opencv-python GitHub repository.
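
A sketch of that workaround (building OpenCV itself is platform-specific and omitted here):

pip uninstall opencv-python
# then build and install OpenCV from source, following the official OpenCV guide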

Add abstract class for cameras

Create an abstract class for cameras, e.g.:

class LogitechC270(Camera):
    def __init__(self, intrinsics, distortion):
        ...

It should include methods such as:

calibrate()

Include refactored loss in training script using metrics

  1. Make unit tests for the new loss function: localization, negative classification and positive classification.
  2. Refactor the training script such that it uses the new loss.
  3. Refactor the training script such that it uses the localization, negative classification and positive classification functions as metrics.

Improvements for VOC loader

  1. A docstring for the class would be helpful:

paz/paz/datasets/voc.py

Lines 9 to 11 in 30b10a3

class VOC(Loader):
    def __init__(self, path=None, split='train', class_names='all',
                 name='VOC2007', with_difficult_samples=True, evaluate=False):
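
A possible docstring, sketched from the signature above (parameter semantics are inferred, not confirmed):

class VOC(Loader):
    """Loader for the PASCAL VOC detection dataset.

    # Arguments
        path: String. Path to the VOC dataset root.
        split: String. One of 'train', 'val', 'trainval' or 'test'.
        class_names: 'all' or list of class names to load.
        name: 'VOC2007', 'VOC2012' or a list containing both.
        with_difficult_samples: Boolean. Include samples marked difficult.
        evaluate: Boolean. If True, boxes are kept in absolute pixel
            coordinates (see the width/height handling below).
    """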

  2. In this method:

paz/paz/datasets/voc.py

Lines 23 to 35 in 30b10a3

def load_data(self):
    if self.name == 'VOC2007':
        ground_truth_data = self._load_VOC(self.name, self.split)
    if self.name == 'VOC2012':
        ground_truth_data = self._load_VOC(self.name, self.split)
    if isinstance(self.name, list):
        if not isinstance(self.split, list):
            raise Exception("'split' should also be a list")
        if set(self.name).issubset(['VOC2007', 'VOC2012']):
            data_A = self._load_VOC(self.name[0], self.split[0])
            data_B = self._load_VOC(self.name[1], self.split[1])
            ground_truth_data = data_A + data_B
    return ground_truth_data

  • Lines 25 and 27 are identical.
  • What happens if self.name is not "VOC2007" or "VOC2012"?
  • What happens if self.name is ["VOC2007"]? I guess line 33 will cause an error.
  3. Why is this commented out?

paz/paz/datasets/voc.py

Lines 72 to 73 in 30b10a3

# if split not in ['train', 'val', 'trainval', 'test', 'all']:
#     raise Exception('Invalid split name.')

  4. Some explanation for this would be helpful:

paz/paz/datasets/voc.py

Lines 118 to 120 in 30b10a3

if self.evaluate:
    width = 1
    height = 1

  5. Seems like this is incorrect:

and values are numpy arrays of shape (num_objects, 4 + num_classes)

The shape is (num_objects, 4 + 1) where +1 is the class index.

box_data.append([xmin, ymin, xmax, ymax, class_arg])

  6. Why not dict(enumerate(class_names))?

self.arg_to_class = dict(zip(class_keys, self.class_names))

Proposition for ``Image`` message class for loading/processing in data managers

Currently we have a general standard (an oxymoron) for loading images, i.e.

list of dictionaries with key 'image' and value either the image path or the numpy array

It might be better to have an abstract class for images that contains:

Fields:

  • path, width, height, normalized_width, normalized_height, values

Methods:

  • BGR, RGB, HSV?

Pros:

  • Having a concrete, standard API for outputting images from our algorithms
  • Providing the user an easy way to use our algorithms with their custom dataset, e.g.
    "just make a list of these objects"

Cons:

  • Refactor all dataset loaders
  • Refactor all preprocessors whose topic is image, since now you must access the values field

One idea would be to extend a numpy array to have these fields.
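
A minimal sketch of such a message class (field names taken from the proposal above; the channel-order convention is an assumption):

import numpy as np

class Image(object):
    """Pairs pixel values with their source path and basic geometry."""
    def __init__(self, values, path=None):
        self.values = values  # numpy array, assumed BGR (OpenCV convention)
        self.path = path
        self.height, self.width = values.shape[:2]

    def RGB(self):
        # reverse the channel axis to convert BGR -> RGB
        return self.values[..., ::-1]

image = Image(np.zeros((480, 640, 3), dtype=np.uint8), path='my_image.png')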

Refactor `OutputSelector` for allowing change of output keys

Current OutputSelector processor outputs a dictionary of dictionaries of the form:

{'inputs': inputs, 'labels': labels}

where inputs and labels have the following structure:

inputs = {'topic_A': A_data, 'topic_B': B_data}

However, the keys of this dictionary, e.g. 'topic_A' and 'topic_B', can't be changed. Therefore, the model's tensors must carry exactly these names.

To make it more flexible, we can allow the user to provide a dictionary that maps the topic names to the names of the model inputs and outputs, as sketched below.
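
A minimal sketch of the proposed renaming (the rename_topics helper and name_map argument are hypothetical):

def rename_topics(topics, name_map):
    """Rename dictionary keys, e.g. 'topic_A' -> 'input_1'."""
    return {name_map.get(key, key): value for key, value in topics.items()}

inputs = rename_topics({'topic_A': A_data, 'topic_B': B_data},
                       {'topic_A': 'input_1', 'topic_B': 'input_2'})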

Example fails: no "image.jpg"

Just tried to run this example:

python examples/tutorials/data_augmentation.py 
Traceback (most recent call last):
  File "examples/tutorials/data_augmentation.py", line 13, in <module>
    image = load_image('image.jpg')
  File "/home/afabisch/anaconda3/lib/python3.7/site-packages/Paz-0.1-py3.7.egg/paz/backend/image/opencv_image.py", line 69, in load_image
  File "/home/afabisch/anaconda3/lib/python3.7/site-packages/Paz-0.1-py3.7.egg/paz/backend/image/opencv_image.py", line 55, in convert_color_space
cv2.error: OpenCV(3.4.2) /tmp/build/80754af9/opencv-suite_1535558553474/work/modules/imgproc/src/color.hpp:253: error: (-215:Assertion failed) VScn::contains(scn) && VDcn::contains(dcn) && VDepth::contains(depth) in function 'CvtHelper'

I guess it is because "image.jpg" does not exist in the root directory.

Refactor ``VOC`` data loader

  1. Refactor such that it uses numpy indexing instead of MATLAB's (BUG)
  2. Such that it uses get_annotation
  3. Simplify, reduce and test

Numpy dependency breaking on ExpandDims

It seems that the current use of expand_dims in ExpandDims requires at least numpy > 1.17:

import numpy as np                                                                                                        
x = np.random.randn(48, 48)                                                                                               
np.expand_dims(x, [0, 3])

It seems to break in MiniXceptionFER.
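
A possible workaround for older NumPy versions (tuple axes for expand_dims arrived in NumPy 1.18) is to expand one axis at a time:

import numpy as np

x = np.random.randn(48, 48)
# equivalent to np.expand_dims(x, [0, 3]) on NumPy >= 1.18
x = np.expand_dims(np.expand_dims(x, 0), 3)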

Add trained model into examples

Octavio, please add the trained model "fer2013_mini_XCEPTION.119-0.65.hdf5" (probably from the oarriaga/face_classification repo ;) ) to the examples/face_classification folder, so that the example demo.py is complete and can be run without effort.
Thank you!

Change color-scheme of emotion classifier demo

The current colors of the emotion classifier differ between the bounding box and the text. Furthermore, it would be good to revert to the original color scheme, e.g. happy -> yellow, angry -> red, etc.

Documentation entry point

  • There is no hint in the readme on how to build the documentation
  • The first page of the documentation is an empty page; some introduction would be nice

Add mAP evaluation callback

  1. Refactor the flag for using the PASCAL VOC 07 metric
  2. Refactor the evaluation script
  3. Add the mAP evaluation callback

Create evaluators for pose estimation

  1. Create an abstract pose-evaluation class
  2. Create unit tests
  3. Add them as metrics or callbacks (further discussion needed)
  4. Test them during a training procedure

Enhance sequential processors for easier input

Create an Input processor that accepts a raw input and wraps it as a dictionary with a given topic. Then you would not have to construct a whole dictionary when calling a pipeline, e.g.

Now we have this:

outputs = inferencer({'image': image})

But we could have this:

outputs = inferencer(image)

Pros: it brings an easier API for the user.
Cons: needs refactoring of all inferencers.
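
A minimal sketch of such a processor (the class name WrapInput is illustrative):

import paz.processors as pr

class WrapInput(pr.Processor):
    """Wraps a raw input into a dictionary with a given topic."""
    def __init__(self, topic):
        super(WrapInput, self).__init__()
        self.topic = topic

    def call(self, value):
        return {self.topic: value}

# prepended to an inference pipeline, this enables inferencer(image)
wrap = WrapInput('image')
outputs = inferencer(wrap(image))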

Exceptions must derive from BaseException

Since I have no camera attached, this error is triggered:

Traceback (most recent call last):
  File "examples/haar_cascade_detectors/haar_cascade_detectors.py", line 47, in <module>
    player.run()
  File "/home/afabisch/anaconda3/lib/python3.7/site-packages/Paz-0.1-py3.7.egg/paz/backend/camera.py", line 119, in run
  File "/home/afabisch/anaconda3/lib/python3.7/site-packages/Paz-0.1-py3.7.egg/paz/backend/camera.py", line 103, in step
TypeError: exceptions must derive from BaseException
