
cv-core's Introduction

Rust Computer Vision

Rust CV is a project to implement computer vision algorithms in Rust.

What is computer vision

Many people are familiar with convolutional neural networks and machine learning in computer vision, but computer vision is much more than that. One of the first things that Rust CV focused on was algorithms in the domain of Multiple-View Geometry (MVG). Today, Rust has enough MVG algorithms to perform relatively simple camera tracking and odometry tasks. Weaknesses remain in the image processing and machine learning domains.

Goals

Here are some of the domains of computer vision that Rust CV intends to pursue, along with examples from each domain (not all of the algorithms below live within the Rust CV organization, and some may already exist without our knowledge):

To support computer vision tooling, the following will be implemented:

cv-core's People

Contributors

vadixidav

cv-core's Issues

TriangulateObservances should take WorldPose, not CameraPose

TriangulateObservances passed its tests only by accident: the implementation of TriangulateRelative incorrectly passed a WorldPose where a CameraPose was expected. This bug was discovered downstream in vslam-sandbox. Fixing it is a breaking change and will require a new breaking version.

UnscaledRelativeCameraPose should not wrap RelativeCameraPose

Although these will be renamed in #5, UnscaledRelativeCameraPose currently wraps RelativeCameraPose. Accidentally dereferencing it can cause errors, and the isometry inside is hard to reach because you must go through two layers of newtypes. It should become its own type, with a method to convert between the two.
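One possible shape for the separated types is sketched below. It is only an illustration: Isometry3 here is a std-only stand-in for the real isometry type, and the method names with_scale and into_unscaled are hypothetical, not part of cv-core.

```rust
/// Std-only stand-in for an isometry (assumption: the real type comes from nalgebra).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Isometry3 {
    pub rotation: [[f64; 3]; 3],
    pub translation: [f64; 3],
}

/// Relative pose whose translation has a known scale.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RelativeCameraPose(pub Isometry3);

/// Relative pose whose translation is only known up to scale.
/// It wraps the isometry directly (one newtype layer) and has no `Deref` impl,
/// so it cannot be used as a `RelativeCameraPose` by accident.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct UnscaledRelativeCameraPose(pub Isometry3);

impl UnscaledRelativeCameraPose {
    /// Hypothetical conversion: the caller must supply the scale explicitly.
    pub fn with_scale(self, scale: f64) -> RelativeCameraPose {
        let Isometry3 { rotation, translation: t } = self.0;
        RelativeCameraPose(Isometry3 {
            rotation,
            translation: [t[0] * scale, t[1] * scale, t[2] * scale],
        })
    }
}

impl RelativeCameraPose {
    /// Hypothetical reverse conversion: forget that the scale is known.
    pub fn into_unscaled(self) -> UnscaledRelativeCameraPose {
        UnscaledRelativeCameraPose(self.0)
    }
}
```

With this layout the isometry is always reachable as pose.0, and moving between the two types requires a visible method call.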

Add set se(3) for Pose trait

The Pose trait needs a way to get and set the pose in se(3). I added a way to move the pose in se(3) via a delta, but no way to get or set the pose as se(3). That needs to be added.

Consider whether the xyz component of homogeneous coordinates should always remain normalized

Currently the XYZ component of a coordinate is free to have any length. This causes no issues with any algorithms, but it does force those algorithms to normalize the XYZ component often, either to get the direction a ray travels or to get the magnitude of the component. If the XYZ component were kept at a length of 1.0 at all times, it would simplify several algorithms, such as Levenberg-Marquardt (and Jacobians in general), computing residuals (the dot product could always be taken with no concern for normalization), checking chirality, and many others. Based on how often normalization happens, this might be helpful. Unfortunately, the downside is that normalization must be done every time the coordinate is updated, which might be costly (the square root and division may take a while).

If this were done, conversion of a point could fail. An Option would therefore need to be returned, meaning that the From impl would have to be removed, which causes some code churn.
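A minimal sketch of what an always-normalized coordinate with a fallible constructor might look like. The type name UnitHomogeneous and its methods are hypothetical, and plain arrays stand in for the nalgebra vectors that cv-core actually uses.

```rust
/// Hypothetical homogeneous point whose XYZ part always has unit length.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct UnitHomogeneous {
    xyz: [f64; 3],
    w: f64,
}

impl UnitHomogeneous {
    /// Divides all four components by |xyz|, which leaves the projective point
    /// unchanged. Returns `None` when the XYZ part is zero or any input is not
    /// finite; this is why a plain `From` impl would no longer work.
    pub fn new(p: [f64; 4]) -> Option<Self> {
        let norm = (p[0] * p[0] + p[1] * p[1] + p[2] * p[2]).sqrt();
        if !norm.is_finite() || norm == 0.0 || !p[3].is_finite() {
            return None;
        }
        Some(Self {
            xyz: [p[0] / norm, p[1] / norm, p[2] / norm],
            w: p[3] / norm,
        })
    }

    /// Direction of the ray; already unit length, so no normalization is needed.
    pub fn bearing(&self) -> [f64; 3] {
        self.xyz
    }

    /// The W component after rescaling.
    pub fn w(&self) -> f64 {
        self.w
    }
}
```

The fields are private so that the only way to build or update a value is through new, which is where the square root and division cost mentioned above would be paid.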

CameraPose, WorldPose, and RelativeCameraPose are bad names

CameraPose and WorldPose are bad names. They caused issue #4, which resulted in much lost time. It isn't clear what they are without reading the documentation.

These need to instead be named CameraToWorld, WorldToCamera, and CameraToCamera. This will put exactly what the thing is right into the name, so this issue doesn't come up again.
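To illustrate how direction-carrying names help, here is a rough sketch. Rigid is a std-only stand-in for an isometry, and the transform and inverse methods are assumptions for this example rather than the cv-core API.

```rust
/// Std-only rigid transform (rotation matrix plus translation).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rigid {
    pub r: [[f64; 3]; 3],
    pub t: [f64; 3],
}

impl Rigid {
    /// Applies the transform: R * p + t.
    pub fn apply(&self, p: [f64; 3]) -> [f64; 3] {
        let mut out = self.t;
        for i in 0..3 {
            for j in 0..3 {
                out[i] += self.r[i][j] * p[j];
            }
        }
        out
    }

    /// Inverse transform: rotation R^T, translation -R^T * t.
    pub fn inverse(&self) -> Rigid {
        let mut r = [[0.0; 3]; 3];
        let mut t = [0.0; 3];
        for i in 0..3 {
            for j in 0..3 {
                r[i][j] = self.r[j][i];
                t[i] -= self.r[j][i] * self.t[j];
            }
        }
        Rigid { r, t }
    }
}

/// Maps world-space points into camera space.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct WorldToCamera(pub Rigid);

/// Maps camera-space points into world space.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CameraToWorld(pub Rigid);

impl WorldToCamera {
    pub fn transform(&self, world_point: [f64; 3]) -> [f64; 3] {
        self.0.apply(world_point)
    }
    /// Inverting flips the direction, and the return type says so.
    pub fn inverse(&self) -> CameraToWorld {
        CameraToWorld(self.0.inverse())
    }
}

impl CameraToWorld {
    pub fn transform(&self, camera_point: [f64; 3]) -> [f64; 3] {
        self.0.apply(camera_point)
    }
    pub fn inverse(&self) -> WorldToCamera {
        WorldToCamera(self.0.inverse())
    }
}
```

A function that declares a WorldToCamera parameter will not compile if it is handed a CameraToWorld, and the signature alone tells the reader which way the transform goes.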

Points must use homogeneous projective coordinates

Currently, WorldPoint, CameraPoint, and the various triangulators all use 3D coordinates. This is not particularly numerically stable for points that are far away in coordinate space. Additionally, triangulators natively output projective coordinates, which are then divided by their W component to produce the 3D coordinates used in cv-core. This is not how it should be done. Everything should be changed to use projective coordinates.

Add a method to compute projection error to Bearing or CameraPoint

Justifications

Earlier, I accidentally introduced a bug where I computed the cosine distance between a bearing and a world point. This was incorrect: I needed to transform the world point into camera space first, before computing the cosine distance for the purpose of filtering the point.

It is also common to compute this projection error as the cosine distance: 1.0 - bearing.dot(&view_point.bearing()).

Solution

We need to create a projection_error or similarly named method on the Bearing or CameraPoint. It can also be on both, so long as one simply calls the other.
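A rough sketch of how such a method could look, with one type delegating to the other. Plain arrays stand in for the nalgebra types, and the Option return value (for a point at the camera center, which has no bearing) is an assumption of this example, not an existing cv-core signature.

```rust
fn dot(a: [f64; 3], b: [f64; 3]) -> f64 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// Unit-length ray direction in camera space.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Bearing([f64; 3]);

/// A point that has already been transformed into camera space.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CameraPoint(pub [f64; 3]);

impl Bearing {
    /// Normalizes `v`; returns `None` for zero or non-finite input.
    pub fn new(v: [f64; 3]) -> Option<Self> {
        let norm = dot(v, v).sqrt();
        if norm == 0.0 || !norm.is_finite() {
            return None;
        }
        Some(Bearing([v[0] / norm, v[1] / norm, v[2] / norm]))
    }

    /// Cosine distance between this bearing and the bearing of `point`:
    /// 1.0 - cos(angle). Accepting only `CameraPoint` keeps world-space
    /// points from being passed in without a transform.
    pub fn projection_error(&self, point: &CameraPoint) -> Option<f64> {
        let observed = point.bearing()?;
        Some(1.0 - dot(self.0, observed.0))
    }
}

impl CameraPoint {
    /// Direction from the camera center to this point.
    pub fn bearing(&self) -> Option<Bearing> {
        Bearing::new(self.0)
    }

    /// Same computation, delegating to `Bearing::projection_error`.
    pub fn projection_error(&self, bearing: &Bearing) -> Option<f64> {
        bearing.projection_error(self)
    }
}
```

Because the parameter type is CameraPoint, the mistake described under Justifications would become a compile error instead of a silently wrong distance.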

Pose jacobians need to be transposed

Pose jacobians currently have each output as a column rather than a row. This needs to be changed to facilitate the chain rule downstream. This will be another breaking release.
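To see why the row-per-output layout helps, note that with that convention the Jacobian of a composition f(g(x)) is the plain matrix product J_f * J_g. The helper below is a hypothetical, std-only sketch (the name chain is invented here) that lets the compiler check that the shapes line up.

```rust
/// Chain rule for Jacobians stored with one row per output and one column
/// per input: an M x K outer Jacobian times a K x N inner Jacobian gives the
/// M x N Jacobian of the composition.
pub fn chain<const M: usize, const K: usize, const N: usize>(
    outer: [[f64; K]; M],
    inner: [[f64; N]; K],
) -> [[f64; N]; M] {
    let mut out = [[0.0; N]; M];
    for i in 0..M {
        for j in 0..N {
            for k in 0..K {
                out[i][j] += outer[i][k] * inner[k][j];
            }
        }
    }
    out
}
```

As a worked example, take g(x, y) = (x + y, x * y, x) and f(u, v, w) = u * v + w at (x, y) = (2, 3). Then J_g = [[1, 1], [3, 2], [1, 0]] and J_f = [[6, 5, 1]], and chain(J_f, J_g) yields [[22, 16]], which matches differentiating x^2 * y + x * y^2 + x directly. With outputs stored as columns, both factors would first have to be transposed.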

Add Projective::from_point

The new Projective trait was introduced in 0.13.0. Unfortunately, it shipped without an API to convert a 3D Point3 from nalgebra into homogeneous projective coordinates. The workaround is to write point.to_homogeneous().into(). This works, but is a bit cumbersome. The next version should fix this for convenience. It should be possible to add it in a bugfix version (since we are in 0.x.y) so that semver keeps the whole Rust CV ecosystem compatible (otherwise I will have to release 20 or so crates again, which takes at least 2 hours).
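One way this could be added without touching existing implementors is a provided (default) trait method. The sketch below is an assumption about the design: the trait is a simplified, std-only stand-in for Projective, and from_homogeneous and homogeneous are hypothetical names for its required methods.

```rust
/// Simplified stand-in for the `Projective` trait (arrays instead of nalgebra types).
pub trait Projective: Sized {
    /// Builds the value from homogeneous coordinates (x, y, z, w).
    fn from_homogeneous(h: [f64; 4]) -> Self;

    /// Returns the homogeneous coordinates.
    fn homogeneous(&self) -> [f64; 4];

    /// Proposed convenience constructor: appends w = 1 to a 3D point.
    /// Because it has a default body, existing impls need no changes.
    fn from_point(p: [f64; 3]) -> Self {
        Self::from_homogeneous([p[0], p[1], p[2], 1.0])
    }
}

/// Example implementor that only supplies the two required methods.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct WorldPoint(pub [f64; 4]);

impl Projective for WorldPoint {
    fn from_homogeneous(h: [f64; 4]) -> Self {
        WorldPoint(h)
    }

    fn homogeneous(&self) -> [f64; 4] {
        self.0
    }
}
```

With something like this in place, WorldPoint::from_point([1.0, 2.0, 3.0]) would replace the point.to_homogeneous().into() workaround at call sites.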
