
Comments (6)

lessw2020 commented on May 23, 2024

Hi @sleal-unity!
Thanks a ton for the fast response.
So, in further review, you are correct - algorithmically you are containing all 4 corner points, which is correct for the AABB type boxes (so this is a closable issue and apologies for the false alarm).

We do our own rotation in our training pipeline for real images, but the visual difference is that all the other RDTs are long and rectangular, while this one is much more squarish than any other. Thus, we more typically see things like this:
rotated_rdt

A rectangle rotating within a square still seems to fill it more efficiently in terms of corner points, with less excess buffer, whereas a square rotating within a square bloats the box out. Visually that was unexpected to see, so I thought the labelling was off.

I also tested using minimal rotation tonight, and ultimately that's not the reason this RDT is being goofy on the sim-to-real holdout testing, as I had thought.

To your point re: other options - we have looked at switching to 4 point polygons as a workaround for rotated items, and in fact put in support for it in our codebase. But ultimately it was painful on the labelling side, so we didn't end up using it.

If you wanted to explore adding OBB or gliding vertex or 4 point polygons, that would be super interesting. I haven't seen OBB before but here is the paper on gliding vertex and explanation:
gliding-vertex
https://arxiv.org/abs/1911.09358
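For anyone skimming, here is a rough sketch of the gliding vertex idea as I read it: keep the ordinary axis-aligned box and store, for each side, how far along that side the quadrilateral's corner sits. The function names, the vertex ordering, and the image-style coordinates (y grows downward) are my own assumptions for illustration, so check the paper for the exact conventions. It also assumes exactly one corner touches each side of the box, which does not hold for an unrotated rectangle.

```python
def encode_gliding(quad):
    """Encode a 4-point polygon as (AABB, four side offsets in [0, 1]).

    Hypothetical helper: assumes one corner touches each side of the AABB.
    """
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    xmin, ymin, xmax, ymax = min(xs), min(ys), max(xs), max(ys)
    w, h = xmax - xmin, ymax - ymin

    top = min(quad, key=lambda p: p[1])      # corner on the top side
    right = max(quad, key=lambda p: p[0])    # corner on the right side
    bottom = max(quad, key=lambda p: p[1])   # corner on the bottom side
    left = min(quad, key=lambda p: p[0])     # corner on the left side

    alphas = (
        (top[0] - xmin) / w,      # glide along the top side
        (right[1] - ymin) / h,    # glide down the right side
        (xmax - bottom[0]) / w,   # glide back along the bottom side
        (ymax - left[1]) / h,     # glide up the left side
    )
    return (xmin, ymin, xmax, ymax), alphas


def decode_gliding(box, alphas):
    """Rebuild the four corners (top, right, bottom, left) from the encoding."""
    xmin, ymin, xmax, ymax = box
    w, h = xmax - xmin, ymax - ymin
    a1, a2, a3, a4 = alphas
    return [
        (xmin + a1 * w, ymin),
        (xmax, ymin + a2 * h),
        (xmax - a3 * w, ymax),
        (xmin, ymax - a4 * h),
    ]


quad = [(2, 0), (6, 2), (4, 6), (0, 4)]   # a rotated square
box, alphas = encode_gliding(quad)
corners = decode_gliding(box, alphas)
```

The appeal for a detector is that the regression target stays "AABB plus four bounded numbers" rather than an angle.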

Anyway, pls go ahead and close this one, as it is correctly computing the boxes as AABBs.

from com.unity.perception.

lessw2020 commented on May 23, 2024

*Related: the outermost box (RDT) adjusts quite well during the rotations, and as a result we get a nice IOU score on the holdout. It's the internal boxes that don't seem to adjust well and thus create this excess buffering effect.
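To put a number on "excess buffering", a minimal IOU sketch for two axis-aligned boxes is below. The `(xmin, ymin, xmax, ymax)` layout and the example boxes are assumptions for illustration, not values from our pipeline.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Overlap extents, clamped at zero when the boxes do not intersect.
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


tight = (0, 0, 10, 10)      # box hugging the labeled region
padded = (-2, -2, 12, 12)   # same region with 2 units of buffer per side
score = iou(tight, padded)  # 100 / 196, roughly 0.51
```

Even a small buffer on every side costs a lot of IOU on a small box, which is why the internal boxes are more sensitive to this than the outer one.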


lessw2020 commented on May 23, 2024

I should also add that the internal bounding boxes come from invisible quads floating above the RDT, whereas the outer RDT box comes from the prefab itself.


lessw2020 commented on May 23, 2024

In some cases, there's also the reverse - the bounding box is too small and cuts off where it should not. In the image below you can see how the read window box is too small and cuts off about 1/3 of the window it should contain. (I tried to highlight the missing delta with the red half circle.)
too_small_box


sleal-unity commented on May 23, 2024

Hi Less!

Could you share a screenshot of your prefab, but with colored materials attached to the invisible quads? I'd like to get a better idea of how the labels on your RDTs are structured with respect to the underlying prefab.

Also, from my current understanding based on the images you've shared, it doesn't seem as though these axis-aligned bounding boxes (AABBs) are misbehaving per se; rather, they're not well suited for accurately marking the labeled area when the object is rotated close to a 45 degree angle. This is a common problem with using AABBs when estimating object sizes for game engine tasks like volume estimation or collision detection.
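As a rough illustration of why 45 degrees is the worst case, the sketch below computes how much of its AABB a rotated rectangle actually covers. The helper names are made up for this example and are not part of the Perception package.

```python
import math


def aabb_extent(width, height, angle_deg):
    """Width and height of the axis-aligned box around a rotated rectangle."""
    t = math.radians(angle_deg)
    c, s = abs(math.cos(t)), abs(math.sin(t))
    return width * c + height * s, width * s + height * c


def fill_ratio(width, height, angle_deg):
    """Fraction of the AABB area covered by the rotated rectangle itself."""
    bw, bh = aabb_extent(width, height, angle_deg)
    return (width * height) / (bw * bh)


# A unit square, rotated in steps up to 45 degrees.
for angle in (0, 15, 30, 45):
    print(angle, round(fill_ratio(1.0, 1.0, angle), 3))
```

For a unit square the box is completely filled at 0 degrees and only half filled at 45 degrees, so the label is still correct as an AABB but half of it is empty space.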

Are you potentially interested in something like an oriented bounding box (OBB) or an arbitrary polygonal label?

With regard to oriented bounding boxes, here's a Wikipedia article on the subject that may help explain my mindset here. Specifically, this paragraph:

In many applications the bounding box is aligned with the axes of the co-ordinate system, and it is then known as an axis-aligned bounding box (AABB). To distinguish the general case from an AABB, an arbitrary bounding box is sometimes called an oriented bounding box (OBB), or an OOBB when an existing object's local coordinate system is used. AABBs are much simpler to test for intersection than OBBs, but have the disadvantage that when the model is rotated they cannot be simply rotated with it, but need to be recomputed.
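The last sentence of that paragraph can be sketched in a few lines: an OBB only needs its angle updated when the object rotates, while the AABB has to be recomputed from the rotated corners. `OrientedBox` below is a hypothetical 2D type written for this example, not an existing Unity or Perception class.

```python
import math
from dataclasses import dataclass


@dataclass
class OrientedBox:
    """2D oriented bounding box: center, size, and rotation about the center."""
    cx: float
    cy: float
    width: float
    height: float
    angle_deg: float

    def corners(self):
        """Four corner points after applying the rotation."""
        t = math.radians(self.angle_deg)
        c, s = math.cos(t), math.sin(t)
        hw, hh = self.width / 2, self.height / 2
        local = [(-hw, -hh), (hw, -hh), (hw, hh), (-hw, hh)]
        return [(self.cx + x * c - y * s, self.cy + x * s + y * c)
                for x, y in local]

    def to_aabb(self):
        """Recompute the enclosing AABB as (xmin, ymin, xmax, ymax)."""
        xs, ys = zip(*self.corners())
        return min(xs), min(ys), max(xs), max(ys)


box = OrientedBox(0.0, 0.0, 4.0, 2.0, 0.0)
unrotated = box.to_aabb()   # same as the box itself at 0 degrees
box.angle_deg = 90.0        # the OBB just stores the new angle
rotated = box.to_aabb()     # the AABB must be rebuilt from the corners
```

Here the OBB keeps its 4 by 2 size throughout, while the recomputed AABB changes from 4 wide and 2 tall to 2 wide and 4 tall.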

And here's a drawing I used to illustrate these two bounding box strategies when discussing this topic with a colleague of mine recently:

Bounding Box Types

I may be wildly off in my interpretation here, so I apologize in advance if I'm veering way off topic. Maybe you can clarify whether you think there is something wrong with the bounding box calculation itself. If the bounding box labeler is just subtly missing the mark by a couple of pixels, that would be a whole different issue from the bounding box type stuff I discussed above.


sleal-unity commented on May 23, 2024

@lessw2020, thanks for the input as always!

Your comments about 4 point polygons remind me of a similar conversation I had a month or so ago. I might need to investigate this more myself.

Anyways, if your labeling requirements change, just open up a new GitHub issue with the kind of functionality you're looking for. It's always good to hear ideas directly from people using our tools!
