hit-test's Issues

Additional Hit Test Use Cases?

The Hit-Test explainer lists two use cases for the Hit-Test API:

  • Show a reticle that appears to track the real world surfaces at which the device or controller is pointed.
  • Place a virtual object in the real world.

We should consider (or affirmatively reject) some other use cases. Hit testing / ray casting can be used for a variety of things in a 3D system.

For example, virtual-object visibility testing: determining whether a virtual object is occluded from the point of view of the camera. Occlusion came up in #13, but that discussion focused on hit testing of real-world objects.

An application could try to use the hit-test API to determine whether a virtual object it is drawing in the scene should be visible. This would be done by firing a ray from the camera towards the current position of the virtual object and checking whether some real-world geometry is hit closer than the virtual object's known position.

The result will certainly be crude, and it may require casting multiple rays at different parts of the virtual object to get a better estimate, but it is something people could try to do.
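
A minimal sketch of such a probe, assuming the promise-based requestHitTest shape discussed in these issues (cameraPos, objectPos, and distanceOf are hypothetical, with positions expressed in the same reference space):

async function isLikelyOccluded(session, referenceSpace, cameraPos, objectPos) {
  const dx = objectPos.x - cameraPos.x;
  const dy = objectPos.y - cameraPos.y;
  const dz = objectPos.z - cameraPos.z;
  const objectDistance = Math.hypot(dx, dy, dz);
  const ray = new XRRay(
      {x: cameraPos.x, y: cameraPos.y, z: cameraPos.z, w: 1},
      {x: dx / objectDistance, y: dy / objectDistance, z: dz / objectDistance, w: 0});
  const hits = await session.requestHitTest(ray, referenceSpace);
  // Occluded if any real-world hit lies closer than the object itself.
  // distanceOf(hit) stands in for extracting the hit position and
  // measuring its distance from cameraPos (the result shape is still in flux).
  return hits.some((hit) => distanceOf(hit) < objectDistance);
}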

Is this a use case we want to support? The answer has implications for things like rate limiting #6.

Other uses for hit testing could involve something like virtual game objects that need some kind of "AI" to act in the scene. The code for them may want to perform a hit test to see if an object can "see" another object, or to do some kind of path planning around the real objects in the world.

Many of these use cases would work better if the application had access to the full world geometry, but that won't stop people from trying with hit testing alone.

Hit-test from arbitrary screen location (without user input)?

Does the hit-test API currently support generating a ray at an arbitrary screen location, without explicit user interaction, on a touchscreen device (i.e. transient input source)?

The reason I'm asking is that I was thinking about experimenting with an app where an image/object would be detected in the camera view, and then performing a hit-test from the location of the object on the screen to get its pose in 3D space. This would be somewhat similar to ARCore's augmented images, or to the Experimenting with computer-vision in WebXR proof-of-concept.

This use case would probably be covered by the computer-vision feature proposal, but I assume that it isn't getting out of incubation in the near future.

Maybe I should somehow just get the user to "select" the object/image on the screen. Or is there another recommended approach I could take?
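
For reference, roughly what I have in mind, assuming the subscription-style API and that a view-space direction (dirX/dirY/dirZ, hypothetical values) has already been derived from the object's screen position, e.g. by unprojecting with the view's projectionMatrix:

const viewerSpace = await xrSession.requestReferenceSpace('viewer');
const hitTestSource = await xrSession.requestHitTestSource({
  space: viewerSpace,
  offsetRay: new XRRay({x: 0, y: 0, z: 0, w: 1},
                       {x: dirX, y: dirY, z: dirZ, w: 0})
});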

Interoperability and compatibility - browser vendors & web developers support

This should sound familiar:
I'm currently preparing to send out an Intent to Ship for WebXR Hit Test Module so I'd like to poll for other vendors' support for the module. @thetuvix, @Manishearth, @grorg, @rcabanier - can you share your opinion? I'm aware of 2 outstanding issues that we should agree on (#66 & #67) prior to sending an I2S - we're currently planning on marking the concept of entityTypes as unstable / at risk and not including it in the initial version of the spec (as mentioned in issue #66).

Web developers' opinion about this module is also welcome!

Current XRRay construction semantics incorrect

(Previously #83)

Currently the XRRay vector constructor is specified as

 [Constructor(optional DOMPointInit origin, optional DOMPointInit direction)]

however, this is no longer allowed by the current WebIDL spec; it needs to be

   constructor(optional DOMPointInit origin = {}, optional DOMPointInit direction = {}); 

Given that whatwg/webidl#750 forces optional dictionary arguments to specify a default, we can no longer distinguish between "direction was not specified" and "direction was specified as {0, 0, 0, 1}".

Unfortunately, {0, 0, 0, 1} is not a valid direction value! The w coordinate should be zero; and ideally we should throw for nonzero w (which is what Chrome does, despite being unspecced). But we can't tell the difference between "direction unspecified" and "direction is {0, 0, 0, 1}", so this would end up throwing for new XRRay({...}) with an unspecified direction.

We'd like to default it to the -Z axis instead. We should use a new dictionary here. We should also potentially handle the w coordinate in origin instead of ignoring it.
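
One possible shape (a sketch, not spec text): introduce a direction-specific dictionary whose defaults encode the -Z axis, so that an unspecified direction no longer collides with {0, 0, 0, 1}:

dictionary XRRayDirectionInit {
  double x = 0;
  double y = 0;
  double z = -1;
  double w = 0;
};

constructor(optional DOMPointInit origin = {},
            optional XRRayDirectionInit direction = {});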

Describe the meaning of multiple hit results

from @kearwood on the proposal repo:

We should describe why there would be multiple hits, and what order they are returned in. For example, we could say that the XRHitResults are returned in order of depth, or that they are in order of confidence.

When enabling two entityTypes, is one surface expected to provide 2 hit test results?

When a developer enables both the "plane" and "mesh" entityType, what should they expect when hit-testing against a relatively flat surface?

  • Would they get two XRHitTestResult instances for each surface, one for the surface's plane geometry and one for the same surface's mesh geometry?
    • If so, is it random whether the site sees the plane or the mesh first in the list, based on whether the mesh geometry happens to dip above or below the idealized plane along that particular ray?
    • Note that there is no output entityType on an XRHitTestResult, and so an app can only pick the first result's position and normal. If there is a material difference in normal stability for "plane" vs. "mesh" hit test results, the placed object may judder unexpectedly as the user scans the ray across areas where the mesh dips above and below the plane.
  • Would they get one XRHitTestResult instance for each surface, with the UA hiding the mesh collision at points where the same surface has a higher-quality "plane" normal to offer?
    • In this approach, the site's request for "mesh" entities is primarily an opt-in to hits against additional curved surfaces that don't have planes.
    • If UAs go this way, is there a difference between requesting ["mesh"] entities vs. requesting ["mesh", "plane"] entities? Does a single entityType member or a planesOnly bool (as discussed in #66) make more sense then?

If different UAs diverge in the path they choose here, sites that request ["mesh", "plane"] hits will likely behave inconsistently across devices. We should be more prescriptive in the spec about how many hits, at most, are expected per real-world surface.
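
Concretely, the ambiguity arises for a subscription like the following sketch (assuming the entityTypes option under discussion and a viewerSpace obtained elsewhere):

const hitTestSource = await xrSession.requestHitTestSource({
  space: viewerSpace,
  entityTypes: ['plane', 'mesh']  // one result per flat surface, or two?
});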

What happens if you don't attach a .then() to a hit-test promise until much later

A developer could potentially do something like this:

var promise = xrSession.requestHitTest(...);
window.setTimeout(() => { promise.then(...); }, 99999);

What happens here? The result is totally invalid at the point when it resolves. Can the promise be invalidated after the RAF for which it is valid is issued? This gets especially dangerous if we add the XRFrame object to the promise resolution of hit-test, as that frame will be potentially invalid at the late-attached "then".

Is there a better term than "TransientInput" for method names?

For the recently merged transient hit test PR, the design seems like it holds together at first glance. The primary unfortunate aspect to me there is that we'd be introducing methods like requestHitTestSourceForTransientInput that have ...TransientInput in the name. Transient input is a concept defined in the core WebXR spec, but the term transient never appears in the actual API surface anywhere yet. It feels odd as someone approaching the hit-test API to only encounter the term TransientInput here.

It seems worth some bikeshedding to see if we can align those "transient" member names closer to the core API somehow. Another option would be to tweak the design to more directly reference core API members. A random idea that comes to mind: switching the method to requestHitTestSourceForInputProfile to build on the profiles list that already manifests in the core API surface, returning hits whether the specified profile is transient or not.

Originally posted by @thetuvix in #74 (comment)

need a supported way of checking for hit-test capability

Right now, there is no way to check if a session supports hit-testing. Google's model-viewer takes the approach of checking for the existence of a method on XRSession:

export const HAS_WEBXR_DEVICE_API = navigator.xr != null &&
    self.XRSession != null && navigator.xr.isSessionSupported != null;

export const HAS_WEBXR_HIT_TEST_API =
    HAS_WEBXR_DEVICE_API && self.XRSession!.prototype.requestHitTestSource;

This is a fragile approach. A browser could choose to NOT expose these methods until a session is created and approved by the user.

Instead, each of these extensions should add a feature to the required/optional feature sets in WebXR, and using those should be the only supported way of checking for the existence of a feature.
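
For example, with a feature-descriptor approach (a sketch, assuming a 'hit-test' descriptor), support is negotiated up front at session creation:

const session = await navigator.xr.requestSession('immersive-ar', {
  requiredFeatures: ['hit-test']  // session request fails if unsupported
});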

Can "point" just be subsumed into "mesh"?

At a high level, this hit-test module is meant to enable sites to raycast against real-world geometry without dealing with the divergences around specific tracking technologies that may leak through in the full real-world geometry module.

It makes sense that sites would still want to reason about hit-testing against planes vs. hit-testing against meshes, since they have different characteristics:

  • "plane" hit tests will provide a more stable normal for placing a medium-sized object
  • "mesh" hit tests will provide a more locally-accurate normal for placing a small object, or when you know your users will be placing the object on an uneven surface

However, the "mesh" use case seems to apply equally to "point" hit-tests - for both, the app is choosing to hit-test against the full contoured surface of the object rather than an idealized plane. In addition, while planes and meshes are real-world concepts, feature points are an implementation details of some of today's tracking technologies. Will some future LIDAR-based headset need to simulate feature points to keep today's sites happy?

Especially for this explicitly-abstracted hit-test API, it would seem that we can get the full developer capability here (deciding on idealized planes vs. full contoured surfaces) without tying sites to today's specific tracking technologies by just subsuming feature-point hit-testing into the "mesh" XRHitTestTrackableType.

If we feel that "mesh" itself would be too specific on devices that use feature points but don't calculate mesh, we could just replace the entityTypes array with a planesOnly bool. When true, the UA only intersects planes - when false, the UA can also intersect feature points, meshes, or any other way the device has for reasoning about full contoured surfaces.

How should XRRay.matrix be initialized for XRRay(transform)?

In the XRRay section, the spec doesn't say how XRRay.matrix should be initialized for XRRay(transform), while for XRRay(origin, direction) it says "Initialize ray's matrix to null".

How should it be initialized for XRRay(transform)? Copy transform.matrix? Or set null and lazily generate as XRRay(origin, direction) does?

Clarify promise resolution possibilities

In the current Chrome implementation, upon an unsuccessful hit, XRSession.prototype.requestHitTest() either returns a promise resolving to an empty array, or, due to an issue, may reject the promise.

We should clarify in the spec what happens when no hits are found (I assume an empty array is returned), and what can cause the promise to reject (I imagine the platform not supporting hit tests).
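
In other words, the app-facing contract would look something like this (a sketch; the exact requestHitTest signature varied across drafts):

xrSession.requestHitTest(ray, referenceSpace).then((results) => {
  if (results.length === 0) {
    // No real-world geometry was intersected -- treat as a miss.
  }
}).catch((err) => {
  // Presumably reserved for hard failures, e.g. hit testing unsupported.
});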

Support batching of requests

This could be an important feature for several reasons:

  • ensuring that related hit-tests get resolved on the same frame
  • performance improvement, especially for a synchronous API

Add the object hit to the XRHitResult structure

In order to support more intelligent logic around hit results and anchors, it would be very useful to have the object that was hit (plane, mesh, etc.) included in the hit structure. This task requires some definition of world understanding elements and collaboration with the anchor API in order to make sense.

There’s no way to determine relative to what the XRHitTestResult’s transform is w/o knowing exactly how it was obtained

The hit testing explainer describes that the XRHitTestResult's transform can be expressed relative either to XRHitTestOptions.space, or to the optional XRSpace relativeTo passed into XRFrame.getHitTestResults(). This makes the XRHitTestResult more difficult to use (for example in helper methods), as it requires the API user to also pass around the relevant XRSpace.

Additionally, it seems to me that XRHitTestResult deviates from the pattern currently present in WebXR of not handing out poses directly (they can only be obtained through a call to XRFrame.getPose(to_space, from_space)) - is that intentional? Let me know if you think it should be a separate issue ("XRHitTestResult should contain XRSpace") - if so, this entire issue can be closed. If we decide not to follow the pattern in this particular case, we might want to consider changing hit test result's attribute XRRigidTransform transform into attribute XRPose pose (as per the spec, XRPose "describes a position and orientation in space relative to an XRSpace" - which sounds exactly like what an XRHitTestResult actually contains).
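
For illustration, following the core pattern might look like this (a hypothetical shape, not current spec text):

// Instead of reading hitTestResult.transform directly, ask for a pose
// relative to a space of the caller's choosing, mirroring XRFrame.getPose():
const pose = hitTestResult.getPose(xrReferenceSpace);  // hypothetical method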

Hit test results are fundamentally rays, not poses

A hit test result is fundamentally a ray with the origin being the point of intersection, and direction being perpendicular to the plane being intersected with. This is how, for example, Unity represents it.

We're representing it as an XRRigidTransform/XRPose that produces such a ray, i.e. an XRRigidTransform which is capable of transforming the y-axis ray to the desired ray. However, such a transform is not unique (it can orient the x and z axes arbitrarily), and it seems weird to provide this information in such a form. It would be better if we directly produced a ray -- it's always possible to get a rigid transform out of a ray if desired.

ARCore defines it more rigidly, and the tests follow this definition. Note that the only thing actually dependent on the planes is the intersection point and normal, everything else can be computed given the point, normal, and presupplied input ray.

To me the use of poses/rigid transforms here doesn't make sense, it's fundamentally not a pose. We can attach additional made-up axes to it (the way ARCore does) to make it into a pose, but I don't see the practical use of doing this. The user will likely wish to consume this information in ray form as well, I mostly see users using the position, and if they actually use the orientation it would be to obtain a direction vector.

I suppose using it as a pose can be useful for positioning little indicator objects on the place the hit test actually occurred, but as mentioned before the user has enough information to calculate this for themselves, and this doesn't convince me that an arbitrary choice of X and Z axes is useful.

The status quo is that the tests don't match the API as specced, and that must be fixed one way or the other at the very least.

I see three paths forward for this:

  • Turn getPose() into getRay(): IMO this is ideologically the best option, however it's a major breaking change for a somewhat-shipped API and probably practically the worst option.
  • More rigidly define the orientation of the hit test result transform: I ... don't like this. This is spending compute cycles on doing math that's not actually going to be used, and we'd have to pick some arbitrary convention.
  • Explicitly mention that this transform is not fully defined. Update the tests to be okay with that, i.e. by checking the position and "y axis rotated by orientation" directly (see the sketch below).
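
A minimal sketch of that check (plain quaternion algebra; hitPose is a hypothetical pose-style result, and the convention is the one above, where the transform maps the +Y ray onto the hit ray):

const t = hitPose.transform;  // XRRigidTransform
const q = t.orientation;      // quaternion {x, y, z, w}
const origin = t.position;    // the intersection point
const direction = {           // +Y basis vector rotated by q (the hit normal)
  x: 2 * (q.x * q.y - q.w * q.z),
  y: 1 - 2 * (q.x * q.x + q.z * q.z),
  z: 2 * (q.y * q.z + q.w * q.x),
};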

Thoughts? Is there a purpose for the result being a full pose?

cc @bialpio @toji

(also cc @thetuvix to inform OpenXR hit testing API choices)

Ray parameter as full matrix?

To be consistent with the rest of WebXR, the ray should be specified as a full 4x4 matrix rather than a position/direction vector pair. This is technically over-specified for a ray, and shearing a ray is not something that would be useful; however, the argument could be made that a matrix is more compatible with the math types acquired from other systems, so you can feed parameters in directly without having to generate an origin/direction pair.
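
For reference, going the other way is mechanical for a column-major 4x4 of the kind WebXR uses elsewhere (a sketch; assumes -Z is the ray's forward direction):

// m is a column-major Float32Array(16).
const origin = [m[12], m[13], m[14]];      // translation column
const direction = [-m[8], -m[9], -m[10]];  // negated Z basis vector = -Z axis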

Hit-test capability detection api ?

Is there a hit-test capability detection API available, or any plans to add one? E.g., supportsHittest() before using requestHitTest() on XRSession.

Order of hit tests and RAF

The current spec implies that the order in which hit tests are requested matters relative to queuing up the next frame:

Ideally, the system would resolve hit-tests before the next frame as long as they were called prior to the request for that frame (i.e. xrSession.requestAnimationFrame(...)).

Is this intentional, such that calling requestHitTest before versus after calling requestAnimationFrame makes a difference, assuming there are no other bottlenecks in the current environment? If so, this seems unintuitive and different from other RAF APIs, and doesn't appear to be motivated by any implementation constraints. Or is this a documentation clarity issue?

  session.requestHitTest(...); // Will be resolved before next frame
  session.requestAnimationFrame(onFrame);
  session.requestHitTest(...); // Will be resolved after next frame?

XRRay(transform) constructor doesn't make conceptual sense

Initialize ray’s origin to { x: 0.0, y: 0.0, z: 0.0, w: 1.0 }.

Initialize ray’s direction to { x: 0.0, y: 0.0, z: -1.0, w: 0.0 }.

Transform ray’s origin by premultiplying the transform’s matrix and set ray to the result.

Transform ray’s direction by premultiplying the transform’s matrix and set ray to the result.

The origin part makes sense; this just sets the origin to transform.position. But the direction part doesn't: it basically gives you the final position of (0, 0, -1) when transformed by the rigid transformation, which in our ray model is conceptually position + direction, assuming the ray we're looking for is the ray that the z axis of the XRRigidTransform's natural coordinate axes points towards.

Instead we should just rotate (0, 0, -1) by transform.orientation and call it a day.

Furthermore, Chromium isn't actually implementing these steps; it does this instead:

void XRRay::Set(const TransformationMatrix& matrix,
                ExceptionState& exception_state) {
  FloatPoint3D origin = matrix.MapPoint(FloatPoint3D(0, 0, 0));
  FloatPoint3D direction = matrix.MapPoint(FloatPoint3D(0, 0, -1));
  direction.Move(-origin.X(), -origin.Y(), -origin.Z());

  Set(origin, direction, exception_state);
}

What Chrome is doing is conceptually the same as "rotate by transform.orientation and call it a day", except with extra steps, since matrix * (0, 0, -1) is equal to translation * rotation * (0, 0, -1), and they're just cancelling out the translation in the next step.
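
"Call it a day" written out (a sketch; plain quaternion rotation of (0, 0, -1), no matrix math involved):

const q = transform.orientation;   // XRRigidTransform quaternion {x, y, z, w}
const origin = transform.position;
const direction = {                // (0, 0, -1) rotated by q
  x: -2 * (q.x * q.z + q.w * q.y),
  y: -2 * (q.y * q.z - q.w * q.x),
  z: -(1 - 2 * (q.x * q.x + q.y * q.y)),
};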

Define how the application can unsubscribe from hit test / remove hit test source

Currently, the explainer does not specify how the application could unsubscribe from the hit test.

I think we have a few possible solutions here:

  • rely on the GC to clean up the objects - if the application is no longer interested in hit test results, it can simply drop its references to XRHitTestSource
  • provide an explicit method on XRHitTestSource to signal that the application is no longer interested in obtaining the results - hit test source will be active until that happens, even if the application drops all references to its XRHitTestSource
  • provide an explicit method so that the app does not have to rely on GC, and also rely on the GC if the app drops all references to its XRHitTestSource
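
The third option might look like this in practice (a sketch; cancel() is a hypothetical name):

const hitTestSource = await xrSession.requestHitTestSource({space: viewerSpace});
// ...consume results for a while, then unsubscribe explicitly:
hitTestSource.cancel();
// Dropping the last reference without calling cancel() would also work,
// with the GC eventually cleaning up.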

can HitResult have an id for the underlying geometry it hit?

For most platforms, the result of a hit test will likely be against some native plane or mesh that the system knows about.

When a hit intersects with a piece of system geometry, it would be very useful to have an id so that future hits can be tested to see if they are intersecting the same underlying object. For example, I want to use hitTest to drag a reticle with my finger, but want to restrict the reticle to stay on the same piece of geometry as when the drag started, if possible.

We could define the ID as optional (i.e., if a hit is against something that has not been added to the system world model, such as ARKit hitting against a feature point that isn't in a plane yet), and systems that don't break down the world into stable pieces could return a single ID (e.g., "the-one-and-only-world-mesh") at their discretion.

Support for occlusion

@kearwood wrote: "We should describe the behavior of occlusion. Should we include XRHitResult's behind other geometry such as walls detected with the sensors? Should some materials (such as windows or virtual objects) be treated as physically transparent, or perhaps included in the XRHitResult sequence? I would propose that we explicitly identify which entry in the XRHitResult sequence is the nearest physically opaque object."

Decide on async API details

Should this use a Promise to return the results?

What is the timing of the results? Can it be guaranteed that the results will be returned prior to the next RAF? That would be ideal.

Should XRHitTestResult return an XRTransform instead of an XRPose?

Apologies that this feedback is coming a bit late, but I only thought about it within the last couple of days while utilizing the API.

XRPose currently only contains a transform and a boolean to indicate that the position was based on emulated data, the latter of which doesn't seem to apply to hit tests? (I can't imagine what it would mean in this context.) In the future, the pose may be extended to include the velocity of the tracked object, which also seems irrelevant in this context. As a result, it seems like we could return just a transform and it would still provide all the necessary information with less overhead?

support a synchronous version of the call

This would dramatically improve the ergonomics of the API for developers and improve the perceived accuracy / usability of the results.

Pros:

  • improved calling ergonomics
  • more accurate results in some cases, i.e. a reticle wouldn't be perpetually a frame behind

Concerns:

  • how do you handle out-of-band requests such as hit-test requests that come from controller events?
  • May be slow to jump to native and back to JS - potentially multiple milliseconds (could require batching, #19)
  • This could be challenging for all underlying implementations to support
  • The majority of the WebXR API is already async

Add feature detection for hit test capability

The spec draft does not provide any way to check for hit test support on an XR session.

One possible solution is to extend the definition of feature descriptor from the main spec to allow the apps to request a session that supports hit test. Additionally, we should probably extend XRSession as follows:

[Exposed=Window]
interface XRHitTestState {};

partial interface XRSession {
  readonly attribute XRHitTestState? hitTestState;
};

The hitTestState attribute will be null if the session does not support the hit test API, or if it was not requested and is therefore unavailable.

In addition, we could specify that immersive-ar sessions must support the hit test API, so a request for an immersive-ar session would always implicitly contain a hit test feature descriptor. immersive-vr sessions could still optionally support it (see #42).

Clarify which frame `requestAsyncHitTestResult()` should return results for

The first real-world hit testing features were added in PR immersive-web/webxr#493, including a requestAsyncHitTestResult() function. The initial design exposed this function off of an XRFrame, with the intention that the hit-test results would be relative to the frame on which they were requested.
In the PR comments and at the Jan '19 F2F, concerns were raised that this wasn't possible to implement on some platforms and, furthermore, would be at odds with developer intention.
There are a number of possible variations to the API shape, but before we propose anything concrete, we need a deeper understanding of the desired behavior and platform limitations.

Does requestHitTestSource require an "immersive-ar" session?

There is a statement in the non-normative introduction that the Hit Test module "builds on top of WebXR Augmented Reality Module." However, there is no other mention in the Hit Test module of any types, members, or enum values defined by the AR module.

In contrast, there is pending future work (#42) to elaborate on UAs intersecting the VR bounded floor plane using this module. Beyond that, a headset having an opaque display is fairly orthogonal to it having sufficient sensor capability to detect planes/meshes. This module should support both VR and AR headsets with those real-world geometry detection capabilities.

The spec should make clear that a UA may implement the hit-test module to intersect planes and meshes for either "immersive-vr" or "immersive-ar" sessions. We should then confirm that this intro sentence still adds clarity.

requestHitTest error callback without reason

I'm trying to use requestHitTest with a Galaxy S9 and Chrome Canary (71.0.3562.0).

I've enabled all of the necessary chrome://flags as described here, and this demo works great on my device.

When calling session.requestHitTest with await, I never get to the next line of code, and I see Uncaught (in promise) undefined in the console.

I've tried to get the error like that:

this.session.requestHitTest(origin, direction, frameOfRef)
    .then(
        (data) => { console.log(data); },
        (err) => { console.log(err); }
    );

but err is undefined.

requestHitTest parameters are:

origin = Float32Array(3) [0, 0, 0]
direction = Float32Array(3) [-0.0012410288909450173, 0.009024959988892078, -0.9999585151672363]
frameOfRef = XRFrameOfReference {emulatedHeight: 0, bounds: null}

Any idea what I'm doing wrong?
Can you provide an error value for the rejection?

Thank you.

Update hit-testing-explainer.md to match the current core WebXR spec

This explainer was originally written against an earlier version of the WebXR spec. It references two attributes that no longer exist in WebXR. There may be other such references as well.

  1. XRReferenceSpace.originOffset - the surrounding text can probably be simplified since the offset cannot be changed.
  2. XRSession.viewerSpace

README looks like the explainer

Landing on this repo, it looks like the README is the extent of the explainer, due to the overlapping content.

Solutions:

  • Move explainer.md to README.md
  • Remove most of the information from README.md, and link to explainer.md.
  • Remove README.md (meh)

Add support for screen-space version of the API?

Both ARCore and ARKit currently only provide a screen-space hit-test API. This API is extremely useful for AR on smartphones, though it is irrelevant to HMD AR. Should we support that API, or require the developer to convert from screen-space to a ray?

Hit Testing Virtual Objects

Issue #3 is related but not exactly the same. There we (correctly IMHO) decided that hit-test would not try to interact directly with virtual objects that are placed in the world but that hitTest would only return results of hits on real world objects detected by the system.

Applications will want to do hit tests of their virtual objects that are consistent with the hit testing the API is doing for real world objects. For example, an application may not want to display a reticle when the screen location the reticle is tracking is covered by a previously placed virtual object.

Given some of the complexity around async behavior etc. as indicated in #31, we should make sure there's enough information/explanation for applications to be able to do hit-tests on their virtual objects that work the same way as those on real-world objects.

Come up with a solution to handle transient input sources

We should come up with a solution to the problem of subscribing to transient input sources.

With the current proposal of automatically creating hit test sources, UAs (Chrome, and potentially others with a multi-process architecture) would be forced to perform hit tests proactively, without the application's involvement, in order to provide results that might then be left unused by the application. There are a few problems with this approach:

  • UAs might perform hit tests unnecessarily (if the application never cares about the result) every time a new transient input source gets created.
  • UAs would be forced to do so for all new transient input source instances, even though we cannot reliably predict what other kinds of transient input sources will be available in the future. The risk is that right now we have only one transient input source (touch screen input) that works well with the above approach, but the approach might break once a new type of transient input source that we didn't think about appears.
  • This approach does not allow applications to specify arbitrary rays to perform the hit test with. Depending on the desired use cases, this might not be a big issue.

This was already discussed during CG calls - I've decided to create an issue to capture different approaches that we're thinking about & decide on the best one.
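
One sketch of an explicit-subscription shape that avoids proactive hit testing (hypothetical parameters, along the lines discussed in #74) — the UA would only compute hits for transient inputs the application subscribed to:

const source = await xrSession.requestHitTestSourceForTransientInput({
  profile: 'generic-touchscreen',  // which transient inputs to track
  offsetRay: new XRRay()           // ray relative to the input source's space
});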

Related: immersive-web/webxr#819.

XRRay.direction should have a w of 1.

The direction attribute defines the ray’s 3-dimensional directional vector. The direction's w attribute MUST be 0.0 and the vector MUST be normalized to have a length of 1.0.

This doesn't make sense: if it's a vector, the w should be 1, not 0; homogeneous-space vectors with a w of 0 are meaningless.

Using hit test API for floor/bounds in VR mode

PR immersive-web/webxr#493 discusses "Combining virtual and real-world hit testing"; the intent appears to be that the WebXR API takes care of real-world understanding, while the application is responsible for virtual objects. For consistency, should real-world information provided by VR headsets also be handled by the hit test API?

Most of the reference spaces have a floor plane, and the bounded reference space adds xrReferenceSpace.boundsGeometry which is documented to be a recommended area in which the application should place content to ensure it's reachable. An application could use this information to do its own tests by treating them as virtual objects, but I think this is not very robust, and may lead to applications baking in assumptions that wouldn't be a good fit for advanced systems with better world understanding.

If we were to encourage VR applications to also use the hit test API to calculate intersections with the floor or with the bounds geometry, these applications would potentially be able to work unchanged on more advanced systems such as untethered VR headsets that work in an unbounded space.

It would still be up to the implementation to provide a safety chaperone or similar system to prevent people from running into things, but that tends to be immersion-breaking. I think encouraging use of the hit test API would be useful as a way to provide additional information about real-world geometry within VR applications, for example to show virtual objects as barriers to discourage people from getting too close to real limits.

xrReferenceSpace.boundsGeometry would still be useful as a static mechanism to provide overall bounds for the application, but it doesn't seem suitable as a dynamic mechanism, and "here's a good area to place reachable content" also seems to be a distinct use case from probing real-world boundaries. Typically I'd expect the boundsGeometry to be a simple rectangle or circle. Using it to provide detailed geometry for multiple rooms would be undesirable for privacy. It's also not available in unbounded mode, and an untethered headset with inside-out mapping may not have detailed bounds information available on application startup. The hit test API seems like a better mechanism to handle dynamic world understanding.

A user agent would be free to do hit testing just based on the floor plane and boundsGeometry (if available), including for cases where the user doesn't consent to providing more detailed world geometry to the application, so I think this would not be a large burden for implementations, but it would leave flexibility for using better data when available.

Is raycast returning real-world information only?

I'm wondering if the raycast system should return real-world information only, or if it should also return information about the virtual world which has been created.
If the hit-test were to return both, how could we differentiate real-world data from the virtual world?

For example, to place an object, the system might want to know if one has already been placed there, to prevent the user from placing two in the same place, but also to check the underlying real-world information in order to place an object underneath (like putting a carpet under a sofa, for example).

API should be on XRSession and always based on the upcoming frame's world state

If we have an asynchronous API on XRFrame, the results would be calculated based on the world state for that frame but wouldn't be available until just before the next frame. In the case where, for example, the height estimate for a plane that was hit changes, the object may be placed and then rendered in the subsequent frame in a location that leaves it floating above or embedded in the surface.

A way to resolve this, and to satisfy this line from the explainer - "hopefully it can be guaranteed that the results will be valid at least for the duration of the upcoming frame" - is to put hit-test on XRSession and always calculate it with respect to the subsequent frame. The order of operations would be:

  1. Update the Session to produce the new frame
  2. Calculate pending hit-tests against the new frame and then resolve promises with results
  3. Invoke the RAF for this frame - any objects that were created based on the promises in (2) will be rendered in an accurate location for this frame.

interaction between hit testing and an "AR lite" mode

Over in the webxr repo there was a discussion of how to create an AR lite mode (immersive-web/webxr#402) that got moved to the proposals repo (immersive-web/proposals#29).

I just posted a comment on how we can defer strong decisions on that, if we adopt the attitude that the hit test functionality might fail for reasons other than "the platform doesn't have enough world knowledge about where you tried to hit test" (which is, implicitly, the way hit test works right now).

Specifically, if we make it clear that the UA might restrict hit testing for other reasons (e.g., user choice), we can enable different browsers to experiment with different approaches to privacy, without requiring a more explicit "AR lite" mode (yet).

Thoughts?

Support reticle functionality (continuous hit-test subscription?)

Since having a reticle in the world is a common use-case, it would be nice to have an API for creating a subscription to a particular ray or screen-space coordinate and getting results every frame (possibly as part of the frame data?).

To support a controller reticle, which could be moving, maybe it needs a callback to get the current ray?
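
A sketch of what such a subscription could look like (a hypothetical shape: subscribe once, then poll each frame for results):

const reticleSource = await xrSession.requestHitTestSource({
  space: viewerSpace  // a controller's XRSpace would track a moving ray for free
});

function onFrame(time, frame) {
  const results = frame.getHitTestResults(reticleSource);
  if (results.length > 0) {
    // Place the reticle at results[0] for this frame.
  }
  xrSession.requestAnimationFrame(onFrame);
}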

readonly properties in XRRay and GC churn?

XRRay instances may be used in calls like requestHitTest reasonably frequently (e.g. a few times every frame), and because the origin/direction properties on XRRay are readonly (DOMPointReadOnly), it doesn't seem possible to reuse an existing XRRay instance across requestHitTest calls -- you're forced to construct a new one for every call. This can lead to GC churn. It's quite common practice in WebGL JS and matrix/vector libraries to reuse instances of vec3/matrix4/etc. to save on GC churn... might it be possible to spec the WebXR hit-test data types to allow for this? E.g. use DOMPoint instead of DOMPointReadOnly on XRRay (or similar).
