std::any::TypeId is not stable across binaries. This

This is the branch I'm working on: <a href="https://github.com/ashneverdawn/bevy/tree/

<a class="user-mention notranslate" data-hovercard-type="user" data-hover

I created a PR ( <a class="issue-link js-issue-link" data-error-text="Failed to load t

Create and use StableTypeId instead of TypeId,about bevyengine/bevy

Comments (70)

RalfJung commented on May 16, 2024 6

Btw, since I have not seen it said above: to my knowledge, type_name is not stable, either. You cannot rely on its output to remain consistent across Rust versions or even rebuilds. The docs explicitly say (emphasis mine):

This is intended for diagnostic use. The exact contents and format of the string returned are not specified, other than being a best-effort description of the type.

So, calling something StableTypeId that is based on type_name is a misnomer.

@kazimuth

you can't directly depend on multiple versions of the same crate but your dependencies can, which could cause an issue with e.g. third-party plugins.

I am pretty sure you can, actually, with something like

[dependencies]
serde0 = { package = "serde", version = "0.1" }
serde1 = { package = "serde", version = "1.0" }

from bevy.

BERADQ commented on May 16, 2024 4

This is how the crate looks so far. It has a basic idea, but it needs more work on the implementation.
StableTypeId

from bevy.

cart commented on May 16, 2024 3

I'd love that! I think "approximately unique" is fine, especially while we are in the experimentation phase. We will soon learn if the problems stated above are actual problems in practice. At the end of the day, I think a good user experience requires a stable version of the TypeId interface. How we generate that can change, but we need to start somewhere.

I think I would prefer to keep the name a generic StableTypeId or something similar, as the "TypeId instability problem" will likely surface in other places too. This is a Rust ecosystem gap, so other non-bevy folks might want to use what we come up with.

Also definitely try to make the StableTypeId::of::<T>() a const hash of std::any::type_name::<T>(). Otherwise we're back to expensive runtime hashing.

from bevy.

kazimuth commented on May 16, 2024 2

fyi, different versions of a single crate can be included in a dependency tree, so even without plugin loading I believe you can have multiple types with the same type_name but different layouts :/ you can't directly depend on multiple versions of the same crate but your dependencies can, which could cause an issue with e.g. third-party plugins.

one solution to speed up comparison + hashing would be to have a global type name interner at run time. it's not going to be written often so it doesn't need to be particularly fancy. it wouldn't speed up serialization, though. it also doesn't solve the uniqueness issue.

you could potentially reduce the chance of conflicts by checking mem::size_of, mem::align_of, etc, and hashing those when you intern types. if you get a mismatch, panic. obviously that won't catch all issues though.

oh, alternative strategy: you could add something to the Properties derive that creates a more unique type name. like

impl Properties for $T {
    fn unique_name() -> &'static str {
         concat!(env!("CARGO_PKG_NAME"), "@", env!("CARGO_PKG_VERSION"), "::", std::module_path!(), "::$T"))
    }
}

or whatever, and then require that all types that are serialized / pass through plugins implement Property. Would just need to make sure that wouldn't slow down incremental compiles.

from bevy.

cart commented on May 16, 2024 2

I also do like adding the cargo metadata to the path to help with ambiguity (provided it doesnt prevent us from const hashing).

from bevy.

kabergstrom commented on May 16, 2024 2

For FFI, a decent approach is a similar approach as with serialization: do type ID lookups once, then save the TypeId. TypeId can be transmuted to a 64-bit integer and thus the foreign language can store TypeId as u64.

from bevy.

ashneverdawn commented on May 16, 2024 2

I'm taking a stab at this. I'm unfamiliar with the area, so if anyone has any info to share, that would be very useful.

from bevy.

ashneverdawn commented on May 16, 2024 2

This is the branch I'm working on: https://github.com/ashneverdawn/bevy/tree/fix/dynamic_plugins
I've added the dynamic plugins example back and updated it.

Build the dynamic plugin:

cd examples/app/dynamic_plugin_loading/example_plugin/
cargo build

Running the example:

cargo run --example dynamic_plugin_loading

My first goal is to get the example to run without errors. Here's the current error.
thread '<unnamed>' panicked at 'Resource does not exist bevy_asset::assets::Assets<bevy_render::mesh::mesh::Mesh>', /Users/ashneverdawn/Downloads/bevy/crates/bevy_ecs/src/resource/resources.rs:163:32

A search of "TypeId" across the repo yields 101 results in 12 files.
I'm making progress, but slow. Any assistance is greatly welcome.

from bevy.

eddyb commented on May 16, 2024 2

@eddyb I'm not sure I understand exactly your point. Are you saying that TypeId's are actually rather stable in general?

My point was that if you did see TypeIds change, there's a chance your build setup differs enough, and it should be more than enough that it also implies you can't rely on being able to dynamically load that plugin and expect it to work correctly.
(Small aside: we don't guarantee you can dynamically load Rust code at all anyway, but there are some scenarios in which it happens to work today, though it's unsound to ever unload the plugin, as safe code may leave behind &'static and fn pointers - the only guaranteed-to-work plugin systems rely on things like wasm sandboxes instead).

Also I don't think anybody here ( as far as I know ) actually ran into a case where a TypeId was different, it is just documented that the TypeIds are unstable so we were taking that into account in our design ideas.

Alright, that explains a lot. But "TypeId is not stable across binaries" is not the phrasing I'd use to describe the motivation for not using TypeId, it can be interpreted as "we've seen this happen" (which, again, is a good indication that something is broken about how you build things). You seem to want more control and future-proofing than TypeId offers, which is understandable, but "we'd rather not risk relying on TypeId" is very different from "TypeId does X, Y, Z, which we don't want".

Also, I just noticed this part in the issue description:

Its worth exploring the idea of a StableTypeId, which is just a wrapped integer. I think it makes sense to use a const-hashing algorithm on type_name to produce StableTypeIds

This is a less accurate TypeId, in that it's very similar (though we don't go through an intermediary string inside rustc) but it relies on the debugging-only type_name, which may not offer enough detail to prevent collisions before you even hash it.
If TypeIds differ at least you know something went wrong, but type_name doesn't offer that.
I suppose if you had a way to get a mangled representation of a type (which sadly you can only get today through a reverse lookup of a function in the symbol table), that would be pretty much 1:1 with TypeId.

Of course, if there is no safety element to this, just opaque IDs, it doesn't really matter.

For scripting at least we will need some alternative source of TypeId's because those IDs could come from an entirely non-Rust source.

That makes sense, if you can't get a TypeId there's no point in using them, and the only argument I've seen in favor of not using TypeId today (as opposed to worrying about future-proofing).

from bevy.

zicklag commented on May 16, 2024 2

I created a PR ( #623 ) that has a commit migrating all use of TypeId to a new ComponentId that can represent external components. It could use review and comments.

from bevy.

eddyb commented on May 16, 2024 2

Could building the "host" as an example in the bevy repo and the "dynamic plugin" as a separate cargo project that references the bevy repo cause the inconsistencies I saw?

That could cause two incompatible builds of bevy, you should check target/{debug,release}/deps/libbevy-*.rlib in both directories - if they have different hashes in the filenames, you are already introducing UB (Rust doesn't guarantee the ABI is the same, though we sadly don't do as much as we could to make it break much earlier).

But even then, two recompiles of the plugin shouldn't result in different TypeIds of types defined in the plugin itself.

from bevy.

BERADQ commented on May 16, 2024 2

I tried the Stable TypeId in my organization Kokoro,

and while it might not be ideal yet,

it worked and performed well.

from bevy.

cart commented on May 16, 2024 1

I actually wasn't thinking about the forbid_unsafe and no_std combo providing "sandboxing" capabilities. I'll need to look in to that (and the safety guarantees the combo provides). As a brief aside, its worth noting that currently Bevy has no goal to be no_std, although I'm not against that either (provided it doesn't impact the user experience).

"Sandboxed dependencies" is a nice feature thats may be worth pursuing, but Bevy's primary goals are not "running untrusted dependencies" or even "memory safety". Usability and productivity will take priority over almost everything. I don't think the concerns you have will be encountered in the "common case" (or even the top 99% of cases) and the cost of resolving them (appears to be) either major runtime overhead or major ergonomics costs.

I do want to resolve these problems, but ideally we find a way to do it that doesn't require high run time costs, macros, or user facing "boilerplate".

For example, a blunt force tool would be to add an opt-in cargo feature that uses TypeIds instead of StableTypeId. The cost there is that you wouldn't be able to use dynamic plugin loading.

Another option is exploring the "compile time dupe detection", but im afraid that breaks down in a world with dynamic dependency loading. "Run time dupe detection" seems reasonable though.

from bevy.

mrobakowski commented on May 16, 2024 1

I agree with @kazimuth. The docs for type_name specifically mention that multiple different types can have the same name and so the name cannot be considered a unique identifier. The unique_name that has been proposed above could include even more info, like which features of the crate were enabled (because afaik you can even include a single crate version multiple times with different features). There's a nice list of available env vars in the cargo book.

from bevy.

cart commented on May 16, 2024 1

I've considered type_uuid, but I have a strong preference to make things work without additional derives (especially ones that require users to generate UUIDs. theres a lot of friction there).

UUIDs are also not human readable, which means if we use them in Scenes that basically eliminates the "html-like" scene composition we're shooting for here: https://gist.github.com/cart/3e77d6537e1a0979a69de5c6749b6bcb.

The "moving components" problem is resolvable if we use "short names" by default (which we currently do) and "long names" to resolve ambiguity. To do this properly and avoid accidental name stomping we need runtime collision checks that force developers to resolve ambiguity.

I'm fine with adding optional UUID support for folks that need explicit uniqueness and are willing to deal with the overhead, but in my head the cure is worse than the disease for the common case. I will likely be pretty stubborn here, but I'll try to be as reasonable as I can be.

Imo if rust paths are suitable for importing types in the rust language (and trusting that those imports are what we want them to be), then they should be suitable for "importing" things into a game's "component namespace".

from bevy.

zicklag commented on May 16, 2024 1

Imo if rust paths are suitable for importing types in the rust language (and trusting that those imports are what we want them to be), then they should be suitable for "importing" things into a game's "component namespace".

I agree that this should most-likely be fine, but we still have the issue of type_name() not being a const fn unless we use the nighly unstable feature ( tracking issue: rust-lang/rust#63084 ). This means that we get an extra performance hit of the allocation and hashing cost every time we call StableTypeId::of unless we compile on nightly. Is this acceptable. I don't know how often we need to actually get that TypeID.

Otherwise our only option is to use a derive macro anyway. :/

@zicklag's "component generation + file" idea is very clever and worth considering,

Idea actually came from the unique-type-id crate. :)

UUIDs are also not human readable, which means if we use them in Scenes that basically eliminates the "html-like" scene composition we're shooting for here: https://gist.github.com/cart/3e77d6537e1a0979a69de5c6749b6bcb.

Not that I'm trying to push this too hard, because I agree that it would be really nice to avoid the pain of UUIDs from a generation perspective and such, but you could include the type name -> UUID mappings from the file that is generated by the proc macro with something like a use statement and you could then use those type names inside of the scene files/prefabs.

That combination while annoying and preferable to avoid may still be not too horrible if somebody did have a use-case that needed it. I'm not sure if a feature flag would be appropriate or not, but I think we would really want to make sure that there is an actual use-case for it before we go with it.

Unless we just decide that we might as well get full UUID support like that if we'd have to add a derive because of the non const fn type_name function.

from bevy.

zicklag commented on May 16, 2024 1

OK, trying to think about every kind of option we have, it looks ( somewhat obviously ) like there are two realms of possibility, compile time and runtime TypeID generation.

I think that any runtime generation requires extra hashing ( 🙁 ), almost no matter what, unless we can somehow access the internal u64 representation of the TypeId and use that to index into a vector to avoid the hash. Anyone have any ideas how we might be able to leverage that? It looks like the internal u64 for the TypeId is private and there is no way to get it out, unfortunately, but I'm not sure if there are any other ways to handle it.

Also another important question is how often does the implementation actually need to get the TypeId? If it is only so often and not in hot loops then it might not be a big concern for performance. If it is in hot code paths, though, then we probably have to worry about it.

Otherwise, the only think I can think of in the compile time generation realm is to require registering components that you want to use with a macro, which I think is worth that using a derive macro: register_components!(MyComponent) or multiple at a time register_components!(MyComponent1, MyComponent2).

My feeling in the matter right now is that a derive would be worth it. So I totally understand your aversion to it @cart and I'm totally on board with the reasons not to do it, but consider the fact that almost every developer will probably want to derive at least Debug on their structs. All of the sudden, the extra cost of a derive goes from adding these characters, #[derive(Uuid)] ( 15 characters ), to , Uuid ( 6 characters ), granted this cost is per struct. Still, in the context of a Rust game engine, where most people in the community will value high performance, I think a large percentage of them would gladly add 6 characters to the top of their structs to get better performance out of the engine.

Additionally, you could try to find the shortest name possible for the derive macro to save as much as possible too. Maybe:

Tid ( for type ID )
Id
Sid ( for stable ID )

from bevy.

zicklag commented on May 16, 2024 1

For FFI, a decent approach is a similar approach as with serialization: do type ID lookups once, then save the TypeId. TypeId can be transmuted to a 64-bit integer and thus the foreign language can store TypeId as u64.

Interesting... Let me see if I've got this right:

So when the ECS needs to have a TypeId that it uses to reference a component type it could be an enum:

enum ComponentId {
  TypeId(TypeId),
  ExternalId(u64),
}

All internal types would use the generated Rust TypeID and the external types would get their ID's mapped to an External(u64) variant.

Edit: This would be how the ECS thinks about it. Then whenever type ids need come out of the ECS, it uses the mappings of type_name or UUID if available.

Wouldn't deriving the TypeId from the type's name mean that internal refactors might result in save-file incompatibility?

Not if we used a mapping from the TypeId to the type_name before serializing it.

I think this could work.

from bevy.

zicklag commented on May 16, 2024 1

Oh, cool @ashneverdawn. I might get to checking out what you're doing soon-ish, but I've got to finish some other work off first. 👍

from bevy.

ashneverdawn commented on May 16, 2024 1

It's been a while since I looked at this, but last I tried there were parts of bevy that would fail to dynamically link. Someone pointed me to this issue, so I thought TypeId may have been the issue. If I'm reading this right, it seems like maybe it wasn't.

Anyhow, the more I think about this, the more I feel like going the wasm route for dynamic loading as well as for scripting would be the wiser path.

from bevy.

cart commented on May 16, 2024 1

I'll dig into this in the near future / fully read all of these conversations, but here is what I've done so far in this space and what lead me to believe TypeIds were "unstable across binaries":

Add "dynamic plugin loading" to Bevy using libloading
Notice that systems aren't working as expected. ECS queries defined in the "host process", which rely on TypeId to look up type storages, were not returning entities created by the "dynamically loaded plugin", despite the fact that the rust types matched. Observe that these types are hashed differently.
Swap out TypeId for a custom "stable type id" that just contains the type_name internally.
Notice queries behave as expected.
After the move to Bevy ECS, I assumed we would need to solve the problem in a similar way.

From what @eddyb is saying, it sounds like the behavior from (2) was because of some "mismatch" in build configuration. If we can find a config that makes them stable across binaries, that would certainly be ideal. Given that bevy already supports dynamic plugins, this shouldn't be to hard to test.

I really appreciate that you took the time to help us out @eddyb! You've given me hope that we can have this feature without a ton of engineering work / tradeoffs.

from bevy.

eddyb commented on May 16, 2024 1

@cart Alright, so that's different from @zicklag's claim that there was never any mismatch observed.

Where do the types come from? The plugin itself, or some dependency shared between it and the host process?
Can you check that values like TypeId::of::<u8>() and TypeId::of::<String>() match between the two?
(if they don't, you're using different Rust versions, which is definitely not compatible)

After that, my above suggestion remains: if the TypeIds are different, then symbol names should too, in a similar way (especially in debug mode, TypeId::of itself will have a mangled symbol name that depends on the type in the same way TypeId does), and RUSTFLAGS=-Zsymbol-mangling-version=v0 should let you pinpoint where the difference is.

from bevy.

bjorn3 commented on May 16, 2024 1

Note that if you want to create a StableTypeId for scene files and networking, you should still keep using std::any::TypeId for dynamic plugins as this is the only way to ensure that the layout hasn't changed in a way that would cause memory unsafety.

from bevy.

Fidius-jko commented on May 16, 2024 1

Hello, an off-topic question, why did github automatically mention my issue? When did I add the comment there?

from bevy.

cart commented on May 16, 2024

This is even more important now that we're using bevy_ecs instead of legion. Dynamic plugin loading will be broken until we add this.

from bevy.

Plecra commented on May 16, 2024

How would a hashing algorithm deal with collisions? Obvious question, but I don't see how this issue can be addressed without resolving it 😶

from bevy.

cart commented on May 16, 2024

If you are discussing hash collisions where "a" and "b" hash to the same thing, i think we can choose one that is effectively "collision free" for the type names we would be using. The world runs on hashing strings.

For the case where there are two types that have the same "type name": I don't think thats a problem in practice. Within a crate there cannot be collisions. Across crates, crates.io enforces uniqueness.

I'd love your thoughts on this!

from bevy.

Plecra commented on May 16, 2024

I'm concerned about unsafe code making assumptions that any given TypeId must only have one representation, since I can't tell how that'd be avoided.

I think Rust's safety is best thought of as a binary property, and wouldn't like a solution that just made it really unlikely that you manage to write two types with the same hash (which could then proceed to cause UB). Especially as the algorithm would be public, and crates.io is an amazing attack vector.

That said, I could be completely unaware of some way of proving an algorithm "collision free" for type names, or there really is a way of verifying the hashes. ~~(On second thought, can the StableTypeId just contain a pre-hashed &'static str? Yea, probably...)~~ Ahw, same issue with the serialization

from bevy.

cart commented on May 16, 2024

Back when bevy used a legion fork, i solved the type id problem with &'static str. It definitely worked, but there are two issues:

Performance. Comparison (and hashing) is expensive and both are relatively common operations in the ECS implementation
We'd still be hashing the string because aspects of hecs rely on a hashed type id. The only difference between &'static str and StableTypeId is where the hash happens (run time vs compile time).

I think the cost/benefit here is quite low. The cost is "very very very unlikely collisions" and the benefit is dynamic plugin loading and no additional runtime cost. On the "malicious code" side, if you are taking a dependency on something, that is already trusted code. We will never be able to protect against malicious dependencies.

from bevy.

Plecra commented on May 16, 2024

What do you mean by that? There's a lot that can be done to defend from dangerous code, and Rust is an effort to empower developers in that sense.

As far as I'm aware, there shouldn't be any way for a forbid_unsafe, no_std crate to interact with the host machine, or mess with any memory I don't give it. That's incredibly useful for managing dependencies in a Foss context.

A soundness hole allows those otherwise sandboxed dependencies to trigger UB, and get up to whatever they like.

Instead, could this be done with a derive macro? The uniqueness of the IDs could then be verified at compile time, and compilation can fail if a collision is found. I do understand why you don't consider a priority - the chances are incredibly low. It's just a disappointing sacrifice to make (Side note: this flaw would mean that properly reviewing dependencies for safety would require a manual verification step anyway since a human can't know that a name is safe for their project)

from bevy.

Plecra commented on May 16, 2024

It'd be a shame to have to use some kind of boilerplate. The ease of use of components at the moment is just brilliant.

And given a decent hashing algorithm, I'd be surprised if this issue ever actually mattered. However much I like the idea, you're probably right that it isn't worth it.

Could this decision be added to the documentation if a better solution isn't found? I think it's only fair to users to make the choice visible, since it does contradict the technical meaning of the safe API. Hopefully it'd also give traction to an RFC for a real StableTypeId 😁

from bevy.

zicklag commented on May 16, 2024

Relevant to this discussion is #142. In order to have dynamically registered components, TypeId would need to become more of a ComponentId and would have to be suitable for components created outside of the Rust ecosystem as well.

from bevy.

zicklag commented on May 16, 2024

I opened a forum topic to find out if there is a way, without using specialization, to automatically implement a Component trait using the TypeId approach ( which I think could be adjusted to use const hash function and the type_name or something like that ) for any Rust type, but allow manually implementing it for things like scripted components. I'm trying to reduce the boilerplate as much as possible while allowing for custom component ids.

Unfortunately I think its the exact definition of the use-case for specialization: Any way to do this without specialization?.

from bevy.

cart commented on May 16, 2024

I'm thinking we could do something like this (pseudo-code):

StableTypeId(u64)

StableTypeId::of::<T>() == HashOf("actual::type::name::of::T")

let custom_type_id = StableTypeId::new(rand::<u64>())

from bevy.

zicklag commented on May 16, 2024

Oh, OK. That works. 😄 👍

I think that's easy, then. What do you think of renaming it ComponentId? That is what it is really for right? To uniquely identify components types, which with the advent of dynamic components are not necessarily synonymous with Rust types.

Also I think it is a good idea to add some of the Cargo env vars to the type_name like @kazimuth suggested. It won't solve all problems associated to collisions, but I think it will do a very close job of it.

I'll try to implement this.

from bevy.

Plecra commented on May 16, 2024

So btw, using the cargo metadata will require a procedural macro (since env::var is not a const fn).

Since this will force StableTypeId::of to be evaluated in the procedural macro anyway, we can have the macro check for collisions and have a properly bullet proof implementation. Which would effectively mean the only reason to use a "real" TypeId API is to avoid the impossibly rare collisions that this implementation could encounter

from bevy.

zicklag commented on May 16, 2024

So btw, using the cargo metadata will require a procedural macro (since env::var is not a const fn).

Actually we can just use the env!() macro which grabs environment variables at compile time.

we can have the macro check for collisions

How would the macro check for collisions?

I suppose if the procedural macro would log every id to a file as it was compiling hashing out the ids and then checked to make sure every id that it hashed was not one that was previously hashed maybe? I think I'm going to start just by keeping it simple and using env!() and a const fn. If we want to add an extremely bulletproof solution later with a proc macro we could, maybe putting it behind a feature flag because it would probably increase ( cold? ) compile times.

from bevy.

zicklag commented on May 16, 2024

Oh, well, unfortunately, type_name() is not stable as a const fn yet, it's an unstable nightly feature, which throws a wrench in the whole compile time type name hashing idea. The only thing I can think of then is to use a proc macro, but that requires you to add a derive to all your components which is not cool. Not horrible, but not cool. 😕

Any ideas?

from bevy.

kabergstrom commented on May 16, 2024

Have you considered a UUID as type ID as per https://docs.rs/type-uuid/0.1.2/type_uuid/ ?
Especially when it comes to persisted data, it can be rather troublesome to use a type's name as ID, since it prohibits moving or renaming the type. I can see a hash of the name working OK for the dynamic plugin and networking use-cases, but it may require some more care for persisted data.

from bevy.

zicklag commented on May 16, 2024

Yeah, I'm thinking that to get all the features we want we are just going to need to require a #[derive(Component)] for all components. It's kind of nice for demos, examples, tutorials, etc. to not need it, but I can't see a way around it if we want solid StableTypId's that will not conflict.

Question: do we need to change the TypeId every time there is a change to its layout? That makes it even more inconvenient. Otherwise I could just have the proc macro generate a uuid and save it to a file in the CARGO_DIR with a mapping of all of the types to their uuids.

At this point it seems like the type_uuid crate approach is the most effective, again, from a feature standpoint. We just need to come up with every way possible we can make it as convenient as we can.

Maybe something like this:

#[derive(Component)]
#[component(generation = 1)]
struct Mycomponent;

When you derive component the proc macro will create or update a file in you CARGO_DIR called type_ids.toml. It will map the type_name() to a uuid that it generates every time the component generation is changed. If you change the component's layout you then just have to update the generation and it will be sure to generate you a new type ID for the component.

from bevy.

cart commented on May 16, 2024

@zicklag's "component generation + file" idea is very clever and worth considering, but I'm still biased toward making paths work.

from bevy.

cart commented on May 16, 2024

Question: do we need to change the TypeId every time there is a change to its layout? That makes it even more inconvenient. Otherwise I could just have the proc macro generate a uuid and save it to a file in the CARGO_DIR with a mapping of all of the types to their uuids.

I see this as a non-goal for now. For things that need that additional information, we can track that separately.

from bevy.

cart commented on May 16, 2024

Arg thats on me for thinking type_name was already const (i was probably conflating it with TypeId::of::<T>() which is currently const in the rust beta, so it should hit stable soon). Sorry for sending you down the wrong path there. If type_name ever goes const I think we should revisit the issue, but I guess for now we need to make some hard choices.

Lets brainstorm as many approaches as we can.

Ex: How about a run-time correlation between type ids from other binaries? Basically a HashSet<TypeId, TypeUuid>. At registration time we could correlate TypeIds via their type_names, which would only need to happen once. The benefit here is that the public interface doesnt need to change and we have a "generic TypeId stability solution" that could be applied anywhere. The downside here is we would add yet another hash to any api that needs a TypeId.

from bevy.

kabergstrom commented on May 16, 2024

I think it's a bit unnecessary to replace TypeId with StableTypeId literally everywhere. For serialization, as long as you have a type registry, you only need to do the mapping between a StableTypeId and TypeId when you deserialize something, and back again when serializing. And on that note, it should be possible to offer multiple StableTypeId mappings to a TypeId:

UUID opt-in, if the TypeUuid trait is implemented
type_name (potentially ambiguous, but no extra effort needed for users)
Name aliases (opt-in through trait implementation perhaps)

As for the dynamic plugin scenario, is performance really that important here that type_name is not acceptable? Dynamic plugins are only intended for use in development, right?

from bevy.

zicklag commented on May 16, 2024

We do need a StableTypeId or something even in production for supporting scripted components ( #142 ).

In the case of having scripted components with IDs that must be generated outside of Rust, I think TypeIds will not work and we will need to use an alternative such as StableTypeId everywhere, but I could be missing something.

from bevy.

soruh commented on May 16, 2024

Wouldn't deriving the TypeId from the type's name mean that internal refactors might result in save-file incompatibility?

from bevy.

kabergstrom commented on May 16, 2024

Note that, for FFI, if you want to store types that are defined in the foreign language (which is fine - hecs/legion both use untyped storage internals), you will need a separate ID space from TypeId too. For example, (TypeId, u64) where the u32 is an arbitrary integer which the foreign language can use to store its type ID.

from bevy.

zicklag commented on May 16, 2024

Would the ComponentId::ExternalId(u64) enum not allow you to store IDs for components defined in foreign languages? Essentially the external languages would generate a random u64 for the ids for its own components and track that however they determine, and then it would feed those IDs to hecs and hecs would store those in a ComponentId::ExternalId. For all internally defined Rust types which already have a Rust TypeId it would use ComponentId::TypeId variants.

which is fine - hecs/legion both use untyped storage internals

Sigh... 😌 Yay, hopefully that will make the storage easier. I'm going to be trying to implement whatever's necessary for custom storage and I don't know much of this stuff, so I'm glad that won't require super major changes. 🙂

from bevy.

ashneverdawn commented on May 16, 2024

Is there a branch/fork for this yet?

from bevy.

eddyb commented on May 16, 2024

I'm really interested in how you're seeing different TypeIds, because that usually means there's no guarantees the ABI matches.
We are sadly extremely conservative (to the point where we let a lot of things appear compatible, where we probably shouldn't) in the interest of incremental recompilation (there may be more we could do though LLVM's limitations here have us a bit cornered).

My only guess is you're compiling the same type definition in two different crates (from the perspective of Cargo), or building the same crate in two different ways (that Cargo considers incompatible).

This will result in non-interoperable code, and the only reason it might seem otherwise is that we're really bad at introducing variation where we know for sure we don't want to allow interoperation (there's ideas like rust-lang/rust#38550 (comment) but they have to be implemented in order to "shake things loose").

("we" above refers to the Rust compiler team, I guess, though some of these things might be my own opinions)

from bevy.

zicklag commented on May 16, 2024

@eddyb I'm not sure I understand exactly your point. Are you saying that TypeId's are actually rather stable in general?

Even if they are almost stable, it isn't something we want to depend on unless it is like a development-only dependency such as dynamically loading modules ( assuming that is something you only do in development ).

For scripting at least we will need some alternative source of TypeId's because those IDs could come from an entirely non-Rust source.

Also I don't think anybody here ( as far as I know ) actually ran into a case where a TypeId was different, it is just documented that the TypeIds are unstable so we were taking that into account in our design ideas.

from bevy.

eddyb commented on May 16, 2024

TypeId shouldn't be able to cause dynamic linking errors. What I would suggest for debugging stuff like that is to build everything (including the program the plugin is loaded into - otherwise they won't be compatible) with RUSTFLAGS=-Zsymbol-mangling-version=v0, that way you can more easily tell what's wrong, when you e.g. compare nm outputs.

To demangle those symbols, cargo install rustfilt and then pipe/pass them to rustfilt -h (the -h will make it show crate disambiguator hashes, which you can use to tell whether you accidentally built an incompatible copy of some crate).

from bevy.

cart commented on May 16, 2024

First, just as a heads up: I'm still focused on other things (asset system stuff), so I've tabled the the followup TypeId stability work here until thats finished (hopefully within the next week). I like to focus on one thing without distractions and managing this project already introduces way more distractions than I'd prefer, so for everything else I have to choose my battles 😄

@eddyb

Where do the types come from? The plugin itself, or some dependency shared between it and the host process?
Can you check that values like TypeId::of::() and TypeId::of::() match between the two?
(if they don't, you're using different Rust versions, which is definitely not compatible)

Its a bit of both. The setup is like this:

host process (main.rs that depends on the bevy crate to do the following):
- runs the app itself (windowing / rendering / etc)
- registers/runs built in statically loaded "plugins" (this is just a trait impl)
- loads/registers dynamically loaded "plugins" via libloading
- runs the entity-component-system schedule
  - this is where TypeId comes in. Bevy plugin logic is implemented as ECS systems. The ECS uses TypeIds to look up component storages for use in systems. Some components are registered statically (and therefore use types from the host binary) and some are registered dynamically (and therefore use types from the dynamic plugin lib)
dynamic plugin (lib.rs that also depends on the bevy crate)
- registers its own systems / TypeId storages using a pointer to an App Schedule passed in by the host process.

@RalfJung

I am definitely aware of the limitations of type_name, but (1) in practice they dont change often / at all (2) this is a "same-machine same-compiler" feature to improve developer iteration times. Coming from the mindset of "TypeIds are completely unstable across binaries", I think StableTypeId is a reasonable fit for the context. But something like TypeNameId seems like a reasonable alternative that could avoid misunderstandings.

We do currently use derivations of type_name for "human readable" type identifiers in our scene format. By default we make it as short as possible (ex: path::to::Type becomes Type), then we use long names to resolve ambiguity. I understand that the rust team doesn't want to commit to a "type_name" spec, but in practice I think the "short names" are trustable and I am not of aware of any better tools available to us that we get "for free". Forcing people to derive TypeUuid for all of their components isn't my preference, nor is making them manually define type names. I want to rely on the existing rust type names and my users shouldn't need to re-define them given that the compiler/std already know them.

from bevy.

RalfJung commented on May 16, 2024

I am definitely aware of the limitations of type_name, but (1) in practice they dont change often / at all (2) this is a "same-machine same-compiler" feature to improve developer iteration times. Coming from the mindset of "TypeIds are completely unstable across binaries", I think StableTypeId is a reasonable fit for the context. But something like TypeNameId seems like a reasonable alternative that could avoid misunderstandings.

Besides relying on "de-facto stability for same-machine same-compiler" (not an unreasonable assumption even if not sanctioned by the spec), are you not also relying on different types having different type_name? That is not the case; there can be multiple crates with the same name and AFAIK then the types in them can have the same type_name as well.

from bevy.

cart commented on May 16, 2024

@RalfJung

Yup we are (currently) relying on this. If there is ever a direct collision, we can throw an error. In practice, I'm not particularly worried about crate name collisions, especially given that most plugins will be published to crates.io. I understand that this may feel "bad" and "not rigorous", especially coming from a Rust-ey mindset. And for generic solutions to this problem I would agree. But if you start with the notion of "we need short type names for human readable scene files", then we have already introduced collision issues.

If we instead had people implement trait TypeName { fn get_short_name() -> &str } manually for each type, we would still have collision issues.

The only rigorous solution that I know of is to use UUIDs everywhere, which is not a tradeoff I ~~do not~~ want to make.

from bevy.

eddyb commented on May 16, 2024

If you are rebuilding the same Cargo project to obtain a newer build of the plugin, and this is a feature meant primarily for development uses, please use TypeId. If it breaks, that's great, because it means you're doing something wrong all of a sudden.

It's a very weak test for basic things like "did I upgrade my Rust version and forgot to recompile the host binary" (if you're not linking bevy as a dylib, you will get duplication of everything between the host and the plugin, so things will appear to work even when they shouldn't - and I guess even if you do have a common dylib it will also probably succeed dynamic loading even if it's an incompatible version).

At the very least, if you move to something like "language-agnostic ComponentId", consider checking, on plugin loading, that things like TypeId::of::<bevy::SomeType>() agree between the host and the plugin. If they don't, do not run the plugin.
This is the least you can do to make sure you're not accidentally loading incompatible code.

from bevy.

RalfJung commented on May 16, 2024

In practice, I'm not particularly worried about crate name collisions, especially given that most plugins will be published to crates.io.

I'd expect crate name collisions to mostly arise because the crate tree contains two distinct versions of the same crates.io package -- but probably that is also unlikely in your case.

from bevy.

zicklag commented on May 16, 2024

I like to focus on one thing without distractions and managing this project already introduces way more distractions than I'd prefer, so for everything else I have to choose my battles smile

I totally understand, the amount of notifications that this repo is triggering is insane. 😅 Get back to this whenever you can, no problem. 👍

To give more context on my PR:

I know a lot of the point of this discussion is to enable dynamically loaded plugins using Rust dylibs with the expressed purpose of only being used for development and for improving compile times, but the other reason some alternative form of TypeId is needed is for scripted systems which have no Rust TypeId regardless of its stability.

My PR is supposed to remedy that problem of facilitating external component IDs.

Currently my ComponentId is defined as an enum that is either a RustTypeId(TypeId) or an ExternalId(u64), but if we decided to use something like a TypeNameId to handle the dynamic Rust plugin loading, we could just change RustTypeId(TypeId) to RustTypeId(TypeNameId) I think.

from bevy.

cart commented on May 16, 2024

@eddyb

Thats all very good to know. I'm certainly inclined to try to make TypeId work at this point for in-memory type identifiers. Could building the "host" as an example in the bevy repo and the "dynamic plugin" as a separate cargo project that references the bevy repo cause the inconsistencies I saw?

@RalfJung

Yeah that makes sense. Double-registering two versions of a component in the Bevy component registry (which is required if you want to read/write a component in scenes) is something we'd want to prevent anyway, so I think that issue is largely mitigated.

@zicklag

Yeah "production scripting/plugins" will definitely need to rely on something like ExternalId. At first glance your PR seems like the right approach.

from bevy.

CAD97 commented on May 16, 2024

Since I don't see it linked already, the Rust issue on TypeId collisions: rust-lang/rust#10389

Further discussion of possible collision-free schemes are discussed there, as well as a report of someone running into a TypeId collision in real code.

from bevy.

alercah commented on May 16, 2024

I believe that different Cargo projects could easily differ by using different versions of dependencies, given how Cargo.lock works. The Cargo.lock in any dependency is ignored when building. Care would need to be taken to ensure that the dependencies are identical between the loading binary and the loaded library.

from bevy.

alice-i-cecile commented on May 16, 2024

@cart, from what I understand @MinerSebas believes this is resolved for 0.5 following ECS v2. Does this match your understanding?

from bevy.

MinerSebas commented on May 16, 2024

My reasoning for closing this Issue comes from this Part of the 0.5 Blogpost:

Preparation for Scripting Support
[...] Components also no longer require Rust TypeIds.

from bevy.

cart commented on May 16, 2024

Thats related, but not directly to this. The issue here is that in the past we've encountered "mismatched type ids" for the same type when dynamically loading plugins. The way I solved this in pre-release versions of Bevy (that used legion) was to create "stable type ids" derived from type names and use them instead of TypeId.

We've had experts tell us in this thread that this instability isn't because type ids are inherently unstable across binaries, but because the binaries were compiled with mismatched configuration. I'd say we haven't actually resolved this until we have dynamic plugin loading working again. I haven't tried in awhile, so its possible that we have already fixed the "config mismatch" issue. If it still doesn't work, we'll need to find the source of the incompatibility.

from bevy.

bjorn3 commented on May 16, 2024

The type id is calculated at https://github.com/rust-lang/rust/blob/a6e7a5aa5dc842ddb64607040aefb6ec33e22b59/compiler/rustc_middle/src/ty/util.rs#L154-L171 This should result in the same type id for as long as the type definition is (recursively) identical and the types crate it is defined in has the same StableCrateHash which as the name implies is meant to be stable across different compilations of the same crate. It only depends on the crate name and the disambiguator passed using -Cmetadata by cargo. This disambiguator depends on things like the profile you are compiling with, the rustc version and the package name and version, but not on the actual source code.

If you include!() the same file in multiple binaries, the type id will be different, but if you define the types in a shared crate, it should stay constant. Using it for scene files and networking is not a good idea as the type id depends on the compiler version and can be influenced by #[cfg] inside the standard library. Using it for dynamic plugins should be fine.

from bevy.

cart commented on May 16, 2024

Awesome that makes sense to me.

from bevy.

HindrikStegenga commented on May 16, 2024

In my personal ECS implementation I worked around this by implementing a const fnv1a hash function, which defaults to using std::any::typename. Obviously, this requires #[const_type_name] but this is not an issue for me as I require nightly for now regardless. Secondly, I allow for overriding on a type level (in case of a collision) for each component type in my Component trait. For me this approach works really well, even across library boundaries.

from bevy.

Fidius-jko commented on May 16, 2024

How do you like the idea of creating a TypeId not only from the name and path, but also from fields?

from bevy.

BERADQ commented on May 16, 2024

How do you like the idea of creating a TypeId not only from the name and path, but also from fields?

Yes, that is the implementation of stable typeid.

from bevy.

Fidius-jko commented on May 16, 2024

How do you like the idea of storing not only the TypeId but also the TypeId type in the StableTypeId? Because if someone wants to use bevy in Java, then the question arises: how to synchronize the TypeId from Java and the TypeId from Rust so that there are as few conflicts as possible, and just It would be possible to indicate which TypeId is used, for Rust, for example, 0, and for Java 1 (or whichever user wants)?

pub struct StableTypeId(/*TypeId*/, TypeOfTypeId)

from bevy.

Create and use StableTypeId instead of TypeId about bevy HOT 70 OPEN

Comments (70)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent