Comments (17)
Hi, I am UBsan, and I'm currently complaining about this.
from wasm-c-api.
There's a straight-forward fix:
- make the constructors defaulted and protected
- make the destructor protected
- remove the overloaded operator delete
- add a private destroy() method
- change
own<T>
fromunique_ptr<T>
tounique_ptr<T, Deleter>
where Deleter calls destroy() - make the classes friends with their Deleters so that unique_ptr can delete them but
Foo::make()->destroy()
doesn't compile (if it did, it would lead to a double-free).
Then implementers can:
- create their own implementation classes that public-derive from these classes and add their own additional member variables and methods
- define the declared make() method to always create objects of the matching derived type.
- define the other methods by static_cast'ing the (this) pointer to the derived type which is always valid since the only way to create one is with the make() method that always creates the derived type anyways
- like the other methods destroy() casts to derived type, then calls "delete casted_this;" or calls a derived destroy() which does "delete this;"
Example:
wasm.hh has:
template <typename T>
struct Destroyer {
void operator()(T *ptr) {
ptr->destroy();
}
};
template<class T> using own = std::unique_ptr<T, Destroyer<T>>;
template<class T>
auto make_own(T* x) -> own<T> { return own<T>(x); }
class WASM_API_EXTERN Config {
friend struct Destroyer<Config>;
void destroy();
protected:
Config() = default;
~Config();
public:
static auto make() -> own<Config>;
// Implementations may provide custom methods for manipulating Configs.
};
and impl-wasm.cc has:
class ImplConfig : public Config {
public:
ImplConfig() = default;
~ImplConfig() {}
static ImplConfig *from(Config *config) {
return static_cast<ImplConfig *>(config);
}
private:
int impl_state;
};
Config::~Config() {}
auto Config::make() -> own<Config> { return make_own<Config>(new ImplConfig); }
void Config::destroy() { delete ImplConfig::from(this); }
A few comments on this approach:
-
A
unique_ptr<Derived, Deleter<Derived>>
doesn't freely cast to aunique_ptr<Base, Deleter<Base>>
. I think the easiest fix it to pass the explicit T to make_own when doing a make + downcast combined as shown in my definition of make() above. -
With this approach the potential bug of deleting the object of base type is prevented:
void test(Config *config) {
delete config;
}
dt.cc:52:10: error: calling a protected destructor of class 'Config'
delete config;
^
dt.cc:20:3: note: declared protected here
~Config();
^
which is necessary if we expect the implementer to extend the size of the object when deriving.
- Manual memory management is still possible but obvious in code review.
void ok1() {
auto x = Config::make();
}
void ok2() {
auto x = Config::make().release();
Destroyer<Config>()(x);
}
void error1() {
Config::make()->destroy();
}
dt.cc:63:19: error: 'destroy' is a private member of 'Config'
Config::make()->destroy();
^
dt.cc:19:8: note: implicitly declared private here
void destroy();
^
The other main alternative is to use virtual
. I assume we're avoiding that?
from wasm-c-api.
I've mostly implemented it in the draft PR #161. There's a pre-existing bug with Shared<> that I'm not sure what approach to take to address and am looking for input. wasm-v8.cc defines an explicit template specialization for Shared<Module>::~Shared()
, which is a problem because no specialization was declared and the destructor (or now destroy()
in my approach) may be implicitly instantiated. I know this aspect of C++ templates is not well understood so I'll give a primer:
- implicit instantiation. The template is defined generally over some template parameters and you use that pattern. This is what happens when you use
std::vector<char>
for instance. For implicit instantiations, multiple definitions is not a linker error and it is optional whether the compiler chooses to emit the the symbols to the .o file (for example, it may depend on the optimization level whether a function template is inlined into all callers so now there is no need for a symbol -- this is what leads to most cases of mysterious C++ template linking errors). - explicit template instantiation definition. Emits a definition to the .o file for all the symbols in this template. Two definitions may or may not be a linker error, but the definition must be present. For example (
template Shared<int>;
). - explicit template instantiation declaration. Added in C++11. Informs the compiler that the template is expected to be provided by another .o file, in conjunction with clang's -Wundefined-var-template and -Wundefined-func-template to prevent those mysterious linking errors. (
extern template Shared<int>;
). - explicit template specialization definition. Replace the body of a template pattern (a whole class or a function or variable) with a different one for a particular set of template arguments. (
template<> struct Shared<int> { /* ... */ };
) - explicit template specialization declaration. Declare that there exists an explicit specialization for this particular set of template arguments. (
template<> struct Shared<int>;
)
(not discussed: partial templates or template templates)
It's incorrect to call an explicit specialization when expecting an instantiation per https://eel.is/c++draft/temp.expl.spec#7.sentence-1 . So this combination:
wasm.hh:
class WASM_API_EXTERN Shared {
public:
Shared() = delete;
~Shared();
void operator delete(void*);
};
wasm-v8.cc:
template<>
Shared<Module>::~Shared() {
stats.free(Stats::MODULE, this, Stats::SHARED);
impl(this)->~vec();
}
is a non-starter. (If it helps, you can imagine that an explicit specialization has a different type signature, so it could be mangled differently, though both popular ABIs chose to mangle them the same which is why it still links.) Any code that might trigger implicit instantiation of the declaration of d'tor/destroy() must know that this is defined with an explicit specialization.
How would we do that? We could forbid C++ implementations from using an explicit specialization. We could tell users that it's normal to have to include a second vendor-specific header file even though the entire interface defined in wasm.hh is portable at the source level. We could declare all the explicit specializations in wasm.hh and require all implementations to define for each specialization? We could replace Shared<T>
with a non-template?
from wasm-c-api.
I believe that this pattern is fine, because the objects are not accessed through a pointer of the respective API type: the implementations of all functions cast the opaque API pointer back to its real type first. (I could be wrong, but that's my reading of the spec and the code. FWIW, UBSan is currently not complaining.)
from wasm-c-api.
UBSan is currently not complaining.
Absence of evidence is not evidence of absence.
See also class.mfct.non-static/2:
If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.
from wasm-c-api.
Do folks have any suggestion how to achieve implementation independence here without using such a cast or making all methods static? (Though, practically speaking, V8 itself relies on patterns like this so fundamentally that I doubt the ones here ever matter.)
from wasm-c-api.
Hi, I am UBsan, and I'm currently complaining about this.
Well played :-)
Absence of evidence is not evidence of absence.
I know, and in particular UBSan only provides partial coverage. Hence "FWIW".
If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.
TIL. Well, for whatever that's worth, this is a common pattern for libraries, so it's unlikely that compilers will be able to afford to break it, but it's good to know that this actually is UB. In that case, I would support moving to a different design, however I don't have a good suggestion for what might be such a design.
from wasm-c-api.
@rossberg do you have a link to where this is done in V8? Would be interested in having a look!
I'll have another look to tomorrow and see if I can think of another possibility which maintains the performance characteristics while being safer standards-wise and bit easier to work with, unless someone beats me to it 😄.
from wasm-c-api.
@TartanLlama, just look at the implementation of the factory class. Pretty much every allocation function there goes through some of various general mechanisms for allocating on the GC'ed heap and then casts the result to one of the gazillion object classes. With everything going on in a real-world GC, I have a hard time imagining a sane alternative to this approach. And there probably are many far worse uses of what's technically UB all over the place.
Would be a fun challenge to write a production VM in C++ without any UB. I would make any bet that that's outright impossible. :)
from wasm-c-api.
@rossberg : V8's heap objects actually don't use this pattern any more :-)
But V8's API still uses it, and AFAIK it's a fairly common pattern for libraries to have public types that are essentially opaque pointers that get reinterpret_casted to their internal equivalents. The benefits are that (1) the internal classes' implementations are hidden from the public API, and (2) you can use idiomatic member function calls on the external types. Unless I'm missing something, using inheritance would break (1) for non-trivial class hierarchies (maybe unless one uses multiple inheritance?), while switching to static functions would obviously break (2).
Maybe something could be built using composition with a forward-declared pointer to the internal object, roughly:
//// public.h
namespace internal {
class Internal;
}
class Public {
public:
void foo();
private:
internal::Internal* impl_;
};
//// internal.cc
namespace internal {
class Internal {
public:
foo() { /* real implementation */ }
private:
int field_;
};
} // namespace internal
Public::foo() { impl_->foo(); }
But I haven't really thought through the implications of that yet, e.g. regarding possible overhead, construction/destruction/lifetime management issues, or any non-obvious corner cases.
Also, having to forward-declare a bunch of internal classes might be fine for a project-specific API like V8's, but seems unfortunate for an implementation-independent API like the wasm-c-api. So I'm not arguing in favor of this particular approach, just putting the idea out there.
Would be a fun challenge to write a production VM in C++ without any UB. I would make any bet that that's outright impossible.
My understanding is that strictly speaking it's even impossible to implement hello-world in C++ without relying on UB, because one has to assume that any program requires a non-zero amount of stack space, which might not be available in some environments, and behavior on stack overflow is undefined.
from wasm-c-api.
This comment serves as something of a footnote. As far as I know, the existing header file can be implemented without UB in C++17. You just have to construct an object with only a single deleted constructor.
No problem.
class WASM_API_EXTERN WasmerConfig : public Config {
public:
WasmerConfig() : Config{} {}
static WasmerConfig *from(void *base) {
return reinterpret_cast<WasmerConfig *>(base);
}
private:
// TODO: custom config state
};
Config::~Config() {}
void Config::operator delete(void *ptr) { delete WasmerConfig::from(ptr); }
auto Config::make() -> own<Config> { return make_own(new WasmerConfig); }
// TODO: add custom config calls
Note that the base object initialization used Config{}
and not Config()
because uniform initialization syntax treats Config as an aggregate and we bypass the deleted constructor by directly initializing its elements (all zero of them).
This is no longer possible in C++20 because classes with constructors (including deleted constructors) are not considered aggregates for the purposes of uniform initialization.
from wasm-c-api.
@nlewycky, thanks for that! The only downside I see is a further increase of the boilerplate in the header file's class definitions, but this is C++, so who cares.
I assume an empty deleter struct will not increase the size of the unique_ptr object? (The current C implementation on top of the C++ API does some really verboten reinterpret casts between unique_ptr<T>*
and T**
to avoid copying vectors. =:-} )
from wasm-c-api.
@rossberg Strictly speaking there's no guarantee about the size or representation of a unique_ptr in any case, but I'd be surprised if you find any production-quality C++ standard library where a unique_ptr with an empty deleter is not simply the pointer. If you were happy with reinterpret_cast
before, you should be just as happy afterwards.
Related: https://stackoverflow.com/questions/13460395/how-can-stdunique-ptr-have-no-size-overhead
from wasm-c-api.
Can I interest you in creating a PR for making this change? (including adapting the wasm-v8.cc prototype?) :)
from wasm-c-api.
Thanks a lot for the help, I'll have a look at your PR soon.
As for Shared, IIRC my intention was that the template specialisation for each sharable type (currently, only Module) be declared in wasm.hh, since the implementation is always going to be type-specific. I just forgot doing that, and the toolchain did not complain. Do you foresee a problem with that path? (The reason for using a template is so that there is a natural, uniform way for writing shared types.)
from wasm-c-api.
I think there might be an issue with that, but we can defer it to another issue if there is. Making a specialization of a struct means that you replace the entire innards of the struct.
template<typename T> class Shared { int x; };
template<> class Shared<char> { using x = char; };
There's no need for the specialization's contents to have any relationship with the template pattern, so the API shown in wasm.hh isn't necessarily the API that the specialization will have. Offhand I'm not sure if there's any special-case rules that come into play if the specialization is declared but not defined and a template pattern is provided, etc.
from wasm-c-api.
Ah, I may have been using inaccurate terminology above. I meant that the class template specialisation for Shared<Module>
is defined in the header, but its methods are merely declared.
My understanding is that a class template specialization that is declared but not defined yields an incomplete type, so is only useful as a forward declaration.
from wasm-c-api.
Related Issues (20)
- Embedding multiple engines implementing wasm C API HOT 3
- undefined reference to `wasm::vec<char>::free_data()'
- Syslog HOT 2
- Custom Sections API HOT 4
- WASM proposals supporting in Wasm C API HOT 1
- API to Terminate Execution for a wasm::Store
- Expectations for serialize/deserialize HOT 4
- make error:error: use of undeclared identifier 'errc' struct is_error_condition_enum<errc> HOT 1
- docker build fails in C++ Multi test HOT 2
- AddressSanitizer: allocation-size-too-big HOT 3
- Access of v8 isolate possible? HOT 1
- Deprecate wasm-c-api on big-endian hosts HOT 12
- Support import globals callback HOT 6
- posix shared memory with wamr
- Metering API
- `vec<T>::make_nt` may not be so helpful HOT 1
- Please tag when used in wabt
- memory64 support
- Type constructors should have a `wasm_engine_t*` parameter HOT 10
- How to import a native func to "env"? HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from wasm-c-api.