
tool-conventions's Introduction

WebAssembly Tool Conventions

This repository holds documents describing conventions useful for coordinating interoperability between wasm-related tools. This includes descriptions of intermediate file formats, conventions for mapping high-level language types, names, and abstraction features to WebAssembly types, identifiers, and implementations, and schemes for supporting debuggers or other tools.

These conventions are not part of the WebAssembly standard, and WebAssembly-consuming implementations are not required to follow them in order to execute WebAssembly code. Tools producing and working with WebAssembly in other ways also need not follow any of these conventions. They exist only to support tools that wish to interoperate with other tools at a higher abstraction level than just WebAssembly itself.

These conventions are also not exclusive: there could be multiple conventions for a given language for a given purpose. There are natural benefits to interoperability, but having more than one way to do things can also make sense in many circumstances.

tool-conventions's People

Contributors

aheejin, alexcrichton, bnjbvr, dschuff, fitzgen, jayphelps, jedisct1, jeffcharles, kateinoigakukun, keno, kripken, lukewagner, maxdesiatov, mwilliamson, navytux, nomeata, nwilson, nxdong, penzn, poignardazur, quantum5, rongjiecomputer, rreverser, sbc100, sunfishcode, tlively, xtuc, yamt, yuri91, yurydelendik


tool-conventions's Issues

Reloc target should probably be required to be the same as reloc entry

We currently don't require the 5-byte patchable ULEB in code and elsewhere to contain anything since all information will be overwritten by the linker based on the reloc entry. By convention we put an index there that matches the current file, but this is not required.

Tools like wasm-validate, however, ignore the linking section and will report validation errors if the ULEB is simply left as 0.

I propose that either:

  1. We state that a patchable ULEB that does not correspond to its reloc is invalid, and maybe even enforce so in LLD or other consumers.
  2. If we really don't care what is in here, we should make tools like wasm-validate always take the linking section into account and ignore the ULEB.
  3. Don't store the value twice, i.e. remove it from the reloc entry :)

I'm guessing 1. is most practical.

Also noticed that at least for some paths, LLVM is currently hard-coding a zero for these ULEBs, I may look into fixing that: https://github.com/llvm-mirror/llvm/blob/master/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCCodeEmitter.cpp#L166
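For reference, the patchable field in question is an ordinary ULEB128 padded out to exactly five bytes so the linker can overwrite it in place without resizing the code. A minimal C sketch of that encoding (the helper name is illustrative):

#include <stdint.h>

/* Write `value` as a ULEB128 padded to exactly 5 bytes: continuation bit set
 * on the first four bytes and clear on the last, so any 32-bit index fits and
 * the linker can patch the field in place. */
void write_padded_uleb128(uint8_t out[5], uint32_t value) {
    for (int i = 0; i < 5; i++) {
        uint8_t byte = value & 0x7f;
        value >>= 7;
        if (i < 4)
            byte |= 0x80;
        out[i] = byte;
    }
}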

@sbc100 @binji

Encoding data segment alignment and BSS size

In order for a linker to lay out static data from different objects, it needs to know the alignment of each object's data segment(s).

We also need to know the total static data size (including bss).

In the current s2wasm world the latter is encoded via the "staticBump" metadata field which s2wasm produces. The former is not a problem since llvm-link does this part.

So I think we need to store this information in each relocatable wasm object. The question is then how we should store it. We could create specially named globals "__data_alignment" and "__data_size", or we could add these as entries in the linking metadata section. I'm leaning towards the latter, especially since down the line we may end up with many data segments per object (-fdata-sections).
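To illustrate why the alignment needs to be recorded somewhere, here is a plain C example of object data the linker would have to place correctly when concatenating segments (names are illustrative):

#include <stdint.h>

/* The linker must place this at a 16-byte boundary when it merges data from
 * multiple objects, so the object's alignment requirement must be recorded. */
__attribute__((aligned(16))) static uint8_t simd_buffer[64];

/* Uninitialized (BSS) data carries no bytes in a data segment but still
 * contributes to the total static data size. */
static uint32_t counters[1024];

uint8_t *get_buffer(void) { return simd_buffer; }
uint32_t *get_counters(void) { return counters; }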

Using only .functype directive for both defined and declared functions

In .s assembly, currently we use .param and .result directives to denote defined function types, as in

.param i32, i64
.result i32

And .functype directive to denote declared (= undefined) function types, as in

.functype foo, void, i32, i64  

(The first type is the return type)

@sbc100 suggested that we unify these to just .functype directive, so we can use the single .functype directive for both defined and declared functions. We currently don't support parsing of .functype, but after we do, I think this is a cleaner option. What do you think?

@dschuff @sunfishcode @sbc100 @aardappel

Import attribute

Hello,

As there is a reflection about a dedicated export attribute in #64, I wondered if there would be any interest for an import attribute?

With a string argument, it may allow imports from multiple modules/host-namespaces, lifting the restriction on --allow-undefined-file to import only from "env".
It would make it more convenient for embedders (to specify their APIs) and their users (to compile against them).

// env.h
__attribute__((wasm_import("env"))) void print_i32(int32_t);
__attribute__((wasm_import("env"))) void print_i64(int64_t);
...

// test.c
#include <env.h>
__attribute__((wasm_export)) void hello() { print_i32(789); }

Thanks,
JB.

Supporting custom `global` directives

I've been doing some toying around with seeing what the wasm threads proposal would look like for Rust recently, and along the way I think that one feature that'd be great to support is the ability to define custom global directives in one way or another.

AFAIK a global basically acts like a "thread local", being tied to a particular instance instead of a global WebAssembly.Module. This works great, for example, for the stack pointer that LLVM uses (which is stored in a global).

Some other uses of global, though, could be something like:

  • Storing an address for thread-local storage values. For example whenever a thread starts it could store the top of the stack into a global which is then used to store auxiliary information like the thread ID, thread-local-store values, etc.
  • Storing the ID of a thread, making it very cheap to access.
  • Any other form of thread local value (so long as it fits in i32)

Currently LLVM doesn't have support for creating custom global items, nor, I believe, does LLD support linking custom object files which use items like a custom global. What would be the best convention for tools like LLVM and/or LLD to support something like this?

  • Should support for thread locals be added to LLVM and be implemented with global?
  • Could a custom hand-written *.wat file be compiled/assembled and linked with LLD? (this means access can't be inlined but it can at least be written!)
  • Should LLVM support a direct API (maybe with metadata?) for defining global directives?

I'm curious if others have thoughts on this!

Anyref toolchain story?

I don't think we have a full plan for anyref yet. One issue is how to implement it in LLVM - do we need a new LLVM IR type? There are also questions about how source code for using it would be written in source languages like C, C++, and Rust. Opening this issue for more discussion on this topic.

The use case I'm most familiar with is the glue code in emscripten, like the WebGL glue: Compiled C does a glDrawArrays or other GL call, which goes into the JS glue which holds on to WebGL JS objects like the context, textures, etc., and it does the WebGL call using those, after mapping the C texture index (an integer) into the JS object, etc. In that use case, I don't think we have immediate plans to use anyref - wasm+anyref can't do all the stuff the current JS glue does (like, say, subarray-ing a Typed Array).

But for glue code that could be done in wasm (which eventually should be all of it, but that may take a while), I'm not sure we necessarily need clang and LLVM support. It would be nice, but if it's hard, another option might be to write such code in AssemblyScript or another close-to-wasm language. It's easy and natural to express anyrefs there. Then that would be compiled to wasm and linked to the LLVM output.

Curious to hear of more use cases, and whether there is a more immediate goal for using anyrefs in emscripten, LLVM, clang, etc. (for binaryen and wabt, there is the obvious immediate goal of having full anyrefs support).

cc @Keno @wingo @dcodeIO @aardappel

Linking.md: Document builtin symbol names

We have various builtin symbol names which we use in the object format and in the final output executable; these should be documented in Linking.md.

They should also be consistent. See #55

custom section that documents producing toolchain?

In a few contexts now, it has seemed useful to have a standardized custom section that declares the toolchain that produced the .wasm module. For example, browsers could use such a section to measure the rate of usage of various toolchains and languages via telemetry to help understand usage trends on the web. WDYT?
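For context, a custom section is just section id 0 followed by a size, a name, and an arbitrary payload, so a producers-style section is cheap to emit. A minimal C sketch of the byte layout (the payload format here is purely hypothetical, not a proposed encoding):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write a value as an (unpadded) ULEB128. */
void write_uleb128(FILE *f, uint32_t value) {
    do {
        uint8_t byte = value & 0x7f;
        value >>= 7;
        if (value)
            byte |= 0x80;
        fputc(byte, f);
    } while (value);
}

/* Append a custom section (id 0) with the given name and payload.
 * Assumes name_len < 128 so its ULEB128 encoding is a single byte. */
void write_custom_section(FILE *f, const char *name,
                          const uint8_t *payload, uint32_t payload_len) {
    uint32_t name_len = (uint32_t)strlen(name);
    fputc(0x00, f);                               /* section id 0 = custom */
    write_uleb128(f, 1 + name_len + payload_len); /* section size */
    write_uleb128(f, name_len);
    fwrite(name, 1, name_len, f);
    fwrite(payload, 1, payload_len, f);
}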

Document expected import/export behavior for tools

Today, my compiler successfully emitted a working wasm file with the help of the LLVM back-end and the wasm-ld linker. Figuring out how to do so proved to be a frustrating and time-consuming challenge in large part due to breaking LLVM changes and a lot of undocumented behavior around how the tools handle import and export sections. Finding helpful answers required gathering together stray data cast around in obscure websites (and running across a lot of advice that is now wrong) and having to read in detail the LLVM source code used to generate and link wasm modules.

So ... I recommend that this repo contain a document on expected import/export behavior across tools. Some of this information should also be selectively disseminated elsewhere, where appropriate. Here is some information I believe would be useful to cover:

  • How LLVM (both the backend & linker) can decide by default what names to identify as exported and what to be imported. At some point between LLVM v5 and v7, the LLVM backend was neutered to be unable to generate a usable wasm file on its own: it generates no export section and hardcodes only two imports (for memory and table) with loader-unusable names. Not only is this generated wasm unusable, it forces the developer to manually enumerate what to export/import as part of the linker step, a mistake-prone process not required when creating object files and executables for native OS. In addition, the linker strips off the memory/table import and assumes they are to be exported instead.

Why would it not make more sense for wasm files generated by a compiler to be workable as is (without requiring a linker step) and let the compiled programs specify what to import/export based on the language's visibility attribute (a concept baked into LLVM) or alternatively based on the DLLimport/export (a concept also baked into LLVM)? If the linker is given wasm files with existing import/export sections, it need not strip or override them, but rather merge them. If no import/export info exists, or it needs to be overridden, the linker's many options currently supported can be used to add or change these settings. But it feels like the linker has less information available to make default decisions than the compiler does, so let the compiler lead if it chooses to.

  • Clearly document the linker's import/export manipulation options somewhere and describe explicitly what they do: --export, --export-dynamic, --export-table, --import-table, --export-memory, --import-memory, ---allow-undefined-file (!).

  • Establish a consistent default behavior re: --import-memory vs. --export-memory. Personally, I would select --import-memory as the default because it offers more flexibility in sharing and sizing memory across wasm modules. But whichever is chosen, there should be an agreement on the module name ('env'?) and the name of memory ('memory' vs. '__linear_memory'). Ditto for the table, which will play a larger role evidently in dynamic loading of wasm modules.

Note as well that the LLVM backend generates text-based wat files whose import/export sections don't match what is generated in the binary wasm file, making diagnosing problems harder because you don't expect to have to use the wabt tools to see what was really generated.

The JavaScript documentation describing instantiation should also be beefed up, to describe clearly how to handle memory being imported vs. exported (and what that means), and that when imported, the import module should be named 'env' and the memory is expected to be called 'memory'. Ditto for the table's conventions.

I have no experience with the emscripten toolchain and backward compatibility issues, which no doubt complicate these decisions. I suspect I have gotten some stuff wrong (sorry). My intent here is to help make it easier for those that will come along afterwards. Perhaps other compilers (e.g., Rust or Zig, e.g., ziglang/zig#1570) might also have valuable feedback on these standards before what people do solidifies too much more, making it impossible to corral in.

Linking.md function relocations

I haven't looked at globals yet, but here's what I've found looking at the function relocations in Linking.md and implementing them in LLVM MC:

  • We need a relocation code for (non-leb128) function addresses in the data section.
  • For function addresses used in i32.const and the data section, the index is the function table index, not the function index.
  • R_FUNCTION_INDEX_[S]LEB relocations need an index field too, to identify a function.

Consider using indices rather than names for symbol identifiers

A syminfo currently identifies its symbol by name. It seems like it would be simpler, more consistent with wasm encoding in general, and consistent with the ELF-like principle of not referring to symbols within the same module by name, and possibly faster to link, if it identified symbols by index.

It would also work well with #29, in case it became important to add symbol flags to local symbols, because local symbols have indices.

It would need some way to indicate the type of index: function, global, table, or memory. I expect we could either split WASM_SYMBOL_INFO into 4 codes, WASM_FUNCTION_SYMBOL_INFO and so on, or we could add external_kind bytes.

dylink naming conventions, camelCase vs. under_score

I should probably apologize in advance for bikeshedding this. 😞

I'm curious if it's too late to consider having a consistent naming convention for dynamic linking. We have __post_instantiate using an under_score separator and prefix, and memoryBase, tableBase with camelCase and no prefix. Do these have any other historical significance? Just emscripten?

I don't think we'll ever agree on which is better, so I'm wondering if we compromise and instead use all lowercase, double under_score prefix. The lowercase form feels less opinionated and more low-level, although I can't exactly quantify why I feel that way.

  • __postinstantiate
  • __memorybase
  • __tablebase

Function-level linking

Are there any plans to support function-level linking? If so, how? Would relocatable webassembly modules be allowed to contain multiple data and code sections?

Export of local symbols in .o files

This C:

static void foo(void) {}
void *bar(void) { return foo; }

currently compiles to this wasm:

(module
  (type (;0;) (func (result i32)))
  (type (;1;) (func))
  (func $bar (type 0) (result i32)
    i32.const 0)
  (func $foo (type 1))
  (table (;0;) 1 anyfunc)
  (memory (;0;) 0)
  (export "bar" (func $bar))
  (export "foo" (func $foo))
  (elem (i32.const 0) $foo))

It's surprising to see "foo" exported here, since it has internal linkage. The linkage is recorded in the metadata, via WASM_SYM_BINDING_LOCAL, however it seems like it would be more wasm-like to just omit the export.

Local symbols are never comdat or weak and they don't have visibilities, so it seems like they could be omitted from the WASM_SYMBOL_INFO subsection altogether.

Atomic fence support

C++ 11 provides std::atomic_signal_fence and std::atomic_thread_fence, and LLVM IR's fence instruction also takes a syncscope argument, which can be either "singlethread" (for a single thread, such as signal handlers) or "<target-scope>" (for multiple threads). And there are people who use asm volatile(""::: "memory") with a similar usage to fences. How do we support these fences in wasm? Some discussions have been going on in this CL, but the CL itself was not very much related to the topic.


std::atomic_thread_fence / LLVM's fence syncscope("<target-scope>")

Wasm does not have a fence instruction, and currently all atomic memory instructions are sequentially consistent, but we still need something because fences also order non-atomic instructions. I submitted a CL that converts a fence into a sequence of

get_global $__stack_pointer
i32.atomic.rmw.or 0

which is basically an idempotent atomic RMW instruction.


std::atomic_signal_fence / LLVM's fence syncscope("singlethread")

Currently the fence CL does not distinguish syncscopes, treating both of them the same way conservatively. But for fences that work only within a thread, we may not need to emit an idempotent atomic RMW instruction; preventing instruction reordering across it during the backend compilation (using a pseudo instruction or something) would be sufficient, and it can eventually be lowered down to zero instructions. What do you think?


asm volatile(""::: "memory")

We are currently doing nothing for this, but how about treating this the same way as a signal fence, meaning it will become a pseudo instruction that prevents reordering within the compiler backend but is eventually converted to zero instructions?
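For reference, the three source-level forms discussed above look roughly like this in C (using <stdatomic.h> rather than the C++ spellings):

#include <stdatomic.h>

void fences(void) {
    /* Cross-thread fence: the case lowered to an idempotent atomic RMW above. */
    atomic_thread_fence(memory_order_seq_cst);

    /* Single-thread (signal) fence: only needs to block compiler reordering. */
    atomic_signal_fence(memory_order_seq_cst);

    /* Traditional compiler barrier: proposed to be treated like a signal fence. */
    __asm__ volatile("" ::: "memory");
}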

C++ volatile support

Recently we had a few discussions on how to support C++ volatile. Some discussion thread started in this CL (the CL itself was not terribly relevant though; the discussions went in the comments below) and the CG meeting on July 24. While the discussion in the CG meeting was very informative, the topic is not directly relevant with the wasm spec itself, so maybe this repo is a better place for future discussions.

Summary of discussions so far:

  • LLVM's -fms-volatile option turns volatiles into atomics, and this option is the default for MSVC, so there is a precedent for turning volatiles into atomic operations by default in LLVM.
  • asm.js fastcomp also currently converts volatiles to atomics.
  • @jfbastien suggested we should be user friendly by converting volatiles into atomics. For larger sizes not supported naturally by wasm, we can even tear them, because volatiles can tear. If we turn volatile loads/stores into regular loads/stores, volatile's guarantee that every byte should be touched only once per instruction is not satisfied anymore.
  • Some people expressed concerns that by converting volatiles into sequentially consistent atomics, we are making it too strong, and if we support relaxed memory models in future, code that used to work might break.
  • @sunfishcode suggested we'd better be consistent with other targets, and people who want to use that feature should turn on the option.
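A tiny illustration of the kind of code affected (illustrative only):

volatile int ready; /* e.g. written from another thread or a signal handler */

int wait_for_ready(void) {
    /* Under the "volatile as atomic" approach, this load would be emitted as a
     * sequentially consistent atomic load; under the "volatile as plain load"
     * view it would be an ordinary i32.load. */
    while (!ready) {
        /* spin */
    }
    return ready;
}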

Object files shouldn't include a NAME section

The NAME section of the object files currently duplicates information that is already stored in the import/export/linking section which is bad for several reasons:

  1. It's redundant and wastes space
  2. The information can get out of sync (which is the canonical name for a function?)

We already have a change out to remove the NAME section from the "symbol table":
https://reviews.llvm.org/D42075

But I propose we take it one step further and simply not emit a NAME section when writing the object file.

Linking ABI: import the memory and table?

Currently, clang emits code that imports the stack pointer, but doesn't import the memory or table. It doesn't export the ones it defines either. Clearly the linker can still do what it needs, but it would be nice to have some consistency here. Is it best to import the memory and table, or do something else?

Should we make relocations specific to data segments and individual functions

By default in the toolchain we are running with -ffunction-sections and -fdata-sections. In order to achieve this the linker must be able to selectively include or exclude a given data segment or an individual function.

However, relocations are specified on a per-section basis, which means that figuring out which relocations apply to a given segment or function is not easy (currently all relocations are scanned for each segment/function).

Rather than one reloc section per input section (i.e. reloc.CODE + reloc.DATA), should we instead include a reloc section for each segment/function, e.g. reloc.CODE.1, reloc.CODE.2, reloc.DATA.1, etc.?

Should the order of custom sections be specified?

Currently, the Linking.md spec defines two new custom sections, "linking" and "reloc.*". However, their order is not specified with relation to each other or other sections.

Currently, the order of all the standard Wasm sections and the "name" section is precisely specified, with no flexibility.

I believe that whenever new Wasm custom sections are specced, their order should be included in the spec. This will ensure that sections are created in one and only one order, ensuring compatibility between tools. If two different specs define their own custom sections, there is the open question of how those specs would reference each other to define the relative ordering of their new sections. However, we aren't there yet - all I'm concerned about right now is whether the new Linking.md sections should have their order specified relative to the existing standard sections.

The sections should have this order:

Section id/name, with rationale:

  • ... (earlier standard sections, in their usual order)
  • 10 CODE section: core spec
  • 11 DATA section: core spec
  • "linking": Linking.md. MUST come after DATA since the number of segments must be known in order to validate the symbol table. This custom section must be unique.
  • "reloc.*": Linking.md. MUST come after "linking" since the symbol table must be known in order to validate the reloc indexes. If there is more than one reloc section, they may appear in any order.
  • "name": Core spec says this MUST come after DATA. We could leave its order relative to linking/reloc unspecified, or could nail it down to come after them (preferable to me, to ensure compat).

"Merging" of global sections

I think that phrase is somewhat ambiguous:

Data segments and table sections need to be merged by concatenation+renumbering.

The code section can be merged that way, too, though it's not clear to me whether a linker should, should be allowed to, or shouldn't collapse identical function bodies to the same function index (but not the same table index).

But for globals, doesn't it depend on the global whether the right thing to do is to concatenate the global entries (such as for unrelated globals) or make them refer to the same combined global (such as for the stack pointer)? In C terms, the same global variable in different CUs should be made to refer to the same memory location/wasm global, right?

Even if the approach we choose is to concatenate global entries in general, I don't see anything specific about the stack pointer that justifies giving it, but no other global, a special section to indicate it should be collapsed.

So the specific proposal is to replace "merging" by "concatenation" or "collapsing" in Linking.md; the less specific proposal is to get rid of the stack pointer special case and replace it with a generic way of merging "common" globals by name, rather than id.
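To make the C case above concrete (an illustrative example in the same spirit as the discussion):

=== file1.c ===
int shared_counter;                    /* defines the variable */

=== file2.c ===
extern int shared_counter;             /* must resolve to the same location */
void bump(void) { shared_counter++; }

After linking, both compilation units must refer to one linear-memory address; the stack pointer is the case where the wasm global entries themselves should collapse into one rather than be concatenated.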

Threaded modules need to execute `data.drop` on all threads

In the description of the passive segments portion of linking it mentions that __wasm_init_memory will be used to initialize all memory segments. This is presumably called on the first thread, and LLD today also executes data.drop for each memory segment.

I think, though, that all new threads also need to execute data.drop for all memory segments to avoid keeping them around, right? Should data.drop not be part of __wasm_init_memory? Or perhaps another synthetic function to drop segments?

FWIW we've had a strategy of doing this in wasm-bindgen prior to LLVM 9 which involved injecting a start function which did an atomic add to initialize a thread id counter, and based off the thread ID it'd initialize memory or drop segments. I wonder if perhaps most modules need something like that anyway to get synthesized during LLD as well?

linking: Proposal for new visibility attribute "exported"

Right now we support two different visibility options: "hidden" and "default".

"hidden" symbols are not exported unless explicitly exported with --export.
"default" symbols are exported either via -export-dynamic or individually via --export (Or by default if building a DLL. We currently don't support DLLs but that is the idea).

In emscripten we need a way to implement the EMSCRIPTEN_KEEP_ALIVE macro that is more precise than the current __attribute__ ((visibility ("default"))). This currently means we are passing -export-dynamic, which has the side effect of exporting all default-visibility symbols (i.e. most of libc and libcxx), which means these symbols don't get GCd.

Basically we want a stronger notion of exporting, something like __attribute__ ((visibility ("exported"))), which tells the linker to always export a certain symbol.
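A sketch of how the proposed attribute might be used; note that "exported" is exactly the hypothetical visibility value proposed above, not something clang accepts today:

#include <stdint.h>

/* Always exported from the final module, even without -export-dynamic, and
 * therefore never garbage-collected by the linker. (Hypothetical syntax.) */
__attribute__((visibility("exported")))
int32_t add(int32_t a, int32_t b) {
    return a + b;
}

/* Ordinary default visibility: only exported via --export or -export-dynamic. */
int32_t helper(int32_t x) {
    return x * 2;
}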

Use wasm start-function linking and __cxa_atexit

.init_array/.fini_array and .ctors/.dtors have a history of being exploitable. For example, "Abusing .CTORS and .DTORS For FUN and PROFIT", and other examples are easy to find. Modern ELF systems have mitigated it with RELRO which makes these sections read-only. However, WebAssembly doesn't support read-only memory. This may change in the future (or perhaps we could use a separate linear memory space), though at present there are no proposals. If we implement traditional .init_array/.fini_array support now, we'd be opening up an attack vector.

WebAssembly would be less vulnerable than traditional ELF systems without RELRO, because wasm's indirect calls can only call into defined function entry points and type signatures have to match; however, theoretical exploits are still possible.

I propose to avoid .init_array and .fini_array, and instead:

  • use wasm start functions for initialization code, and add support for linking them, rather than use .init_array, as suggested here.
  • lower destructor code, such as __attribute__((destructor)), into functions registered with __cxa_atexit by initializers. I've implemented this in LLVM here.

To be sure, with current __cxa_atexit implementations, function pointers are still stored in writeable linear memory, so the problem already exists. However, .init_array/.fini_array are dense arrays of pointers, making them easier to hit, and they're more likely to live at a predictable address.

__cxa_atexit is also generally more robust than .fini_array because

  • If cleanups are registered dynamically while global ctors are being run, it properly orders all the cleanups, including the dynamically registered ones, in the reverse order.
  • If the process exits before all the global constructors have run, it doesn't run destructors that haven't been registered yet.

This approach also doesn't preclude implementing .init_array and .fini_array in the future. We could always add support for .fini_array in the tools without breaking the ABI.

The main downside of this approach is that it's different from how other platforms work, and would require more adaptation when porting libc and other low-level tools. However, I believe it's worth the effort.
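A rough sketch of the destructor lowering described above; the thunk and registration names are illustrative, and only __cxa_atexit and __dso_handle are the conventional pieces:

/* Source form:
 *     __attribute__((destructor)) static void cleanup(void) { ... }
 *
 * Conceptual lowering: the destructor body becomes an ordinary function, and a
 * constructor registers it with __cxa_atexit, so cleanups run in reverse
 * registration order and never run if their initializer never ran. */

extern int __cxa_atexit(void (*func)(void *), void *arg, void *dso_handle);
extern void *__dso_handle;

static void cleanup(void) { /* ... original destructor body ... */ }

static void cleanup_thunk(void *arg) { (void)arg; cleanup(); }

__attribute__((constructor)) static void register_cleanup(void) {
    __cxa_atexit(cleanup_thunk, 0, &__dso_handle);
}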

Referencing symbols across the DLL boundary

The current state of dynamic linking is described https://github.com/WebAssembly/tool-conventions/blob/master/DynamicLinking.md.

However, this doesn't detail how imported symbols can be referenced. Looking at the current implementation in emscripten, it seems that all references to external symbols currently go through JS functions:

Function Imports

Imported functions point to JS thunks, which start empty and get updated as the various DLLs are loaded. This works a little like a PLT written in JS. Because all the complexity is moved to JS, all such functions are called with the regular call instruction, and codegen doesn't need to know whether a function is internal or external to the DLL.

Address Imports

For addresses that are external to the current module, getters are used to retrieve the absolute symbol address. These are JS functions of the form g$<symbol_name> which return the absolute address of the given dynamic symbol. This means the codegen for -fPIC code needs to treat internal and external symbols very differently:

  • internal: get_global $memory_base + add the known offset of <symbol_name>.
  • external: call g$<symbol_name>.

The circular nature of imports and exports between shared libraries means that we can't directly import functions from other modules. However, I think we can do better than forcing all references to external functions and addresses to go via JS.

I'm proposing the following scheme:

Function Imports

All functions imported from other DLLs go via call_indirect. The loader is responsible for allocating table slots for every function. This makes the table act as our PLT. Functions are imported as immutable globals that represent the table offset, so each external call would look like:

get_global $foo
call_indirect 

Address Imports

We can use mutable globals for this. The loader can allocate one mutable global for each data address. A load from external address would then look like:

get_global $bar
i32.load

Mutable globals are needed because the addresses of all globals won't be known until all DLLs have been loaded, at which point the loader can update the mutable globals.

It's possible that only code compiled with -fPIC would need this extra indirection. We could treat the main executable as unique and always load it last, at which point all data addresses and wasm functions will be available to the loader, and external functions can be imported directly, avoiding the call_indirect.

What do we ideally want the assembly (LLVM .s) text format to look like?

We just landed a first version of the assembly parser in LLVM https://reviews.llvm.org/D44329. This version is quite basic and adheres closely to what the disassembler had been outputting so far.

Now that we have both the assembler and disassembler though, we can decide to make changes to this format that better suit future needs.

In particular, we currently have 2 flavors of the .s format:

  • Elf (-triple=wasm32-unknown-unknown-elf).

    • This is the format that is currently consumed by s2wasm, so any changes to the format would also have to be made there. Since in the long run the toolchain will be all-binary by default, at which point s2wasm will maybe not be needed anymore, it may not be wise to invest too much in changing this format.
    • This path currently defaults to using -disable-wasm-explicit-locals, i.e. it may have $0 as an operand to refer to local 0, instead of a preceding get_local 0 instruction.
  • Wasm (-triple=wasm32-unknown-unknown-wasm).

    • This corresponds to the wasm-specific .o format that gets directly consumed by lld. This is the path we are working towards the toolchain taking. Once that path is default, the function of the .s format is mostly intended for:
      • Inline assembly.
      • Writing LLVM tests.
      • Viewing .o file contents.
      • Writing .o files by hand? :)
    • This format is intended to be used with explicit locals, as this matches the .wat format more closely.
    • This format currently has no-one depending on it (afaik), so we can change this more easily to suit the above needs by just changing the assembler and disassembler together.
    • This triple doesn't currently work since it requires an implementation of a part of LLVM that we only have for ELF (since it is also used by other "CPUs"), which we may add next. The parser itself already deals with all variants above.

The current disassembler outputs pseudo stack registers to make wasm look more like a CPU/register machine, and maybe make it easier for humans to track the stack, for example:

i32.const $push0=, 1
i32.const $push1=, 2
i32.add $push2=, $pop0, $pop1

This however is also verbose, so for the use of inline assembly in particular, it may be nice to allow people to write pure stack code. Or even better, we can decide that for the wasm mode above, we require that these stack operands not be present.

The LLVM table-gen assembly matcher used in the assembler currently requires there to be operands, so even if we ignore or disallow these operands in the .s format, they will temporarily be generated on the fly. The current implementation can't do this yet, and instead relies on them to be present (and correctly numbered).

In general, I think we want the .s format be as close as possible to the .wat format, while still fitting the LLVM mold.

Anything else we should change about the .s format while we're at it? Am I missing something?

Dynamic linking and stack pointer

DynamicLinking only describes how to load and initialize the loaded module's memory and table.

However, I wonder: what about the stack pointer? I think there is a good reason for reusing the same stack area for all linked modules.

TLS data doesn't indicate an alignment

Currently the description of thread local storage allows specifying the size of the TLS section to the runtime to allocate, but it doesn't specify the alignment with which to allocate it. While a 4 or 8-byte alignment is probably good enough for most programs, I know that in Rust we've historically run into issues where SIMD things got into LTO builds which required a bigger alignment (like 16 bytes or so).

Would it be possible for a global like __tls_align to be synthesized next to __tls_size to indicate to the runtime at what alignment the TLS block needs to be allocated?
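A small C example of the kind of TLS data that forces a larger alignment (illustrative):

#include <stdint.h>

/* A 16-byte-aligned thread-local, e.g. something SIMD-ish pulled in by LTO.
 * The runtime allocating each thread's TLS block needs to know this alignment,
 * not just the total size (__tls_size). */
__attribute__((aligned(16))) _Thread_local uint8_t tls_scratch[64];

/* A plain thread-local that 4- or 8-byte alignment would satisfy. */
_Thread_local int32_t tls_counter;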

cc @quantum5

Text format for relocation information

@sbc100 and I were just talking about how we might represent some of the relocation information in the WebAssembly text format.

Nothing fully formed yet, but some ideas:

  • It would be useful to be able to name data segments, then use these in functions, e.g.:
(func
  (i32.load (i32.const $foo)) ...)  ;; This would remap to 10
(data $foo (i32.const 10) "...")

This is a nice feature for the text format, but probably not required for relocations since we also need to be able to define imported memory offset symbols that can only be resolved at link time.

  • Since memory offset symbols are defined as globals, perhaps we should have some way to annotate these to specify a relocation. The current documentation says that these will be retrieved with get_global, but @sbc100 says that this is out-of-date. Instead, an i32.const 0 (with a 5-byte immediate) is written which will be replaced later by the linker. Perhaps this could be written in the text format using a new sigil for relocations, @:
(import "some" "symbol" (global @extsym i32))
(global @intsym i32 (i32.const 0)) ;; 0 is unused, will be resolved by the linker

(func
  (i32.load (i32.const 0) offset=@intsym))  ;; uses R_WEBASSEMBLY_MEMORY_ADDR_LEB 
  (i32.load (i32.const @intsym))  ;; uses R_WEBASSEMBLY_MEMORY_ADDR_SLEB
  (i32.load (i32.const @extsym))  ;; same
  ...)

;; Data segments now can have names, but only for relocations.
(data @intsym (i32.const 10) ...)  

;; We need a way to store the address of a global in memory as well.
(data (i32.const 100) @extsym)  ;; uses R_WEBASSEMBLY_MEMORY_ADDR_I32

Keeping the global definition is nice because that is what will be generated in the binary format. It's kinda weird to have to define the global and use it later for the segment, though (and has issues with the name being used twice as well). Thoughts on improvements?

Should we define a scalar intrinsics header?

@Maratyszcza suggested to me that we should have a portable intrinsics header for non-SIMD WebAssembly instructions that are not directly expressible in C, so I wanted to gauge wider interest in such a header. On one hand there is only one production C/C++ compiler for WebAssembly at the moment so portability isn't so important, but on the other hand it may be important in the future, and having proper intrinsics is certainly more ergonomic than writing __builtin_wasm_* everywhere. What do people think?

There is precedent for scalar intrinsics on other platforms, for example arm_acle.h documented in http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf.

Some instructions such a header would include would be i32.clz, i32.ctz, i32.popcnt, i32.shr_s, i32.rotl, i32.rotr, f32.min, f32.max, etc.
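A rough sketch of what a few entries in such a header could look like. All names here are hypothetical, and the generic clang/gcc builtins used don't match the wasm instructions exactly (e.g. __builtin_clz is undefined for 0 while i32.clz returns 32, and fminf's NaN handling differs from f32.min), so a real header would need more care:

#include <stdint.h>

static inline int32_t __wasm_i32_clz(uint32_t x)    { return __builtin_clz(x); }
static inline int32_t __wasm_i32_ctz(uint32_t x)    { return __builtin_ctz(x); }
static inline int32_t __wasm_i32_popcnt(uint32_t x) { return __builtin_popcount(x); }

/* i32.rotl expressed in plain C, since there is no portable builtin. */
static inline uint32_t __wasm_i32_rotl(uint32_t x, uint32_t n) {
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

static inline float __wasm_f32_min(float a, float b) { return __builtin_fminf(a, b); }
static inline float __wasm_f32_max(float a, float b) { return __builtin_fmaxf(a, b); }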

syscall ABI

IIUC we currently don't have a stable syscall ABI. I think we should try to standardize something.

Having a stable ABI we agree on means that embedders don't have to roll their own. There's plenty of experience to gain from what Emscripten did, and I would love to have its JavaScript syscall layer as a free-standing thing.

Here's a quick sketch:

  • Each syscall is its own function (unlike e.g. Linux where each syscall signature is a function, taking as first parameter the syscall number).
  • Go through the existing syscalls and adopt ones we deem useful.
  • Embedders can do X when a syscall cannot work on that platform (X TBD; should we allow trapping if e.g. sockets aren't available?).
  • The module for all syscalls is the same. Say syscall.
  • I think we might want to version the module name (i.e. syscall_v0): adding new syscalls doesn't need a new version, but changing any tool-convention ABI behavior would require bumping the version. Unless we think behavior will never change, in which case no versioning.
  • The field for each syscall is just the syscall number macro's name (e.g. exit, fork, read, write, open, ...)
  • IIRC we've talked about adding a custom clang attribute to denote module / field of an export / import.

One open question I have: say a JS embedding wants to let the user choose how to implement filesystem access (maybe WebSQL versus in-memory are two options). How would we offer a stable ABI, and let users choose which JS glue to use? They can't just change the "filesystem" import if all syscalls are in the "syscall" import. Should we group syscalls by theme, and are all of these orthogonal enough that you wouldn't want to have two in the same group sometimes?
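A sketch of what declaring such imports could look like from C, reusing the hypothetical wasm_import attribute discussed in the "Import attribute" issue above; the module name, attribute, and signatures are all illustrative:

#include <stdint.h>

/* Each syscall is its own function, imported from a common, versioned module. */
__attribute__((wasm_import("syscall_v0")))
int32_t sys_write(int32_t fd, const void *buf, int32_t len);

__attribute__((wasm_import("syscall_v0")))
_Noreturn void sys_exit(int32_t code);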

Frontend predefines

I had a user ask about Emscripten's predefines (specifically it was surprising that __asmjs__ was predefined instead of __wasm__). For that particular case, since Emscripten already knows that we are compiling straight to wasm, it should probably also define __wasm__, and more generally we should probably make the defines match between asm and wasm just as we are doing with the rest of the ABI. I think we've discussed this before informally, but it's probably worth getting down here and maybe putting something in the doc. Obviously Emscripten can still define extra things like __EMSCRIPTEN__ outside of the respective backends where it makes sense.

Poking through https://github.com/llvm-mirror/clang/blob/master/test/Preprocessor/init.c#L9063 I don't really see much interesting aside from __wasm__ that's not just a consequence of the other ABI stuff we've discussed separately. We did remove __unix__ previously. Are we missing anything?
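For reference, the kind of user code this affects (illustrative):

#if defined(__wasm__)
/* WebAssembly-specific path: what users expected to be able to test for. */
#elif defined(__asmjs__)
/* asm.js path: historically the define Emscripten provided. */
#endif

#ifdef __EMSCRIPTEN__
/* Toolchain-specific behavior, independent of the target define above. */
#endif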

Linking function pointers and function relocations must use Symbol not Function index

This arises from https://bugs.llvm.org/show_bug.cgi?id=35625

First Problem - function pointers

Currently, there is no linking metadata for table entries (function pointers). The Wasm table entry itself is used to find out which function the relocation should be generated for. In order for the Wasm module to pass validation, the table entries in object files must point to a valid Function (code definition).

However, this is actually semantically wrong. The linker can't generate correct output simply using the Function index; it needs the actual Symbol.

Here's an example to illustrate:

=== file1.c ===
int aliasFn(void) { return 1; }

=== file2.c ===
int directFn(void) { return 2; }
extern int aliasFn(void) __attribute__((weak,alias("directFn")));
int directFnPtr() { return (int)&directFn; }
int aliasFnPtr() { return (int)&aliasFn; }

int callDirect(void) { return directFn(); }

=== Results ===
Compile and link with `lld file1.o file2.o`.
 * directFnPtr and aliasFnPtr should point to two different methods, but
   unfortunately they don't.
 * If file2 is compiled with -O0 (to prevent inlining) then callDirect erroneously
   returns 1

(The output Wasm module for this has three functions: directFn, directFnPtr, aliasFnPtr. The table entries can only point to those. It has four exports (=symbols): directFn, aliasFn, directFnPtr, aliasFnPtr.)

The relocations for directFnPtr and aliasFnPtr will both reference Wasm table entries - and those table entries both point to directFn! The relocations are completely indistinguishable... yet, upon linking, we may need those two functions to be relocated to point to different functions (if another file provides "aliasFn" more strongly or earlier).

Second problem - function relocations

Even worse, function relocations are also broken by aliases! Oops. In the example above, LLD happens to use the Symbol for "aliasFn" as the Symbol associated with the body of "directFn". That's actually completely reasonable behaviour - we have here two different Symbols/names associated with the same function body, it's a toss of a coin which one is chosen. There is no single "canonical" Symbol referring to the Function body.

Attempts to call "directFn" directly (not even via a pointer!) will also generate incorrect code currently. This is because the last-defined Symbol for a function wins, and so the relocation is generated against aliasFn.

In the linking metadata for relocations, we can't use the Function index, we have to use the Symbol index instead.

Linking.md: Handling relocation against external symbols (e.g. imported data symbols)

In attempting to switch emscripten over to using the llvm wasm backend + lld we've hit a fairly significant road block. The problem is this: Emscripten allows for functions to be imported from JS, but it also allows for data addresses to be imported. See src/library.js:2118. Here we see emscripten allocating some static data space at runtime and then passing the address to the wasm module as a global import.

Currently in Linking.md and in lld, when a data symbol is undefined at static link time any relocations of type R_WEBASSEMBLY_MEMORY_ADDR will write zeros to the target location.

However, both s2wasm and asm2wasm, can handle undefined data symbols at linktime and convert these into get_global XXX rather than i32.const XXX which is used for defined globals.

I talked through some options with @jgravelle-google and @binji, and I also spoke with Roland McGrath, who has worked on the GNU tools for linking and loading. The options I see are the following:

  1. Modify emscripten to remove the need to import data symbols into otherwise statically linked binaries.
  2. Have a special relocation type that the linker can use to handle both defined globals (via i32.const) and undefined globals (via get_global)
  3. Ask users to annotate their external global data (with some kind of dll_import attribute), and have the code gen in llvm know to use get_global for such symbols rather than i32.const

Have linking look for `.imports` alongside each `.a`.

Currently we rely on the clang driver adding -allow-undefined-file wasm.syms along with -lc. Without this (if -lc is used on its own) symbols such as syscalls will cause the link to fail.

I propose that we have the linker look for an optional .imports file alongside each library, which can list symbols that the library expects the wasm embedder to provide. We can then rename wasm.syms to libc.imports and remove the -allow-undefined-file from the driver.

relocation issues

Having used the relocations llvm emits I see a few missing things:

  • the memory relocations work with global indexes; the problem here is that a lot of symbols won't have a global (for example in debug info); I temporarily solved this by using a negative global index (-1) with the Addend containing the original address.

  • debug info needs some kind of "index into function"; In my local llvm I hacked in a new relocation type R_WEBASSEMBLY_FUNCTION_OFFSET_I32 which is the offset inside the function block for this.

Linking.md: How to handle calls to weakly defined functions

We need to find a way to handle the following pattern, specifically in case where foo() is not defined at link time:

int foo() __attribute__((weak));

int main() {
  if (&foo) {
    return foo();
  }
  return 0;
}

Currently llvm will generate a call instruction to foo and a relocation at the call site to inject the function index. If foo turns out to be undefined at link time there is no valid function which the linker can insert.

@NWilson and I have discussed a few possible options (see https://reviews.llvm.org/D44028):

  1. Have the linker error out. This seems bad as the above code is valid and should link.
  2. Have the linker generate a synthetic function that aborts (i.e. contains only an unreachable) and use the index of the synthetic function here. The downside is that this adds complexity to the linker and that it would need to generate a new function for each signature.
  3. Have llc always emit call_indirect when calling weak references. Then ban FUNCTION_INDEX relocations against weak symbols in the object format.

(3) seems like the nicest so long as it is feasible in llvm and the increased use of call_indirect doesn't cause a performance drop. My guess is that weak symbols are not that common, but it also seems valid to assume that declaring a symbol as weak should not have a runtime overhead.

Create an explicit symbol table

In the current spec and llvm implementation we are modeling the symbol table based on imports and exports of functions and globals.

While this has worked fairly well, it has caused us a few headaches, specifically around weak symbols, which we currently model as both an import and an export.

Also, the modeling of data symbols as wasm globals doesn't make much sense. We are using wasm globals here only to link names to addresses (not for use in set_global/get_global instructions which is their purpose).

We then add extra metadata via the "syminfo" subsection of the "linking" section.

Rather than trying to continue to build an implicit symbol table from imports and exports, I propose that we bite the bullet and just be explicit about it. This would make things clearer and allow us to completely remove the imports and exports of globals which we currently use to model data addresses.

My proposal would be to add a new "symtab" section (or subsection?) that would replace the current "syminfo" subsection. Every symbol in the object would require an entry in this table. For data symbols the name would be included directly in the table. For function symbols the name could refer to the function import or export, to avoid duplicating the name. The symbol index would then be explicit and all relocations would refer to symbols in this table. The exception would be type relocations, for which there is no name or symbol yet, but we could consider making symbols for types too.

Emscripten and the Producers Section

After a lot of consideration, I don't think we want to change Emscripten to emit the producers section by default in release builds. Posting this issue to note that and explain why.

  • It would be surprising if emcc started one day to emit information like what compiler and tools our users use - most people would probably be surprised by that, and some might be concerned about it, for privacy or security reasons.
  • There is no precedent for this in the Web space: minifiers don't emit a // minified by $MINIFIERNAME comment in your minified code. I can't even find a flag to do this optionally in any minifiers.
    • Oddly there is precedent in the native space, as clang and gcc do emit the compiler name and version, but after much digging I never found a good reason for this (just a 20 year old comment about SVR4 compatibility). Most links online about this are users that find out about it and are surprised and sometimes unhappy (about code size, privacy, various bugs that it causes), and many projects end up stripping it out.
  • When users ask emcc to emit the smallest binary, we should do that, unless there's a very strong reason, and I'm not sure users would agree that the metrics we are talking about here are reason enough, based on discussions I had with some of them. The metrics are mostly of interest to browser vendors and tool creators, but it's the users that ship the extra bytes. Many users will probably flip a flag to remove those bytes, if they heard about that flag's existence, which suggests it should be on by default.

These are reasons for specifically not emitting the producers section in emcc by default. We could add an option for users that do want to do so, and of course other tools may have different factors to consider (in particular, Emscripten is used by ordinary developers, while tools like LLVM, wabt, or binaryen are used by toolchain developers, so the considerations might be different).

ABI for C functions without prototypes

If you have a K&R-style C function declaration, such as int foo(); currently LLVM will lower calls to it with the usual fixed-arg calling convention. Binaryen's s2wasm will leave the calls untouched, and generate foo's function section entry using the type of foo's implementation. If there is a mismatch, then the linked module fails validation.
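A small example of the situation (illustrative):

=== caller.c ===
int foo();                /* K&R-style declaration: no prototype */
int call_foo(void) {
    return foo(1, 2);     /* today: lowered with the fixed-arg convention, (i32, i32) -> i32 */
}

=== impl.c ===
int foo(int x) {          /* actual signature: (i32) -> i32 */
    return x;
}

The caller's object file records whatever signature the call site implies, the definition has a different one, and the linked module then fails validation.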

We should specify in the ABI what is supposed to happen, and LLVM and lld should implement that. The wasm ABI is somewhat unique in that the vararg calling convention is incompatible with the fixed-arg calling convention. Based on previous discussion I think we are still happy with that decision, so that means that at a high level there are 2 options here:

  1. All calls to functions with no prototype use the vararg calling convention.
    • As a result the signatures of the imports of such functions in the callers' object files all just take a single i32 (the vararg buffer).
    • Of course if the implementation of the function is not vararg, then there will be a link-time/validation failure or a runtime failure (if the wasm signature happens to match).
  2. Calls to such functions use the fixed-arg calling convention.
    • The signatures in a caller's object file will have whatever arguments are used at the callsite. (If multiple callsites in the object file have different arguments, that would be a compile-time error).

In either case, the linker would check the signatures of all of the imports for a particular function against the implementation of that function (i.e. the function signature in the object that exports it), and would issue an error if there was a mismatch. Also in either case, a mismatch at the source level could accidentally match at the wasm level, and result in a more-difficult-to-debug runtime failure.

I had thought that option 1 would be easier to specify in an ABI, but now I'm not so sure. Basically everything that needs to be specified (e.g. how does a C function signature lower to a wasm signature, what signature gets put in the import section of an object file, how the linker resolves mismatches, etc) has to be specified either way, and furthermore most of the behaviors will be the same either way. As far as I can see, the only difference is whether we use the vararg or fixed-arg convention. Given that, I'm actually more inclined just to use the fixed-arg convention on the grounds that it's almost always what people actually want.

Thoughts?

Referring to other sections in linking metadata

There is currently only one place where we need to do this: In the reloc sections when we specify the section to which the relocations are to be applied: https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md#relocation-sections

But we are about to add another for WASM_SYMBOL_TYPE_SECTION: https://reviews.llvm.org/D44184

In the existing code, the method used to refer to another section is "section code + optional string if code is 0". This has several downsides:

  • it means you can't have multiple custom sections with the same name
  • it makes the code and API messy/complex
  • it's a waste of space to duplicate these strings in the binary

To see how it makes the code complex, look at the proposed change to WasmSymbolInfo. It would be much nicer to simply reuse the existing ElementIndex field here.

So I propose that we refer to sections instead by their index within the module. This would be a new index space used only by the linking metadata fields.

The only downside to doing this, AFAICT, is that it would mean tools that blindly insert/remove sections in the middle of the file would break the metadata sections. Looking at the ELF spec, they already have this problem, and the tools are built to rewrite indices when they do that kind of thing.

Pass small structs in parameters instead of memory

Currently clang will pass any structure with more than a single element indirectly (as LLVM byval). Many other ABIs pass small structures in registers, potentially improving performance. We could do the same thing, but choosing the upper limit on the size (and therefore the number of registers to use) is a little more interesting, since we don't know how many physical registers the underlying machine will have (let alone have available for passing arguments). But it's also true that linear memory references may be more expensive if there is bounds checking.
It seems worthwhile to at least use 2 params since most calling conventions are likely to be able to handle that in most situations, and pairs are fairly common.

If we do this it will make the ABI a lot more complex to specify, since there will be classification rules about when this would apply (both AMD64 SysV and AAPCS are fairly complex in this regard). But it seems worth it. We would probably need slightly different rules for promotion than those architectures since we have both 32-bit and 64-bit argument types.

I'll think a little more about the details, but does anyone have strong opinions about whether this is likely to be a good idea?
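A sketch of the difference being proposed; the lowered forms shown in the comments are illustrative, not a settled ABI:

#include <stdint.h>

struct pair { int32_t a; int32_t b; };

int32_t sum(struct pair p) {
    return p.a + p.b;
}

/* Today this roughly lowers to passing a pointer to a stack copy:
 *     (func $sum (param i32) (result i32) ...)        ;; param = address of the copy
 * Under the proposal, small structs would be split into scalar params:
 *     (func $sum (param i32) (param i32) (result i32) ...)
 */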

Create an ABI version for wasm object format

@sunfishcode proposed that we add an ABI version to the wasm object format.

In fact we probably want several concepts of version:

  1. An ABI version. This will allow the linker to issue warnings when we make changes to the C ABI. This information can come from the clang front-end and probably be stored as metadata in the bitcode.

  2. Feature flags: to allow optional features in the ABI, and for these optional features to be recorded in the object format.

  3. A version for the linking metadata. When we make changes to https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md we would like the linker to be able to issue warnings or errors on incompatible object files.

Handling BSS data (can we remove DataSize from linking metadata?)

We currently propose that BSS be encoded by simply leaving gaps in the memory not filled by data segments. Trailing BSS would then be handled by a cap at the end, specified using the DataSize field in the linking metadata.

However, llvm doesn't currently implement it this way and always encodes BSS as actual zeros in a data segment.

I think that for wasm we can't rely on non-initialized data to represent BSS, because we can't really rely on the host providing clean memory. So for now I propose that we just handle BSS like any other data (this is what llvm does today anyway), and in the future we could optimize by adding a flag to a data segment to specify that it is zero-initialized, or some other approach (perhaps explicitly using the proposed bulk memory operations).

As a result of this I propose that we remove the DataSize field from the linking metadata:
https://reviews.llvm.org/D41366
