censoredusername / dynasm-rs
A dynasm-like tool for rust.
Home Page: https://censoredusername.github.io/dynasm-rs/language/index.html
License: Mozilla Public License 2.0
Look into immediate optimization.
For our experimental JIT compiler, which @vext01 mentioned in #47, we want to compile the instructions of a trace backwards as that seems to be what other JITs do. One of the reasons being that it simplifies register allocation (see [1]).
Is that something that would be possible using dynasm-rs? I've seen that there's a function `alter` which allows patching the buffer, but I don't think that allows us to reverse the compiled instructions. One way I can imagine doing this would be to cache the result of each `dynasm!` macro call. Then, when we are done, we simply need to reverse the order of the cached results before making them executable. I assume that this is not currently possible, but was wondering if there is another way that dynasm-rs could be used to achieve this. If not, would you be happy to accept a change that makes this possible?
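The caching idea can be sketched without any dynasm-rs API, assuming each macro invocation assembles into its own scratch buffer whose bytes are then kept as one chunk. `Chunk` and `emit_reversed` are hypothetical names, and the sketch ignores relocations: any relative jump that crosses a chunk boundary would need to be re-patched after reordering.

```rust
/// Machine code produced by one assembler invocation (hypothetical type).
type Chunk = Vec<u8>;

/// Concatenate chunks in the reverse of the order they were generated,
/// keeping the bytes inside each chunk in their original order.
fn emit_reversed(chunks: &[Chunk]) -> Vec<u8> {
    let total: usize = chunks.iter().map(|c| c.len()).sum();
    let mut out = Vec::with_capacity(total);
    for chunk in chunks.iter().rev() {
        out.extend_from_slice(chunk);
    }
    out
}
```

With this layout, the chunk generated last (for the first instruction of the trace) ends up at the start of the final buffer.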
Hi!
Let me file a (minor) bug.
The README contains a link to the documentation that currently returns a 404:
## Documentation
[Documentation](https://CensoredUsername.github.com/dynasm-rs/language/index.html).
Expected result: the link should open the documentation.
I copied the initial tutorial code from here (which is the same as the one in the tree) into a `cargo init --bin` tree, added dependencies on dynasm 0.2.3, installed nightly Rust, and ran `cargo build`. I got the following compile errors:
Compiling test v0.1.0 (/home/me/test)
warning: unused `#[macro_use]` import
--> src/main.rs:4:1
|
4 | #[macro_use]
| ^^^^^^^^^^^^
|
= note: #[warn(unused_imports)] on by default
error[E0599]: no method named `global_label` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope
--> src/main.rs:16:5
|
16 | / dynasm!(ops
17 | | ; ->hello:
18 | | ; .bytes string.as_bytes()
19 | | );
| |______^
error[E0599]: no method named `extend` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope
--> src/main.rs:16:5
|
16 | / dynasm!(ops
17 | | ; ->hello:
18 | | ; .bytes string.as_bytes()
19 | | );
| |______^
error[E0599]: no method named `offset` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope
--> src/main.rs:21:21
|
21 | let hello = ops.offset();
| ^^^^^^
error[E0599]: no method named `extend` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope
--> src/main.rs:22:5
|
22 | / dynasm!(ops
23 | | ; lea rcx, [->hello]
24 | | ; xor edx, edx
25 | | ; mov dl, BYTE string.len() as _
... |
30 | | ; ret
31 | | );
| |______^
error[E0599]: no method named `global_reloc` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope
--> src/main.rs:22:5
|
22 | / dynasm!(ops
23 | | ; lea rcx, [->hello]
24 | | ; xor edx, edx
25 | | ; mov dl, BYTE string.len() as _
... |
30 | | ; ret
31 | | );
| |______^
error[E0599]: no method named `push_i8` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope
--> src/main.rs:22:5
|
22 | / dynasm!(ops
23 | | ; lea rcx, [->hello]
24 | | ; xor edx, edx
25 | | ; mov dl, BYTE string.len() as _
... |
30 | | ; ret
31 | | );
| |______^
error[E0599]: no method named `push_i64` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope
--> src/main.rs:22:5
|
22 | / dynasm!(ops
23 | | ; lea rcx, [->hello]
24 | | ; xor edx, edx
25 | | ; mov dl, BYTE string.len() as _
... |
30 | | ; ret
31 | | );
| |______^
error[E0599]: no method named `finalize` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope
--> src/main.rs:33:19
|
33 | let buf = ops.finalize().unwrap();
| ^^^^^^^^
error: aborting due to 8 previous errors
Some way of specifying register size, but not allocation, should be supported. While the parser side will be simple enough, the code generation will be a bit harder when the REX prefix is involved, as the register encoding can be split up across several bytes.
Another issue is the special treatment of RBP, RSP, RIP and the AH-DH registers, due to their use as escapes in the ModRM/SIB bytes. Accurate codegen for this will be significantly more complex than what's currently generated (requiring complex conditionals). Furthermore, complex checking will have to be done at runtime to figure out whether the wanted register combination is even possible.
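The runtime checks mentioned above could look roughly like the following sketch, which only encodes well-known x86-64 encoding rules. All function names are invented for illustration, and none of this is taken from the dynasm-rs sources.

```rust
/// Split a 4-bit register number (0 = rax ... 15 = r15) into the REX
/// extension bit and the 3 bits stored in the ModRM or SIB byte.
fn split_reg(reg: u8) -> (bool, u8) {
    (reg & 0b1000 != 0, reg & 0b0111)
}

/// A base register whose low 3 bits are 100 (rsp, r12) collides with the
/// "SIB byte follows" escape, so it can only be encoded through a SIB byte.
fn base_needs_sib(base: u8) -> bool {
    base & 0b0111 == 0b100
}

/// A base register whose low 3 bits are 101 (rbp, r13) collides with the
/// disp32-only / RIP-relative escape when mod = 00, so it always needs
/// an explicit displacement (possibly zero).
fn base_needs_disp(base: u8) -> bool {
    base & 0b0111 == 0b101
}

/// ah, ch, dh and bh cannot be encoded in an instruction that carries a
/// REX prefix, so such a combination has to be rejected.
fn high_byte_reg_allowed(uses_high_byte_reg: bool, needs_rex: bool) -> bool {
    !(uses_high_byte_reg && needs_rex)
}
```

For a dynamically chosen register these predicates can only be evaluated once the register number is known, which is why the check moves to runtime.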
It would be extremely helpful to be able to dump a text version of the assembly to stdout as the code is being generated. Is this currently possible to do? If not, is this something that could be planned in the future?
thx
`loop` and `in` are broken right now (Rust keywords)
Instruction | Current | Corrected |
---|---|---|
vucomiss | yoyo/yomd | ydyd/ydmd |
vucomisd | yoyo/yomq | yqyq/yqmq |
Current | Corrected |
---|---|
cvtpd2dS | cvtpd2ps |
vcvtpd2dS | vcvtpd2ps |
palign | palignr |
vpalign | vpalignr |
pblenddw | pblendw |
vpblenddw | vpblendw |
vfmaddsuppd | vfmaddsubpd |
Trying to follow the tutorial/example, I was getting the following error:
error: dynasmtest/target/debug/deps/libdynasm-c463bfe8527c932a.so: undefined symbol: __rustc_plugin_registrar_aa7442ab6739c353d00149a1c4ce6dd__
--> src/main.rs:2:11
|
2 | #![plugin(dynasm)]
| ^^^^^^
In order to get dynasm-rs working I had to change the imports as follows:
#![feature(proc_macro_hygiene)]
extern crate dynasmrt;
#[macro_use]
extern crate dynasm;
I'm not sure if this is an issue with my setup or whether this is due to dynasm-rs transitioning from a compiler plugin to proc macros, and the tutorial just hasn't been updated yet.
Is there a plan to make the crate work on stable Rust at some point, i.e. using procedural macros (either via https://github.com/dtolnay/proc-macro-hack or using proper expression macros when they eventually arrive)?
I would like to use dynasm-rs for a JIT for modular sound synthesis, and I don't mind depending on unstable Rust to begin with, but it would be nice to have some assurance that it will work on stable eventually.
Travis decided to break everything again, which is slightly annoying.
Current | NASM |
---|---|
hword | yword |
pword | tword |
mmx0 | mm0 |
This bug contains a listing of improvements to be made to the library, sorted by priority.
General
High
Medium
Low
Assembly dialect specific
x64
Is it possible to like... userspace "trace" an .exe/.dll all the way from its entry point with this library?
Hi,
I'm trying to generate some code like:
dynasm!(self.a
; .arch aarch64
; add X(31), X(1), 1
);
Dynasm successfully assembles this without reporting an error. However, the assembled code is:
add sp, x1, #0x1
Which is incorrect according to the documentation, which specifies that only the dynamic encoding prefix `XSP` should encode an `sp` operand.
Is something like `add xzr, x1, #0x1` unencodable? If that's the case, maybe dynasm should return an error instead of silently generating `sp`.
I'm using dynasm-rs to generate shellcode and inject it into another process. The memory block for the code is in the remote process, allocated via `VirtualAllocEx`. If I use the default `Assembler`, its memory is allocated by Rust. Should I use `VecAssembler` and supply the base address in the remote process manually?
On the latest nightly (1.35) the following error is thrown by the compiler:
error[E0308]: mismatched types
--> /home/jef/.cargo/registry/src/github.com-1ecc6299db9ec823/dynasm-0.2.3/src/lib.rs:51:64
|
51 | allow_internal_unstable: false,
| ^^^^^ expected enum `std::option::Option`, found bool
|
= note: expected type `std::option::Option<std::rc::Rc<[syntax::ast::Symbol]>>`
found type `bool`
This has been broken since at least March 1st, but works with the nightly from February 1st.
Currently we assume all displacements are 32 bits. However, we can generate more compact code by allowing people to specify that their displacement fits in 8 bits. NASM uses the following syntax for this feature:
inc [BYTE rax + 1]
Using rustc 1.28.0-nightly (a3085756e 2018-05-19), I get the following when I try to build dynasm:
error[E0063]: missing field `edition` in initializer of `syntax::ext::base::SyntaxExtension`
--> /Users/tov/.cargo/registry/src/github.com-1ecc6299db9ec823/dynasm-0.2.0/src/lib.rs:44:35
|
44 | SyntaxExtension::NormalTT {
| ^^^^^^^^^^^^^^^^^^^^^^^^^ missing `edition`
This is the same for dynasm 0.1.4 and 0.2.0.
Ref: https://censoredusername.github.io/dynasm-rs/language/langref_aarch64.html
`x86` and `x64` assembling backend"
`modifier` and `vector_reg_name`, which (generally) match reality for AArch64, but the `prefix`, `static_reg_name`, and `dynamic_reg_name` parts only make sense in x86/x86-64
Separately, I have a few suggestions about it:
(`and`/`orr`/`orn`/`eor`/`eon`) only support the standard shifts (`lsl`/`lsr`/`asr`/`ror`) and only with immediate offsets.
Currently no way exists to use labels as immediates/memory locations in generated code. Label arguments are only supported on instructions taking code arguments. This can probably be solved by just generalizing labels as immediates, although significant modifications to the parser will be necessary to use them as memory locations (especially when used as part of an immediate expression).
I'm building a JIT that runs on bare metal and therefore `std` is not available. Is it possible to make `dynasm-rs` compatible with the `no_std` mode (with `alloc`), as long as OS-dependent things like executable memory management are left to the user?
Hi! I'm probably missing something like I always do; however I'm trying to assemble the following snippet, but I get an `Errors were encountered when committing before finalization: ImpossibleRelocation(Global("hello"))` error.
use dynasmrt::{dynasm, DynasmLabelApi};
let mut ops = dynasmrt::x86::Assembler::new().unwrap();
dynasm!(ops
; .arch x86
; lea eax, [->hello]
; ->hello:
; .bytes "hello".as_bytes()
);
println!("{:?}", ops.labels().resolve_global("hello")); // Ok(AssemblyOffset(6))
ops.finalize().unwrap();
I expected `[->hello]` to be resolved as 6... what am I doing wrong?
Currently there's no clear way to get pointers to executable data after encoding has been done. A clear API is needed that facilitates encoding any offsets, committing the result to executable memory, possibly updating this memory, recording offsets into this memory to later create callable/jumpable pointers with (or alternatively, an API to query the location of labels with from outside dynasm blocks).
Currently, any features which span across dynasm blocks have to be evaluated at runtime as each macro invocation is treated separately. If analysis of the entire program is used, the following features become possible:
One problem with whole program analysis is that, when many different AssemblingBuffers are used, each buffer will allocate the memory necessary for all used identifiers.
Found that when writing code like `; mov QWORD [buffer + offset as i32], temp`, where offset is an immediate value, this crate will unconditionally emit relative instructions (with a displacement). Explicitly checking for when offset is zero and emitting `; mov QWORD [buffer], temp` in that case reduces my emitted code significantly (~10%).
Maybe this crate could check if immediate values are zero and choose to emit smaller instructions.
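A runtime check of this kind might be sketched as follows. It picks the smallest x86-64 ModRM displacement form for `[base + disp]`; `encode_disp` is a hypothetical helper, not a dynasm-rs function, and it ignores the SIB and RIP-relative cases.

```rust
/// Choose the ModRM `mod` field and the displacement bytes for `[base + disp]`.
/// `base` is the register number (0 = rax ... 15 = r15).
fn encode_disp(base: u8, disp: i32) -> (u8, Vec<u8>) {
    // With rbp/r13 as base, mod = 00 means "no base register", so a zero
    // displacement still has to be emitted as an explicit disp8.
    let is_rbp_like = base & 0b111 == 0b101;
    if disp == 0 && !is_rbp_like {
        (0b00, Vec::new()) // no displacement bytes at all
    } else if (-128..=127).contains(&disp) {
        (0b01, vec![disp as i8 as u8]) // sign-extended 8-bit displacement
    } else {
        (0b10, disp.to_le_bytes().to_vec()) // full 32-bit displacement
    }
}
```

The same function also covers the user-requested `BYTE` displacement case: an explicit size hint would simply skip the range test and fail if the value does not fit.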
error[E0308]: mismatched types
--> /data/work/dynasm-rs/plugin/src/lib.rs:83:46
|
83 | let token_tree = ecx.expander().fold_tts(token_tree);
| ^^^^^^^^^^ expected struct syntax::tokenstream::TokenStream, found reference
|
= note: expected type syntax::tokenstream::TokenStream
found type &[syntax::tokenstream::TokenTree]
error[E0308]: mismatched types
--> /data/work/dynasm-rs/plugin/src/lib.rs:85:46
|
85 | let mut parser = ecx.new_parser_from_tts(&token_tree);
| ^^^^^^^^^^^ expected slice, found struct syntax::tokenstream::TokenStream
|
= note: expected type &[syntax::tokenstream::TokenTree]
found type &syntax::tokenstream::TokenStream
error: aborting due to 2 previous errors
error: Could not compile `dynasm`.
rustc --version
rustc 1.17.0-nightly (6eb9960d3 2017-03-19)
Each implementer of DynasmLabelApi defines a Relocation type which should be used by the plugin to inform the runtime what kind of relocation is needed. Unfortunately, it is currently not possible to construct the associated type properly in the plugin code generation stage, as it is only aware of the name that the assembler struct is bound to, and not the correct associated type. While future additions to Rust might make this possible (in a `typeof` or `decltype` like manner), currently this means that the Relocation type is limited to being a simple type like a tuple of primitives, and that invalid values in these will cause an error at runtime.
The first error is that the Assembler class has been moved to x64::Assembler.
Now, I get errors like
error: no method named `global_label` found for type `dynasmrt::x64::Assembler` in the current scope
error: no method named `offset` found for type `dynasmrt::x64::Assembler` in the current scope
when trying to run the example code.
In 64-bit mode, one would expect the immediate argument to `mov r64, ...` to be `imm64`, but it appears to be `imm32`. There is apparently no efficient way to set a register to a 64-bit value. AFAICT Rust's `asm!` macro supports this fine.
The `movabs` extension is supported with a 64-bit immediate operand, but only if the destination is rax.
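For reference, the x86-64 instruction set does define a `mov r64, imm64` form for every general-purpose register (`REX.W + B8+rd`). The sketch below hand-encodes it; `encode_mov_r64_imm64` is a made-up helper and says nothing about which mnemonic or operand syntax dynasm-rs accepts for it.

```rust
/// Encode `mov r64, imm64`: a REX.W prefix, opcode B8+rd, then the
/// 8 immediate bytes in little-endian order.
fn encode_mov_r64_imm64(reg: u8, imm: u64) -> Vec<u8> {
    assert!(reg < 16, "register number out of range");
    let rex = 0x48 | (reg >> 3); // REX.W, plus REX.B for r8..r15
    let opcode = 0xB8 + (reg & 0b111); // low 3 register bits live in the opcode
    let mut out = vec![rex, opcode];
    out.extend_from_slice(&imm.to_le_bytes());
    out
}
```

The result is always 10 bytes long, which is why assemblers usually prefer the shorter `imm32` forms whenever the value fits.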
When trying to use the instruction `repne scasb` I get the following error:
Cannot use prefix repne on this instruction
A quick glance at the code shows that it might be due to a typo here (REP used instead of ¿REPE?), though I am not sure that I follow what this does exactly.
The `VecAssembler` is part of the public interface, but there doesn't seem to be a way to construct or obtain one.
Hi!
I'm using `dynasmrt` to generate code at runtime and make a remote process execute it.
I actually need the conversion from text to bytes.
I couldn't find a way to make this simple snippet work.
dynasm!(assembler
; .arch aarch64
// CREATE DIR
; mov x8, 0x22
; mov x0, 0
; ldr x1, ->dir_name
; mov x2, 0
; svc 0
// EXIT
; mov x8, 0x5d
; mov x0, 0
; svc 0
; ->dir_name:
; .bytes "/data/local/tmp/TEST_DIR\0".as_bytes()
);
I already tried this on a x86-64 machine (using the x86-64 instruction set) and it worked without problems.
That `ldr` instruction seems a bit off, shouldn't it be `ldr x1 ...`?
`strace` view:
mkdirat(0, "\1", 000) = -1 ENOTDIR (Not a directory)
exit(0) = ?
Generated bytes:
480480D2
000080D2
18000058 // ldr x24, #0x18 - shouldn't it be ldr x1, #0x18 [81000058]?
020080D2
010000D4
A80B80D2
000080D2
010000D4
2F646174
612F6C6F
63616C2F
746D702F
54455354
5F444952
For instance, the same snippet
mov x8, 0x22
mov x0, 0
ldr x1, =dir_name
mov x2, 0
svc 0
mov x8, 0x5d
mov x0, 0
svc 0
dir_name:
.ascii "/data/local/tmp/TEST_DIR\0"
compiled and linked with `as` and `ld` (bundled with the NDK toolchain), works.
Am I generating the code correctly?
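For reference, the 64-bit `LDR (literal)` encoding can be sketched as below. In the listing above, the word `18000058` read as a little-endian `u32` is `0x58000018`, whose low five bits (the `Rt` field) hold 24, which is also the byte distance from the `ldr` to the label. The function names are made up for this illustration and this is not dynasmrt's code. Separately, `LDR (literal)` loads the 8 bytes stored at the label, whereas GNU as's `ldr x1, =dir_name` loads the label's address from a literal pool, so the two snippets are not equivalent even when the encoding is right.

```rust
/// Encode `ldr Xt, <label>` (64-bit LDR literal) for a PC-relative byte offset.
/// Returns None if the register or offset cannot be encoded.
fn encode_ldr_literal_x(rt: u8, byte_offset: i32) -> Option<u32> {
    // The offset must be a multiple of 4 and fit in a signed 19-bit word count.
    if rt > 31 || byte_offset % 4 != 0 {
        return None;
    }
    let words = byte_offset / 4;
    if !(-(1 << 18)..(1 << 18)).contains(&words) {
        return None;
    }
    let imm19 = (words as u32) & 0x7_FFFF;
    Some(0x5800_0000 | (imm19 << 5) | rt as u32)
}

/// Decode the (Rt, byte offset) fields of a 64-bit LDR literal word.
fn decode_ldr_literal_x(word: u32) -> (u8, i32) {
    let rt = (word & 0x1F) as u8;
    let imm19 = ((word >> 5) & 0x7_FFFF) as i32;
    let words = (imm19 << 13) >> 13; // sign-extend the 19-bit field
    (rt, words * 4)
}
```

Under these rules, `ldr x1` with a 24-byte forward offset encodes as `0x580000C1`, while the emitted word decodes to register 24 with offset 0.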
In the documentation macros are used to create generic code, however macros tend to become unwieldy as the code inside grows and needs more information. It would be nice if there was a way to be able to reference a register in a way that is generic at the type level, or at least some way to reference a register from a variable.
fn popcnt<Reg: Register>(asm: Assembler) {
dynasm!(asm
; popcnt Reg(5), [rsp]
);
}
popcnt::<Rq>(asm);
popcnt::<Rd>(asm);
I'm using Rust nightly 1.27 and dynasm does not compile in it :(
rustc 1.27.0-nightly (7360d6dd6 2018-04-15)
Compiling dynasm results in:
Compiling dynasm v0.1.4
error[E0609]: no field `identifier` on type `&syntax::ast::PathSegment`
--> /home/bruno/.cargo/registry/src/github.com-1ecc6299db9ec823/dynasm-0.1.4/src/arch/x64/parser.rs:550:31
|
550 | Some(Ident {node: segment.identifier, span: path.span})
| ^^^^^^^^^^
error: aborting due to previous error
For more information about this error, try `rustc --explain E0609`.
error: Could not compile `dynasm`.
To learn more, run the command again with --verbose.
It would be nice to have a map of supported rust versions.
PS: I know the cause of this error is probably the instability of rustc API
Currently, dynasm-rs supports setbe, seta, setae, setl, setle, setg, setge, sete and setne but not setb. There's no reason for it not to, it just fails if you try to emit it. As far as I can tell `setc` and `setnae` are precisely equivalent, but it still means that intent is harder to express than it should be. Other synonyms are supported, there's no reason for this one not to be.
I get a bunch of errors saying:
`check` is private.
missing field `local_inner_macros` in initializer of `syntax::ext::base::SyntaxExtension`.
no field `parameters` on type `&syntax::ast::PathSegment`.
I'm guessing something changed in `syntax` recently, but I might just be doing something stupid.
Edit: The README says that only "rustc 1.28.0-nightly (a1d4a9503 2018-05-20)" is guaranteed to work, so I guess this is expected.
It is not compiling on 1.26.0-nightly (e5277c145 2018-03-28).
I fixed it like this. If it is OK, I'll open a pull request.
~/d/plugin ❯❯❯ cargo build
Updating registry `https://github.com/rust-lang/crates.io-index`
Compiling stable_deref_trait v1.0.0
Compiling bitflags v0.9.1
Compiling lazy_static v0.2.11
Compiling owning_ref v0.3.3
Compiling dynasm v0.1.3 (file:///home/edstefes/Projects/dynasm-rs/plugin)
error[E0063]: missing field `unstable_feature` in initializer of `syntax::ext::base::SyntaxExtension`
--> src/lib.rs:44:35
|
44 | SyntaxExtension::NormalTT {
| ^^^^^^^^^^^^^^^^^^^^^^^^^ missing `unstable_feature`
error[E0023]: this pattern has 1 field, but the corresponding tuple variant has 2 fields
--> src/arch/x64/parser.rs:380:9
|
380 | token::Ident(ast::Ident {ref name, ..}) if &*name.as_str() == kw => (),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected 2 fields, found 1
error[E0023]: this pattern has 1 field, but the corresponding tuple variant has 2 fields
--> src/arch/x64/parser.rs:388:12
|
388 | if let token::Ident(i) = parser.token {
| ^^^^^^^^^^^^^^^ expected 2 fields, found 1
error: aborting due to 3 previous errors
Some errors occurred: E0023, E0063.
For more information about an error, try `rustc --explain E0023`.
error: Could not compile `dynasm`.
To learn more, run the command again with --verbose.
This is currently blocked by advancements that still need to be made in rustc itself (primarily the impossibility of expanding macros against token trees), which makes it impossible to expand macros before parsing. It should already be possible to expand macros inside expressions though, but this will only allow them to expand into single arguments instead of whole chunks of asm.
I'm trying to alter a small part (~60 bytes) of code in a ~10MB `Assembler`. It is pretty slow on AArch64, taking around 200 microseconds.
By digging into the implementation a bit I found that the underlying make_exec method always calls `mprotect` on the entire executable memory. I haven't looked into the kernel `mprotect` implementation, but I assume it will invalidate instruction caches & TLBs for the entire region too since this is required to enforce memory protections.
Is it possible to track modifications dynamically inside `alter`, and make mprotect changes on-demand?
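Tracking the modified range would make it possible to change protections only on the pages that were actually touched. A minimal sketch of the arithmetic, assuming the buffer itself starts on a page boundary and that the page size is a power of two (`dirty_page_span` is a hypothetical helper, not part of dynasmrt):

```rust
/// Return the page-aligned (start, length) that covers the byte range
/// `offset..offset + len` inside a page-aligned buffer.
fn dirty_page_span(offset: usize, len: usize, page_size: usize) -> (usize, usize) {
    assert!(page_size.is_power_of_two());
    let mask = page_size - 1;
    let start = offset & !mask; // round the start down to a page boundary
    let end = (offset + len + mask) & !mask; // round the end up
    (start, end - start)
}
```

For a 60-byte patch this yields one or two pages rather than the whole ~10MB mapping, and the resulting span is what would be handed to the protection change.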
It seems that `Assembler::alter` (which uses `mprotect`) does not flush the instruction cache after code has changed, and subsequent execution sometimes still uses the old code.
This issue can only be reproduced on a real AArch64 CPU (qemu-aarch64 on x86-64 machines works fine).
Adding explicit `ic ivau` instructions after calling `.alter` fixes this.
On macOS, the recommended way to implement a JIT system is by creating the memory map with `PROT_WRITE | PROT_EXEC` and the `MAP_JIT` flag, then using `pthread_jit_write_protect_np` to switch between writing and executing the buffer.
(This is kinda weird, because the W^X behavior is tracked on a per-thread basis, rather than per-region; I found it easiest to only enable W right before copying into the region, then disable it afterwards.)
Anyways, it turns out that this is much faster than using `mmap` to swap regions from `PROT_WRITE` to `PROT_EXEC`!
Here's a flamegraph using `mmap` (note the calls to `mprotect` and `memmove` taking up a good chunk of time).
Here's what it looks like with `pthread_jit_write_protect_np` (those calls are gone, and `pthread_jit_write_protect_np` doesn't even show up).
I see one benchmark go from 112 ms down to 62 ms, almost a 50% improvement!
(My benchmarks are admittedly weird, in that they compile a lot of very small functions 😆)
This requires ditching / forking the `memmap2` crate, which doesn't support this behavior. Here's how I did it.
Right now, it's easy for users to do this on their own: I'm using a `VecAssembler`, then copying into this custom `struct Mmap`, which works fine. Still, this would be a decent optimization for the stock `Assembler`.
As always, `dynasm-rs` is great, and I really appreciate the work that went into it!
Currently there are no real tests apart from a general feature checking file.
The instruction encoding data in particular really needs to be checked.
I know you are aware of this @CensoredUsername, however `cargo audit` has just recently started complaining about it:
Crate: memmap
Version: 0.7.0
Warning: unmaintained
Title: memmap is unmaintained
Date: 2020-12-02
ID: RUSTSEC-2020-0077
URL: https://rustsec.org/advisories/RUSTSEC-2020-0077
Dependency tree:
memmap 0.7.0
├── ykview 0.1.0
├── yktrace 0.1.0
│ ├── ykview 0.1.0
│ ├── ykrt 0.1.0
│ └── ykcompile 0.1.0
│ └── ykrt 0.1.0
└── dynasmrt 1.0.0
└── ykcompile 0.1.0
I think you said you had another library in mind. Is it time to switch to it perhaps?
We're at the Rust All Hands and want to compile HolyJIT. This is super urgent. (Not.)
Currently it points to https://censoredusername.github.com/dynasm-rs/language/index.html, but it should point to https://censoredusername.github.io/dynasm-rs/language/index.html (note the TLD) like it does in the README.
Currently only long mode is supported, while it should be possible to also support protected mode with modifications to the encoding infrastructure
Trying to use
dynasm!(self ; and w0, w0, 255);
will lead to
thread '<unnamed>' panicked at 'attempt to shift left with overflow', /home/pi/.cargo/registry/src/github.com-1ecc6299db9ec823/dynasmrt-1.2.0/src/aarch64.rs:208:27
(the element_size is 32)
0xff can be converted to a logical immediate. It should lead to immr==0 and imms==7.
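The expected values can be reproduced with a standalone encoder. The sketch below follows the commonly used approach of finding the repeating element, then its rotation and run length; it is not the dynasmrt implementation, and both function names are invented here. All shifts are done on `u64`, so an element size of 32 cannot overflow.

```rust
/// Returns true if `v` is one contiguous run of set bits (e.g. 0b0111_1000).
fn is_shifted_mask(v: u64) -> bool {
    if v == 0 {
        return false;
    }
    let filled = (v - 1) | v; // fill the zeros below the run
    (filled & filled.wrapping_add(1)) == 0 // must now look like 0..01..1
}

/// Encode `imm` as an AArch64 logical immediate for a 32- or 64-bit register.
/// Returns (N, immr, imms), or None if the value is not encodable.
fn encode_logical_imm(imm: u64, reg_size: u32) -> Option<(u8, u8, u8)> {
    assert!(reg_size == 32 || reg_size == 64);
    let reg_mask = u64::MAX >> (64 - reg_size);
    // All-zeros, all-ones and out-of-range values have no encoding.
    if imm == 0 || (imm & !reg_mask) != 0 || imm == reg_mask {
        return None;
    }
    // Find the smallest element size whose repetition produces `imm`.
    let mut size = reg_size;
    while size > 2 {
        let half = size / 2;
        let mask = (1u64 << half) - 1;
        if (imm & mask) != ((imm >> half) & mask) {
            break;
        }
        size = half;
    }
    let mask = u64::MAX >> (64 - size);
    let elem = imm & mask;
    let (rotation, ones) = if is_shifted_mask(elem) {
        let tz = elem.trailing_zeros();
        (tz, (elem >> tz).trailing_ones())
    } else {
        // The run of ones wraps around the element boundary.
        let filled = elem | !mask;
        if !is_shifted_mask(!filled) {
            return None;
        }
        let lead = filled.leading_ones();
        (64 - lead, lead + filled.trailing_ones() - (64 - size))
    };
    let immr = (size - rotation) & (size - 1);
    // The upper bits of imms (and N) encode the element size.
    let nimms = (!(size - 1) << 1) | (ones - 1);
    let n = ((nimms >> 6) & 1) ^ 1;
    Some((n as u8, immr as u8, (nimms & 0x3F) as u8))
}
```

Calling `encode_logical_imm(0xFF, 32)` yields `N = 0`, `immr = 0` and `imms = 7`, matching the values stated in the report.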
I'm not deeply familiar with assembly, assemblers or compilers, but I believe that one optimisation that might be useful to do in `dynasm` (and that isn't really possible to do as a user of the library) is branch relaxation. Essentially, this means using the branch instructions that take a PC-relative offset of 8 bits or 16 bits when possible instead of always using pointer-sized absolute immediate jumps, with the reasoning that these are faster (and smaller?) for the majority-case of doing a same-function jump of only a few instructions. AFAICT this isn't done by `dynasm` right now, it just always does a static jump to an immediate of a fixed size, and then patches the value of the immediate only.
Of course, as I said I'm not too familiar with assemblers and so it's possible that these non-PC-relative jumps are always faster for non-PIE code and the 8-bit PC-relative jumps are only generated by LLVM etc. because they need to generate PIEs, whereas `dynasm` does not.
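As a point of reference, the core decision in branch relaxation on x86-64 is choosing between the 2-byte `jmp rel8` (`EB`) and the 5-byte `jmp rel32` (`E9`); both are PC-relative, measured from the end of the jump. The helper below is a hypothetical sketch of that single decision. A real relaxation pass has to iterate, because shrinking one jump changes the distances of others.

```rust
/// Encode an unconditional x86 `jmp` from the instruction starting at
/// address `from` to address `to`, using the shortest form that fits.
fn encode_jmp(from: i64, to: i64) -> Option<Vec<u8>> {
    // Displacements are measured from the end of the jump instruction.
    let short = to - (from + 2);
    if (-128..=127).contains(&short) {
        return Some(vec![0xEB, short as i8 as u8]);
    }
    let near = to - (from + 5);
    if near < i32::MIN as i64 || near > i32::MAX as i64 {
        return None; // out of range for a direct relative jump
    }
    let mut out = vec![0xE9];
    out.extend_from_slice(&(near as i32).to_le_bytes());
    Some(out)
}
```

The difficulty for a runtime assembler is that the target of a forward jump is not known when the jump is emitted, so the size has to be guessed or patched later.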
I noticed that hashmap lookups are taking a decent amount of JIT time when using local labels.
For example, if I manually compute jumps in this code:
dynasm!(ops
// Basically the same as MinRegReg
; zip2 v4.s2, V(lhs_reg).s2, V(rhs_reg).s2
; zip1 v5.s2, V(rhs_reg).s2, V(lhs_reg).s2
; fcmgt v5.s2, v5.s2, v4.s2
; fmov x15, d5
; tst x15, #0x1_0000_0000
; b.ne >lhs
; tst x15, #0x1
; b.eq >both
// LHS < RHS
; fmov D(out_reg), D(rhs_reg)
; mov w16, #CHOICE_RIGHT
; b >end
// RHS < LHS
;lhs:
; fmov D(out_reg), D(lhs_reg)
; mov w16, #CHOICE_LEFT
; b >end
;both:
; fmax V(out_reg).s2, V(lhs_reg).s2, V(rhs_reg).s2
; mov w16, #CHOICE_BOTH
;end:
; strb w16, [x0], #1 // post-increment
)
I end up with something like this:
dynasm!(ops
; zip2 v4.s2, V(lhs_reg).s2, V(rhs_reg).s2
; zip1 v5.s2, V(rhs_reg).s2, V(lhs_reg).s2
; fcmgt v5.s2, v5.s2, v4.s2
; fmov x15, d5
; tst x15, #0x1_0000_0000
; b.ne #24 // -> lhs
; tst x15, #0x1
; b.eq #28 // -> both
// LHS < RHS
; fmov D(out_reg), D(rhs_reg)
; mov w16, #CHOICE_RIGHT
; b #24 // -> end
// <- lhs (when RHS < LHS)
; fmov D(out_reg), D(lhs_reg)
; mov w16, #CHOICE_LEFT
; b #12 // -> end
// <- both
; fmax V(out_reg).s2, V(lhs_reg).s2, V(rhs_reg).s2
; mov w16, #CHOICE_BOTH
// <- end
; strb w16, [x0], #1 // post-increment
)
In my codebase, this reduces the time spent in `dynasm` by about 30%, which is a decent chunk of performance!
It would be great to introduce a new flavor of label which is only valid during a single `dynasm!` block; the branch offset could then be computed at compile-time instead of runtime.
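The computation such a block-local label would need is a plain two-pass resolution over the block, which the sketch below models outside of any macro. `Item` and `resolve_branches` are invented names, every instruction is assumed to be 4 bytes (as on AArch64), and each label name is assumed to be defined exactly once in the block.

```rust
use std::collections::HashMap;

/// One entry of an assembly block, reduced to what matters for offsets.
enum Item {
    Insn,                 // any non-branching instruction
    Branch(&'static str), // a branch to a label in the same block
    Label(&'static str),  // a label definition (emits no bytes)
}

/// Return the PC-relative byte offset of every branch, in program order,
/// or None if a branch refers to a label that is not defined in the block.
fn resolve_branches(items: &[Item]) -> Option<Vec<i64>> {
    // Pass 1: record the byte position of every label.
    let mut labels = HashMap::new();
    let mut pc = 0i64;
    for item in items {
        match item {
            Item::Label(name) => {
                labels.insert(*name, pc);
            }
            _ => pc += 4,
        }
    }
    // Pass 2: compute target minus branch position for each branch.
    let mut offsets = Vec::new();
    let mut pc = 0i64;
    for item in items {
        match item {
            Item::Label(_) => {}
            Item::Insn => pc += 4,
            Item::Branch(target) => {
                offsets.push(labels.get(target)? - pc);
                pc += 4;
            }
        }
    }
    Some(offsets)
}
```

Feeding it the layout of the block above (five instructions, `b.ne`, `tst`, `b.eq`, two instructions, `b`, and so on) reproduces the hand-computed offsets 24, 28, 24 and 12.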
This is currently due to a run-time limitation (it is assumed that a relocation is always at the end of an instruction) which is not true in case of RIP-relative addressing with an immediate.