censoredusername / dynasm-rs

A dynasm-like tool for rust.

Home Page: https://censoredusername.github.io/dynasm-rs/language/index.html

License: Mozilla Public License 2.0

Rust 80.53% Shell 0.23% Python 19.24%

assembly dynasm jit rust

dynasm-rs's Issues

Immediate encoding

Look into immediate optimization.

Building assembler code in reverse

For our experimental JIT compiler, which @vext01 mentioned in #47, we want to compile the instructions of a trace backwards, as that seems to be what other JITs do. One reason is that it simplifies register allocation (see [1]).

Is that something that would be possible using dynasm-rs? I've seen that there's a function alter which allows patching the buffer, but I don't think that allows us to reverse the compiled instructions. One way I can imagine doing this would be to cache the result of each dynasm! macro call. Then, when we are done, we simply reverse the order of the cached results before making them executable. I assume that this is not currently possible, but I was wondering if there is another way that dynasm-rs could be used to achieve this. If not, would you be happy to accept a change that makes this possible?

[1] https://dl.acm.org/doi/pdf/10.1145/3132190.3132209
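
The caching idea described above can be sketched without dynasm-rs at all. The following is a minimal illustration only: ReverseBuffer, push_chunk and finish are hypothetical names, not part of the dynasm-rs API, and the approach assumes every cached chunk is position-independent and contains no jumps or label references into other chunks.

```rust
/// Hypothetical helper: collects independently assembled chunks and emits
/// them in reverse order of insertion.
#[derive(Default)]
struct ReverseBuffer {
    chunks: Vec<Vec<u8>>,
}

impl ReverseBuffer {
    /// Store the machine code for one trace instruction.
    /// The trace is walked backwards, so the last instruction is pushed first.
    fn push_chunk(&mut self, bytes: &[u8]) {
        self.chunks.push(bytes.to_vec());
    }

    /// Concatenate the cached chunks in reverse order of insertion.
    fn finish(self) -> Vec<u8> {
        let mut out = Vec::with_capacity(self.chunks.iter().map(Vec::len).sum());
        for chunk in self.chunks.into_iter().rev() {
            out.extend_from_slice(&chunk);
        }
        out
    }
}
```

Pushing the bytes for ret first and xor rax, rax second yields a buffer in which the xor precedes the ret. Relocations that cross chunk boundaries would need extra bookkeeping that this sketch does not attempt.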

Broken link to the documentation

Hi!

Let me file a (minor) bug.

The README file contains a link to the documentation that currently responds with 404:

## Documentation

[Documentation](https://CensoredUsername.github.com/dynasm-rs/language/index.html).

Expected result: the link should open the documentation.

Tutorial doesn't compile

I copied the initial tutorial code from here (which is the same as the one in the tree) into a "cargo init --bin" tree, added dependencies on dynasm 0.2.3, installed nightly Rust, and ran cargo build. I got the following compile errors:

   Compiling test v0.1.0 (/home/me/test)                                                                                                                 
warning: unused `#[macro_use]` import                                                                                                                         
 --> src/main.rs:4:1                                                                                                                                          
  |                                                                                                                                                           
4 | #[macro_use]                                                                                                                                              
  | ^^^^^^^^^^^^                                                                                                                                              
  |                                                                                                                                                           
  = note: #[warn(unused_imports)] on by default                                                                                                               
                                                                                                                                                              
error[E0599]: no method named `global_label` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope              
  --> src/main.rs:16:5                                                                                                                                        
   |                                                                                                                                                          
16 | /     dynasm!(ops                                                                                                                                        
17 | |         ; ->hello:                                                                                                                                     
18 | |         ; .bytes string.as_bytes()                                                                                                                     
19 | |     );                                                                                                                                                 
   | |______^                                                                                                                                                 
                                                                                                                                                              
error[E0599]: no method named `extend` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope                    
  --> src/main.rs:16:5                                                                                                                                        
   |                                                                                                                                                          
16 | /     dynasm!(ops                                                                                                                                        
17 | |         ; ->hello:                                                                                                                                     
18 | |         ; .bytes string.as_bytes()                                                                                                                     
19 | |     );                                                                                                                                                 
   | |______^                                                                                                                                                 
                                                                                                                                                              
error[E0599]: no method named `offset` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope                    
  --> src/main.rs:21:21                                                                                                                                       
   |                                                                                                                                                          
21 |     let hello = ops.offset();                                                                                                                            
   |                     ^^^^^^                                                                                                                               
                                                                                                                                                              
error[E0599]: no method named `extend` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope                    
  --> src/main.rs:22:5                                                                                                                                        
   |                                                                                                                                                          
22 | /     dynasm!(ops                                                                                                                                        
23 | |         ; lea rcx, [->hello]                                                                                                                           
24 | |         ; xor edx, edx                                                                                                                                 
25 | |         ; mov dl, BYTE string.len() as _                                                                                                               
...  |                                                                                                                                                        
30 | |         ; ret                                                                                                                                          
31 | |     );                                                                                                                                                 
   | |______^                                                                                                                                                 
                                                                                                                                                              
error[E0599]: no method named `global_reloc` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope              
  --> src/main.rs:22:5                                                                                                                                        
   |                                                                                                                                                          
22 | /     dynasm!(ops                                                                                                                                        
23 | |         ; lea rcx, [->hello]                                                                                                                           
24 | |         ; xor edx, edx                                                                                                                                 
25 | |         ; mov dl, BYTE string.len() as _                                                                                                               
...  |                                                                                                                                                        
30 | |         ; ret                                                                                                                                          
31 | |     );                                                                                                                                                 
   | |______^                                                                                                                                                 
                                                                                                                                                              
error[E0599]: no method named `push_i8` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope                   
  --> src/main.rs:22:5                                                                                                                                        
   |                                                                                                                                                          
22 | /     dynasm!(ops                                                                                                                                        
23 | |         ; lea rcx, [->hello]                                                                                                                           
24 | |         ; xor edx, edx                                                                                                                                 
25 | |         ; mov dl, BYTE string.len() as _                                                                                                               
...  |                                                                                                                                                        
30 | |         ; ret                                                                                                                                          
31 | |     );                                                                                                                                                 
   | |______^                                                                                                                                                 
                                                                                                                                                              
error[E0599]: no method named `push_i64` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope                  
  --> src/main.rs:22:5                                                                                                                                        
   |                                                                                                                                                          
22 | /     dynasm!(ops                                                                                                                                        
23 | |         ; lea rcx, [->hello]                                                                                                                           
24 | |         ; xor edx, edx                                                                                                                                 
25 | |         ; mov dl, BYTE string.len() as _                                                                                                               
...  |                                                                                                                                                        
30 | |         ; ret                                                                                                                                          
31 | |     );                                                                                                                                                 
   | |______^                                                                                                                                                 
                                                                                                                                                              
error[E0599]: no method named `finalize` found for type `std::result::Result<dynasmrt::x64::Assembler, std::io::Error>` in the current scope                  
  --> src/main.rs:33:19                                                                                                                                       
   |                                                                                                                                                          
33 |     let buf = ops.finalize().unwrap();                                                                                                                   
   |                   ^^^^^^^^                                                                                                                               
                                                                                                                                                              
error: aborting due to 8 previous errors

Dynamic register allocation

Some way of specifying a register's size, but not its allocation, should be supported. While the parser side will be simple enough, the code generation will be a bit harder when the REX prefix is involved, as the register encoding can be split up across several bytes.

Another issue is the special treatment of the RBP, RSP, RIP and AH-DH registers, due to their use in escaping in the ModRM/SIB bytes. Accurate codegen for this will be significantly more complex than what's currently generated (requiring complex conditionals). Furthermore, complex checking will have to be done at runtime to figure out if the wanted register combination is even possible.
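
To illustrate how a register number ends up split across bytes, here is a small sketch for the register-to-register form of mov only. The function name is made up for this example and is not how dynasm-rs generates code; it also sidesteps the RBP/RSP/AH-DH special cases, which only arise with memory operands or 8-bit registers.

```rust
/// Encode `mov dst, src` for two 64-bit general-purpose registers whose
/// numbers (0..=15) are only known at runtime.
fn encode_mov_r64_r64(dst: u8, src: u8) -> Option<[u8; 3]> {
    if dst > 15 || src > 15 {
        return None; // not a valid general-purpose register number
    }
    // REX prefix layout is 0100WRXB: W selects 64-bit operands,
    // R holds bit 3 of ModRM.reg, B holds bit 3 of ModRM.rm.
    let rex = 0x48 | ((src >> 3) << 2) | (dst >> 3);
    // Opcode 0x89 is `mov r/m64, r64`: destination in rm, source in reg.
    // The low three bits of each register number go into ModRM.
    let modrm = 0xC0 | ((src & 7) << 3) | (dst & 7);
    Some([rex, 0x89, modrm])
}
```

For example, dst = 8 (r8) and src = 15 (r15) puts the high bit of each register into the REX byte and the low three bits into ModRM, which is the split the issue refers to.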

Is emitting text output possible?

It would be extremely helpful to be able to dump a text version of the assembly to stdout as the code is being generated. Is this currently possible to do? If not, is this something that could be planned in the future?

thx

Mnemonic corrections

Broken

loop and in are broken right now (Rust keywords)

Additions

fistp - currently only fist is present. Should be fistp for /3 and /7
fucomip
pshufhw - pshuflw is present, but not the packed high words version
vpshufhw - vpshuflw is present, but not the packed high words version

AMD Only

vpermil2 variants were removed by Intel in 2009

Malformed encoding formats

Instruction	Current	Corrected
vucomiss	yoyo/yomd	ydyd/ydmd
vucomisd	yoyo/yomq	yqyq/yqmq

Typos

Current	Corrected
cvtpd2dS	cvtpd2ps
vcvtpd2dS	vcvtpd2ps
palign	palignr
vpalign	vpalignr
pblenddw	pblendw
vpblenddw	vpblendw
vfmaddsuppd	vfmaddsubpd

Unavailable in NASM

monitorx
mwaitx

Example/Tutorial inaccurate?

Trying to follow the tutorial/example, I was getting the following error:

error: dynasmtest/target/debug/deps/libdynasm-c463bfe8527c932a.so: undefined symbol: __rustc_plugin_registrar_aa7442ab6739c353d00149a1c4ce6dd__
 --> src/main.rs:2:11
  |
2 | #![plugin(dynasm)]
  |           ^^^^^^

In order to get dynasm-rs working I had to change the imports as follows:

#![feature(proc_macro_hygiene)]
extern crate dynasmrt;
#[macro_use]
extern crate dynasm;

I'm not sure if this is an issue with my setup or whether this is due to dynasm-rs transitioning from a compiler plugin to proc macros, and the tutorial just hasn't been updated yet.

Use on stable Rust?

Is there a plan to make the crate work on stable Rust at some point, i.e. using procedural macros (either via https://github.com/dtolnay/proc-macro-hack or using proper expression macros when they eventually arrive)?

I would like to use dynasm-rs for a JIT for modular sound synthesis, and I don't mind depending on unstable Rust to begin with, but it would be nice to have some assurance that it will work on stable eventually.

Switch to github actions

Travis decided to break everything again, which is slightly annoying.

Enhance NASM compatibility

Renaming

Current	NASM
hword	yword
pword	tword
mmx0	mm0

Long-term planning metabug

This bug contains a listing of improvements to be made to the library, sorted by priority.

General

High

- Modularizing different assembly dialects. Currently the only supported assembly language is x64, both in the plugin and the runtime. For the plugin, the assembly language to use should preferably be a setting determined by directives or crate attributes, while for the runtime, having each dialect in a separate submodule implementing the same traits would be the way to go. As for internal plugin infrastructure, it seems best to me that different assembly languages have completely separate parsing and compiling implementations (dialects like x86/x64 could share them), but all of them end up producing something that a common serialization module can understand.
- x86 support
- x64 support needs to be tested thoroughly.
- A comprehensive testing framework that allows testing not only the current implementation but also future extensions with reference tools.

Medium

- Toolchain improvement: as the project consists of several different crates in subfolders, building and testing everything is currently less than ergonomic.
- Improving the quality of error messages. Mainly, the error messages when an instruction isn't found or when an instruction variant doesn't exist are a little bare-bones.
- Currently, no checking is done on redundant prefixes or impossible prefixes.
- Several ops (like x64 movsx) default to a certain argument size if the size of an argument isn't correctly specified. This should result in an error.
- ARM support
- Assembler::align is hardcoded to align to what is required by x64.
- Support in the runtime to keep track of all relocations caused by labels. This is important on the roadmap for x86 support, or x64 code only using a 32-bit address space.
- Review the x64 operand size determination code and how it deals with immediate sizes.

Low

- x64/x86 address displacement size hint
- Automatic size optimization for constant immediate / displacement arguments
- AVX-512 support
- MIPS support

Assembly dialect specific

- x64
  - Jumps with 16-bit relative offsets are illegal in long mode on Intel CPUs (they work fine on AMD, though).

Example of how to run a PE32 executable/library?

Is it possible to like... userspace "trace" an .exe/.dll all the way from its entry point with this library?

Incorrect instruction encoding on AArch64 when using `xzr` as the destination register

Hi,

I'm trying to generate some code like:

        dynasm!(self.a
            ; .arch aarch64
            ; add X(31), X(1), 1
        );

Dynasm successfully assembles this without reporting an error. However, the assembled code is:

add     sp, x1, #0x1

This is incorrect according to the documentation, which specifies that only the dynamic encoding prefix XSP should encode an sp operand.

Is something like add xzr, x1, #0x1 unencodable? If that's the case, maybe dynasm should return an error instead of generating sp silently.
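
For context, in the A64 ADD (immediate) instruction, register number 31 in the destination field denotes SP rather than XZR, which is why the output above disassembles as add sp. The sketch below shows the raw encoding and a hypothetical strict variant that refuses register 31. Both function names are invented for this illustration and are not dynasm-rs APIs.

```rust
/// Encode the 64-bit `ADD (immediate)` instruction without a shift.
/// In this instruction class, register number 31 in Rd or Rn means SP.
fn encode_add_imm64(rd: u8, rn: u8, imm12: u16) -> Option<u32> {
    if rd > 31 || rn > 31 || imm12 > 0xFFF {
        return None;
    }
    // sf=1, op=0, S=0, fixed bits 100010, sh=0 give the base word 0x91000000.
    Some(0x9100_0000 | (u32::from(imm12) << 10) | (u32::from(rn) << 5) | u32::from(rd))
}

/// Hypothetical strict variant for callers that asked for a general-purpose
/// register (X0..X30): register 31 is rejected instead of silently becoming SP.
fn encode_add_imm64_no_sp(rd: u8, rn: u8, imm12: u16) -> Option<u32> {
    if rd == 31 || rn == 31 {
        return None;
    }
    encode_add_imm64(rd, rn, imm12)
}
```

With rd = 31, rn = 1 and an immediate of 1, the first function produces the word for add sp, x1, #0x1, while the strict variant returns None, which is roughly the error behaviour the issue asks for.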

How to choose assembler for shellcode generation

I'm using dynasm-rs to generate shellcode and inject it into another process. The memory block for the code is in the remote process, allocated via VirtualAllocEx. If I use the default Assembler, its memory is allocated by Rust. Should I use VecAssembler and supply the base address in the remote process manually?

Broken on latest nightly

On the latest nightly (1.35) the following error is thrown by the compiler:

error[E0308]: mismatched types
  --> /home/jef/.cargo/registry/src/github.com-1ecc6299db9ec823/dynasm-0.2.3/src/lib.rs:51:64
   |
51 |                                       allow_internal_unstable: false,
   |                                                                ^^^^^ expected enum `std::option::Option`, found bool
   |
   = note: expected type `std::option::Option<std::rc::Rc<[syntax::ast::Symbol]>>`
              found type `bool`

This has been broken since at least March 1st, but works with the nightly from February 1st.

Address displacement size hint

Currently we assume all displacements are 32 bits. However, we can generate more compact code by allowing people to specify that their displacement fits in 8 bits. NASM uses the following syntax for this feature:

inc [BYTE rax + 1]
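
As a rough illustration of what such a hint changes, the sketch below hand-encodes inc QWORD [rax + disp] with either an 8-bit or a 32-bit displacement. DispSize and the function name are assumptions made for this example; they are not dynasm-rs syntax or API.

```rust
/// Hypothetical size hint for a memory displacement.
#[derive(Clone, Copy)]
enum DispSize {
    Byte,
    Dword,
}

/// Encode `inc QWORD [rax + disp]` using the requested displacement size.
/// Returns None if the displacement does not fit the hinted size.
fn encode_inc_qword_rax_disp(disp: i32, size: DispSize) -> Option<Vec<u8>> {
    // 0x48 is REX.W, 0xFF with ModRM.reg = 0 is `inc r/m64`.
    let mut out = vec![0x48, 0xFF];
    match size {
        DispSize::Byte => {
            if !(-128..=127).contains(&disp) {
                return None;
            }
            out.push(0x40); // ModRM: mod=01 (disp8), reg=0, rm=rax
            out.push(disp as i8 as u8);
        }
        DispSize::Dword => {
            out.push(0x80); // ModRM: mod=10 (disp32), reg=0, rm=rax
            out.extend_from_slice(&disp.to_le_bytes());
        }
    }
    Some(out)
}
```

For a displacement of 1, the hinted form is four bytes long and the default form is seven, which is the saving the issue is after.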

Doesn't build on nightly 1.28

Using rustc 1.28.0-nightly (a3085756e 2018-05-19), I get the following when I try to build dynasm:

error[E0063]: missing field `edition` in initializer of `syntax::ext::base::SyntaxExtension`
  --> /Users/tov/.cargo/registry/src/github.com-1ecc6299db9ec823/dynasm-0.2.0/src/lib.rs:44:35
   |
44 |                                   SyntaxExtension::NormalTT {
   |                                   ^^^^^^^^^^^^^^^^^^^^^^^^^ missing `edition`

This is the same for dynasm 0.1.4 and 0.2.0.

Language dialect reference for AArch64 has some stale x86/x86-64 references

Ref: https://censoredusername.github.io/dynasm-rs/language/langref_aarch64.html

Separately, I have a few suggestions about it:

It's worth calling out that the modifiers are only supported on some instructions. Notably, all the basic bitwise boolean instructions (and/orr/orn/eor/eon) only support the standard shifts (lsl/lsr/asr/ror) and only with immediate offsets.
It may be helpful to link to https://developer.arm.com/documentation/ddi0596/2021-12/?lang=en for some additional resources (and likewise https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html and/or https://developer.amd.com/resources/developer-guides-manuals/ for x86-64), just for extra visibility (especially for those who aren't as familiar with assembly programming).

RIP relative addressing.

Currently no way exists to use labels as immediates/memory locations in generated code. Label arguments are only supported on instructions taking code arguments. This can probably be solved by just generalizing labels as immediates, although significant modifications to the parser will be necessary to use them as memory locations (especially when used as part of an immediate expression).

no_std support

I'm building a JIT that runs on bare metal and therefore std is not available. Is it possible to make dynasm-rs compatible with the no_std mode (with alloc), as long as OS-dependent things like executable memory management are left to the user?

Impossible relocation on x86

Hi! I'm probably missing something, like I always do. However, I'm trying to assemble the following snippet, and I get an Errors were encountered when committing before finalization: ImpossibleRelocation(Global("hello")) error.

use dynasmrt::{dynasm, DynasmLabelApi};

let mut ops = dynasmrt::x86::Assembler::new().unwrap();

dynasm!(ops
    ; .arch x86
    ; lea eax, [->hello]
    ; ->hello:
    ; .bytes "hello".as_bytes()
);

println!("{:?}", ops.labels().resolve_global("hello")); // Ok(AssemblyOffset(6))

ops.finalize().unwrap();

I expected [->hello] to be resolved as 6... what am I doing wrong?

Handing out executable function pointers

Currently there's no clear way to get pointers to executable data after encoding has been done. A clear API is needed that facilitates encoding any offsets, committing the result to executable memory, possibly updating this memory, recording offsets into this memory to later create callable/jumpable pointers with (or alternatively, an API to query the location of labels with from outside dynasm blocks).

Full program analysis

Currently, any features which span across dynasm blocks have to be evaluated at runtime, as each macro invocation is treated separately. If analysis of the entire program is used, the following features become possible:

Type maps
More efficient label data structures (currently hashmaps of strings are used, while these could just be arrays indexed using unique IDs).

One problem with whole program analysis is that, when many different AssemblingBuffers are used, each buffer will allocate the memory necessary for all used identifiers.
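
The array-indexed label idea can be pictured with a small interning table. This is only a sketch: LabelTable and its methods are hypothetical, and here the name-to-ID mapping is built at runtime, whereas whole program analysis would assign the IDs at compile time so that only the array lookup remains.

```rust
use std::collections::HashMap;

/// Hypothetical label table: names are mapped to dense numeric IDs once,
/// and resolved offsets live in a plain vector indexed by ID.
#[derive(Default)]
struct LabelTable {
    ids: HashMap<String, usize>,
    offsets: Vec<Option<usize>>,
}

impl LabelTable {
    /// Return the ID for `name`, allocating a new slot on first use.
    fn intern(&mut self, name: &str) -> usize {
        if let Some(&id) = self.ids.get(name) {
            return id;
        }
        let id = self.offsets.len();
        self.ids.insert(name.to_owned(), id);
        self.offsets.push(None);
        id
    }

    /// Record the offset at which the label with this ID was defined.
    /// Panics if `id` was not returned by `intern`.
    fn define(&mut self, id: usize, offset: usize) {
        self.offsets[id] = Some(offset);
    }

    /// Look up a label by ID; None if it is unknown or not yet defined.
    fn resolve(&self, id: usize) -> Option<usize> {
        self.offsets.get(id).copied().flatten()
    }
}
```

The sketch also shows the drawback mentioned above: every table needs a slot for each interned identifier, whether or not that buffer ever defines it.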

omit known zero offset

I found that when writing code like ; mov QWORD [buffer + offset as i32], temp, where offset is an immediate value, this crate will unconditionally emit the form with a displacement. Explicitly checking whether offset is zero and emitting ; mov QWORD [buffer], temp in that case reduces my emitted code significantly (~10%).
Maybe this crate could check if immediate values are zero and choose to emit smaller instructions.
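
A sketch of the decision such a check would have to make is given below. The function name is invented for this example. Note that it presumes the displacement value is available when the instruction is encoded; for a value only known at runtime, the same choice would have to be made by generated runtime code.

```rust
/// Pick the ModRM `mod` field and the number of displacement bytes for a
/// `[base + disp]` operand. `base_reg` is the base register number (0..=15).
fn select_displacement(base_reg: u8, disp: i32) -> (u8, usize) {
    // rbp (5) and r13 (13) cannot use mod=00: with rm=101 that encoding
    // means "no base register" (RIP-relative in 64-bit mode) instead.
    let needs_disp = base_reg & 7 == 5;
    if disp == 0 && !needs_disp {
        (0b00, 0) // no displacement bytes at all
    } else if (-128..=127).contains(&disp) {
        (0b01, 1) // sign-extended 8-bit displacement
    } else {
        (0b10, 4) // full 32-bit displacement
    }
}
```

The rbp/r13 exception is the reason a zero offset cannot always be dropped: for those bases a one-byte zero displacement is the shortest valid form.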

Build failed with fresh rustc.

error[E0308]: mismatched types
--> /data/work/dynasm-rs/plugin/src/lib.rs:83:46
|
83 | let token_tree = ecx.expander().fold_tts(token_tree);
| ^^^^^^^^^^ expected struct syntax::tokenstream::TokenStream, found reference
|
= note: expected type syntax::tokenstream::TokenStream
found type &[syntax::tokenstream::TokenTree]

error[E0308]: mismatched types
--> /data/work/dynasm-rs/plugin/src/lib.rs:85:46
|
85 | let mut parser = ecx.new_parser_from_tts(&token_tree);
| ^^^^^^^^^^^ expected slice, found struct syntax::tokenstream::TokenStream
|
= note: expected type &[syntax::tokenstream::TokenTree]
found type &syntax::tokenstream::TokenStream

error: aborting due to 2 previous errors

error: Could not compile dynasm.

rustc --version
rustc 1.17.0-nightly (6eb9960d3 2017-03-19)

Make the Relocation API more strict

Each implementer of DynasmLabelApi defines a Relocation type which should be used by the plugin to inform the runtime what kind of relocation is needed. Unfortunately, it is currently not possible to construct the associated type properly in the plugin code generation stage, as it is only aware of the name that the assembler struct is bound to, and not the correct associated type. While future additions to Rust might make this possible (in a typeof- or decltype-like manner), currently this means that the Relocation type is limited to being a simple type like a tuple of primitives, and that invalid values in these will cause an error at runtime.

Tutorial no longer works — API is outdated

The first error is that the Assembler class has been moved to x64::Assembler.

Now, I get errors like

error: no method named `global_label` found for type `dynasmrt::x64::Assembler` in the current scope
error: no method named `offset` found for type `dynasmrt::x64::Assembler` in the current scope

when trying to run the example code.

mov r64, imm64 seems to be unsupported

In 64-bit mode, one would expect the immediate argument to mov r64, ... to be imm64, but it appears to be imm32. There is apparently no efficient way to set a register to a 64-bit value. AFAICT Rust's asm! macro supports this fine.

The movabs extension is supported with a 64-bit immediate operand but only if the destination is rax.
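
For reference, the x86-64 encoding being asked for is the B8+rd form with REX.W and an 8-byte immediate. The sketch below hand-assembles it; the function name is an assumption for this example and says nothing about how dynasm-rs exposes or names this instruction form.

```rust
/// Encode `mov r64, imm64` for general-purpose register number `reg` (0..=15).
fn encode_mov_r64_imm64(reg: u8, imm: u64) -> Option<Vec<u8>> {
    if reg > 15 {
        return None;
    }
    let mut out = Vec::with_capacity(10);
    out.push(0x48 | (reg >> 3)); // REX.W, plus REX.B for r8..r15
    out.push(0xB8 + (reg & 7)); // opcode B8+rd carries the low register bits
    out.extend_from_slice(&imm.to_le_bytes()); // 64-bit immediate, little-endian
    Some(out)
}
```

The result is always ten bytes long, and nothing in the encoding restricts the destination to rax.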

Can't encode repne scasb

When trying to use instruction repne scasb I get the following error:

Cannot use prefix repne on this instruction

A quick glance at the code shows that it might be due to a typo here (REP used instead of ¿REPE?), though I am not sure that I follow what this does exactly

No constructor for `VecAssembler`

The VecAssembler is part of the public interface, but there doesn't seem to be a way to construct or obtain one.

Incorrect instruction encoding on aarch64

Hi!
I'm using dynasmrt to generate code at runtime and make a remote process execute it.
I actually need the conversion from text to bytes.

I couldn't find a way to get this simple snippet to work.

dynasm!(assembler
    ; .arch aarch64

    // CREATE DIR
    ; mov x8, 0x22
    ; mov x0, 0
    ; ldr x1, ->dir_name
    ; mov x2, 0
    ; svc 0

    // EXIT
    ; mov x8, 0x5d
    ; mov x0, 0
    ; svc 0

    ; ->dir_name:
    ; .bytes "/data/local/tmp/TEST_DIR\0".as_bytes()
);

I already tried this on a x86-64 machine (using the x86-64 instruction set) and it worked without problems.

gdb view:

That ldr instruction seems a bit off, shouldn't it be ldr x1 ...?

strace view:

mkdirat(0, "\1", 000)                   = -1 ENOTDIR (Not a directory)
exit(0)                                 = ?

Generated bytes:

480480D2
000080D2
18000058 // ldr x24, #0x18 - shouldn't it be ldr x1, #0x18 [81000058]?
020080D2
010000D4
A80B80D2
000080D2
010000D4
2F646174
612F6C6F
63616C2F
746D702F
54455354
5F444952

For instance, the same snippet

mov x8, 0x22
mov x0, 0   
ldr x1, =dir_name
mov x2, 0
svc 0

mov x8, 0x5d
mov x0, 0
svc 0

dir_name: 
.ascii "/data/local/tmp/TEST_DIR\0"

compiled and linked with as and ld (bundled with the NDK toolchain), works.

Am I generating the code correctly?
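
One way to cross-check the generated bytes by hand is to pack and unpack the fields of the 64-bit LDR (literal) instruction. The sketch below assumes that each word in the listing above is a little-endian byte dump, so 18000058 is the instruction word 0x58000018. The function names are made up for this example.

```rust
/// Encode `ldr Xt, <label>` (64-bit LDR literal). `byte_offset` is the
/// distance from this instruction to the label and must be a multiple of 4.
fn encode_ldr_literal64(rt: u8, byte_offset: i32) -> Option<u32> {
    if rt > 31 || byte_offset % 4 != 0 {
        return None;
    }
    let words = byte_offset / 4;
    // imm19 is a signed word count, so the reach is +/- 1 MiB.
    if !(-(1 << 18)..(1 << 18)).contains(&words) {
        return None;
    }
    let imm19 = (words as u32) & 0x7FFFF;
    Some(0x5800_0000 | (imm19 << 5) | u32::from(rt))
}

/// Split a 64-bit LDR (literal) word into (register number, byte offset).
fn decode_ldr_literal64(word: u32) -> Option<(u8, i32)> {
    if word & 0xFF00_0000 != 0x5800_0000 {
        return None; // not a 64-bit LDR (literal)
    }
    let rt = (word & 0x1F) as u8;
    let imm19 = (word >> 5) & 0x7FFFF;
    // Sign-extend the 19-bit field, then convert words to bytes.
    let words = ((imm19 << 13) as i32) >> 13;
    Some((rt, words * 4))
}
```

Under this layout, 0x58000018 decodes to register 24 with a zero offset, while x1 with a 24-byte offset (the distance from the third instruction to the label in the listing) would be 0x580000C1. That pattern would be consistent with the offset landing in the register field, though this sketch cannot confirm the actual cause.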

Type generic registers

In the documentation macros are used to create generic code, however macros tend to become unwieldy as the code inside grows and needs more information. It would be nice if there was a way to be able to reference a register in a way that is generic at the type level, or at least some way to reference a register from a variable.

Simple Example

fn popcnt<Reg: Register>(asm: &mut Assembler) {
    dynasm!(asm
       ; popcnt Reg(5), [rsp]
    );
}

popcnt::<Rq>(&mut asm);
popcnt::<Rd>(&mut asm);
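
To make the request more concrete, here is a std-only sketch of how a register family could be carried at the type level. Everything in it (the RegFamily trait, the Rq and Rd marker types, the function) is hypothetical and hand-encodes the instruction instead of going through dynasm!; it is not a proposal for the actual dynasm-rs design.

```rust
/// Hypothetical trait describing a register family at the type level.
trait RegFamily {
    /// Whether instructions using this family need the REX.W bit.
    const REX_W: bool;
}

struct Rq; // 64-bit general-purpose registers
struct Rd; // 32-bit general-purpose registers

impl RegFamily for Rq {
    const REX_W: bool = true;
}
impl RegFamily for Rd {
    const REX_W: bool = false;
}

/// Encode `popcnt <reg>, [rsp]` for register number `reg` (0..=15) of family `R`.
fn popcnt_from_rsp<R: RegFamily>(reg: u8) -> Option<Vec<u8>> {
    if reg > 15 {
        return None;
    }
    let mut out = vec![0xF3]; // mandatory prefix, must come before REX
    let rex = 0x40 | (u8::from(R::REX_W) << 3) | ((reg >> 3) << 2);
    if rex != 0x40 {
        out.push(rex); // only emit REX when W or R is actually set
    }
    out.extend_from_slice(&[0x0F, 0xB8]); // popcnt opcode
    out.push(((reg & 7) << 3) | 0b100); // ModRM: mod=00, rm=100 means SIB follows
    out.push(0x24); // SIB: no index, base=rsp
    Some(out)
}
```

Calling popcnt_from_rsp::<Rq>(5) and popcnt_from_rsp::<Rd>(5) differs only in the REX.W prefix, which mirrors the popcnt::<Rq> and popcnt::<Rd> calls in the example above.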

Changelog mapping project's version to supported rust's version

I'm using Rust nightly 1.27 and dynasm does not compile in it :(

rustc 1.27.0-nightly (7360d6dd6 2018-04-15)

Compiling dynasm results in:

   Compiling dynasm v0.1.4
error[E0609]: no field `identifier` on type `&syntax::ast::PathSegment`
   --> /home/bruno/.cargo/registry/src/github.com-1ecc6299db9ec823/dynasm-0.1.4/src/arch/x64/parser.rs:550:31
    |
550 |     Some(Ident {node: segment.identifier, span: path.span})
    |                               ^^^^^^^^^^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0609`.
error: Could not compile `dynasm`.

To learn more, run the command again with --verbose.

It would be nice to have a map of supported rust versions.

PS: I know the cause of this error is probably the instability of rustc API

setb isn't implemented

Currently, dynasm-rs supports setbe, seta, setae, setl, setle, setg, setge, sete and setne but not setb. There's no reason for it not to, it just fails if you try to emit it. As far as I can tell setc and setnae are precisely equivalent, but it still means that intent is harder to express than it should be. Other synonyms are supported, there's no reason for this one not to be.

Doesn't compile on nightly.

I get a bunch of errors saying:

method check is private.
missing field local_inner_macros in initializer of syntax::ext::base::SyntaxExtension.
no field parameters on type &syntax::ast::PathSegment.

I'm guessing something changed in syntax recently, but I might just be doing something stupid.

Edit: The README says that only "rustc 1.28.0-nightly (a1d4a9503 2018-05-20)" is guaranteed to work, so I guess this is expected.

not compiling on nightly rustc 1.26.0-nightly (e5277c145 2018-03-28)

It does not compile on 1.26.0-nightly (e5277c145 2018-03-28).
I fixed it like this. If that is OK, I'll open a pull request.

~/d/plugin ❯❯❯ cargo build
    Updating registry `https://github.com/rust-lang/crates.io-index`
   Compiling stable_deref_trait v1.0.0                                          
   Compiling bitflags v0.9.1
   Compiling lazy_static v0.2.11
   Compiling owning_ref v0.3.3
   Compiling dynasm v0.1.3 (file:///home/edstefes/Projects/dynasm-rs/plugin)
error[E0063]: missing field `unstable_feature` in initializer of `syntax::ext::base::SyntaxExtension`
  --> src/lib.rs:44:35
   |
44 |                                   SyntaxExtension::NormalTT {
   |                                   ^^^^^^^^^^^^^^^^^^^^^^^^^ missing `unstable_feature`

error[E0023]: this pattern has 1 field, but the corresponding tuple variant has 2 fields
   --> src/arch/x64/parser.rs:380:9
    |
380 |         token::Ident(ast::Ident {ref name, ..}) if &*name.as_str() == kw => (),
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected 2 fields, found 1

error[E0023]: this pattern has 1 field, but the corresponding tuple variant has 2 fields
   --> src/arch/x64/parser.rs:388:12
    |
388 |     if let token::Ident(i) = parser.token {
    |            ^^^^^^^^^^^^^^^ expected 2 fields, found 1

error: aborting due to 3 previous errors

Some errors occurred: E0023, E0063.
For more information about an error, try `rustc --explain E0023`.
error: Could not compile `dynasm`.

To learn more, run the command again with --verbose.

Expanding macros within dynasm blocks

This is currently blocked by advancements that still need to be made in rustc itself (primarily the impossibility of expanding macros against token trees), which makes it impossible to expand macros before parsing. It should already be possible to expand macros inside expressions though, but this will only allow them to expand into single arguments instead of whole chunks of asm.

Partial `mprotect` on executable memory

I'm trying to alter a small part (~60 bytes) of code in a ~10MB Assembler. It is pretty slow on AArch64, taking around 200 microseconds.

By digging into the implementation a bit I found that the underlying make_exec method always calls mprotect on the entire executable memory. I haven't looked into the kernel mprotect implementation, but I assume it will invalidate instruction caches & TLBs for the entire region too since this is required to enforce memory protections.

Is it possible to track modifications dynamically inside alter, and make mprotect changes on-demand?
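
To make the request concrete, here is a minimal sketch of the bookkeeping involved: rounding a modified byte range out to page boundaries, so that only those pages would need their protection changed. The function name and the idea of passing the result to `mprotect` are assumptions for illustration; this is not how dynasmrt works today, and whether a smaller range is actually faster would need measuring.

```rust
/// Returns (start, len) of the smallest page-aligned range that covers
/// `offset..offset + len`. `page_size` must be a power of two.
fn page_span(offset: usize, len: usize, page_size: usize) -> (usize, usize) {
    assert!(page_size.is_power_of_two());
    let mask = !(page_size - 1);
    // Round the start down and the end up to the nearest page boundary.
    let start = offset & mask;
    let end = (offset + len + page_size - 1) & mask;
    (start, end - start)
}
```

With 4 KiB pages, a 60-byte patch at offset 5000 touches a single page, while the same patch at offset 4090 straddles two.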

Instruction cache not flushed after calling Assembler::alter on AArch64

It seems that Assembler::alter (which uses mprotect) does not flush the instruction cache after code has changed, and subsequent execution sometimes still uses the old code.

This issue can only be reproduced on a real AArch64 CPU (qemu-aarch64 on x86-64 machines works fine).

Adding explicit ic ivau instructions after calling .alter fixes this.

Consider using `MAP_JIT` on macOS

On macOS, the recommended way to implement a JIT system is by creating the memory map with PROT_WRITE | PROT_EXEC and the MAP_JIT flag, then using pthread_jit_write_protect_np to switch between writing and executing the buffer.

(This is kinda weird, because the W^X behavior is tracked on a per-thread basis, rather than per-region; I found it easiest to only enable W right before copying into the region, then disable it afterwards)

Anyways, it turns out that this is much faster than using mmap to swap regions from PROT_WRITE to PROT_EXEC!

Here's a flamegraph using mmap

(note the calls to mprotect and memmove taking up a good chunk of time)

Here's what it looks like with pthread_jit_write_protect_np

(those calls are gone, and pthread_jit_write_protect_np doesn't even show up)

I see one benchmark go from 112 ms down to 62 ms, almost a 50% improvement!

(My benchmarks are admittedly weird, in that they compile a lot of very small functions 😆)

This requires ditching / forking the memmap2 crate, which doesn't support this behavior. Here's how I did it.

Right now, it's easy for users to do this on their own: I'm using a VecAssembler then copying into this custom struct Mmap, which works fine. Still, this would be a decent optimization for the stock Assembler.

As always, dynasm-rs is great, and I really appreciate the work that went into it!

More tests are required

Currently there are no real tests apart from a general feature-checking file.

The instruction encoding data in particular really needs to be checked.

The memmap crate is no longer maintained.

I know you are aware of this @CensoredUsername, however cargo audit has just recently started complaining about it:

Crate:         memmap
Version:       0.7.0
Warning:       unmaintained
Title:         memmap is unmaintained
Date:          2020-12-02
ID:            RUSTSEC-2020-0077
URL:           https://rustsec.org/advisories/RUSTSEC-2020-0077
Dependency tree: 
memmap 0.7.0
├── ykview 0.1.0
├── yktrace 0.1.0
│   ├── ykview 0.1.0
│   ├── ykrt 0.1.0
│   └── ykcompile 0.1.0
│       └── ykrt 0.1.0
└── dynasmrt 1.0.0
    └── ykcompile 0.1.0

I think you said you had another library in mind. Is it time to switch to it perhaps?

Please release another version

We're at the Rust All Hands and want to compile HolyJIT. This is super urgent. (Not.)

Website link in the repo's "About" section is wrong

Currently it points to https://censoredusername.github.com/dynasm-rs/language/index.html, but it should point to https://censoredusername.github.io/dynasm-rs/language/index.html (note the TLD) like it does in the README.

x86 support

Currently only long mode is supported, although it should be possible to also support protected mode with modifications to the encoding infrastructure.

Panic when trying to use 0xff as immediate for a logical operation

Trying to use
dynasm!(self ; and w0, w0, 255);
leads to
thread '<unnamed>' panicked at 'attempt to shift left with overflow', /home/pi/.cargo/registry/src/github.com-1ecc6299db9ec823/dynasmrt-1.2.0/src/aarch64.rs:208:27

(the element_size is 32)

0xff can be encoded as a logical immediate. It should yield immr == 0 and imms == 7.
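
For reference, a brute-force sketch of the A64 logical-immediate encoding that reproduces those values. It is written for clarity rather than speed and is not the code in dynasmrt's aarch64.rs; the function name is made up for this comment. It returns the (N, immr, imms) fields, or None when the value is not encodable.

```rust
/// Tries to encode `value` as an A64 logical immediate for a register of
/// `reg_size` bits (32 or 64). Returns (N, immr, imms) on success.
fn encode_logical_imm(value: u64, reg_size: u32) -> Option<(u32, u32, u32)> {
    assert!(reg_size == 32 || reg_size == 64);
    if reg_size == 32 && value >> 32 != 0 {
        return None;
    }
    let mut esize = 2u32; // element sizes: 2, 4, 8, 16, 32, 64
    while esize <= reg_size {
        let emask = if esize == 64 { u64::MAX } else { (1u64 << esize) - 1 };
        for ones in 1..esize {
            let run = (1u64 << ones) - 1; // `ones` set bits at the bottom
            for rot in 0..esize {
                // Rotate the run right by `rot` within one element.
                let elem = if rot == 0 {
                    run
                } else {
                    ((run >> rot) | (run << (esize - rot))) & emask
                };
                // Replicate the element across the whole register.
                let mut pattern = 0u64;
                let mut pos = 0;
                while pos < reg_size {
                    pattern |= elem << pos;
                    pos += esize;
                }
                if pattern == value {
                    let n = if esize == 64 { 1 } else { 0 };
                    // High bits of imms mark the element size, low bits hold ones - 1.
                    let imms = (!(esize * 2 - 1) & 0x3f) | (ones - 1);
                    return Some((n, rot, imms));
                }
            }
        }
        esize *= 2;
    }
    None
}
```

For 0xff with a 32-bit register this finds a 32-bit element with eight set bits and no rotation, giving N = 0, immr = 0, imms = 7. All-zero and all-one values are rejected, as the architecture requires.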

Use smaller operand size PC-relative jumps where possible

I'm not deeply familiar with assembly, assemblers or compilers, but I believe that one optimisation that might be useful to do in dynasm (and that isn't really possible to do as a user of the library) is branch relaxation. Essentially, this means using the branch instructions that take a PC-relative offset of 8 bits or 16 bits when possible instead of always using pointer-sized absolute immediate jumps, with the reasoning that these are faster (and smaller?) for the majority-case of doing a same-function jump of only a few instructions. AFAICT this isn't done by dynasm right now, it just always does a static jump to an immediate of a fixed size, and then patches the value of the immediate only.

Of course, as I said I'm not too familiar with assemblers and so it's possible that these non-PC-relative jumps are always faster for non-PIE code and the 8-bit PC-relative jumps are only generated by LLVM etc. because they need to generate PIEs, whereas dynasm does not.
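
As a rough illustration of the choice involved, here is a sketch that picks between the 2-byte short `jmp` (opcode EB, 8-bit displacement) and the 5-byte near `jmp` (opcode E9, 32-bit displacement) on x86-64. Both forms are relative to the end of the jump instruction. The type and function names are invented for this example, and a real implementation would have to iterate, since shrinking one jump moves every label after it.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
enum JmpEncoding {
    Rel8(i8),   // EB cb, 2 bytes total
    Rel32(i32), // E9 cd, 5 bytes total
}

/// Picks the shortest `jmp` encoding for a jump located at address `from`
/// that should land on `target`. Returns None if even rel32 cannot reach.
fn pick_jmp(from: i64, target: i64) -> Option<JmpEncoding> {
    // Displacements are measured from the end of the instruction.
    let short = target - (from + 2);
    if let Ok(d) = i8::try_from(short) {
        return Some(JmpEncoding::Rel8(d));
    }
    let near = target - (from + 5);
    i32::try_from(near).ok().map(JmpEncoding::Rel32)
}
```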

Feature request: compile-time resolution of "super-local" label

I noticed that hashmap lookups are taking a decent amount of JIT time when using local labels.

For example, if I manually compute jumps in this code:

dynasm!(ops
    // Basically the same as MinRegReg
    ; zip2 v4.s2, V(lhs_reg).s2, V(rhs_reg).s2
    ; zip1 v5.s2, V(rhs_reg).s2, V(lhs_reg).s2
    ; fcmgt v5.s2, v5.s2, v4.s2
    ; fmov x15, d5

    ; tst x15, #0x1_0000_0000
    ; b.ne >lhs

    ; tst x15, #0x1
    ; b.eq >both

    // LHS < RHS
    ; fmov D(out_reg), D(rhs_reg)
    ; mov w16, #CHOICE_RIGHT
    ; b >end

    // RHS < LHS
    ;lhs:
    ; fmov D(out_reg), D(lhs_reg)
    ; mov w16, #CHOICE_LEFT
    ; b >end

    ;both:
    ; fmax V(out_reg).s2, V(lhs_reg).s2, V(rhs_reg).s2
    ; mov w16, #CHOICE_BOTH

    ;end:
    ; strb w16, [x0], #1 // post-increment
)

I end up with something like this:

dynasm!(ops
    ; zip2 v4.s2, V(lhs_reg).s2, V(rhs_reg).s2
    ; zip1 v5.s2, V(rhs_reg).s2, V(lhs_reg).s2
    ; fcmgt v5.s2, v5.s2, v4.s2
    ; fmov x15, d5

    ; tst x15, #0x1_0000_0000
    ; b.ne #24 // -> lhs

    ; tst x15, #0x1
    ; b.eq #28 // -> both

    // LHS < RHS
    ; fmov D(out_reg), D(rhs_reg)
    ; mov w16, #CHOICE_RIGHT
    ; b #24 // -> end

    // <- lhs (when RHS < LHS)
    ; fmov D(out_reg), D(lhs_reg)
    ; mov w16, #CHOICE_LEFT
    ; b #12 // -> end

    // <- both
    ; fmax V(out_reg).s2, V(lhs_reg).s2, V(rhs_reg).s2
    ; mov w16, #CHOICE_BOTH

    // <- end
    ; strb w16, [x0], #1 // post-increment
)

In my codebase, this reduces the time spent in dynasm by about 30%, which is a decent chunk of performance!

It would be great to introduce a new flavor of label which is only valid during a single dynasm! block; the branch offset could then be computed at compile-time instead of runtime.
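
A minimal model of what such a pass could do, assuming every instruction is 4 bytes (true for AArch64) and label names are unique within the block. The `Item` type and both functions are hypothetical; in the real macro this work would happen at expansion time, so the map used here would never exist at runtime.

```rust
use std::collections::HashMap;

enum Item {
    Instr,                // any non-branching instruction
    Branch(&'static str), // a branch to a block-local label
    Label(&'static str),  // a block-local label definition
}

/// Returns the byte offset of each branch, in order of appearance.
fn resolve(items: &[Item]) -> Vec<i64> {
    // Pass 1: record the byte position of every label.
    let mut labels = HashMap::new();
    let mut pc = 0i64;
    for item in items {
        match item {
            Item::Label(name) => {
                labels.insert(*name, pc);
            }
            _ => pc += 4,
        }
    }
    // Pass 2: turn each branch into a PC-relative offset.
    let mut offsets = Vec::new();
    let mut pc = 0i64;
    for item in items {
        match item {
            Item::Branch(name) => {
                offsets.push(labels[name] - pc);
                pc += 4;
            }
            Item::Instr => pc += 4,
            Item::Label(_) => {}
        }
    }
    offsets
}

/// The block from this issue, reduced to its instruction/label shape.
fn example_block() -> Vec<Item> {
    use Item::*;
    vec![
        Instr, Instr, Instr, Instr,         // zip2, zip1, fcmgt, fmov
        Instr, Branch("lhs"),               // tst, b.ne
        Instr, Branch("both"),              // tst, b.eq
        Instr, Instr, Branch("end"),        // fmov, mov, b
        Label("lhs"), Instr, Instr, Branch("end"),
        Label("both"), Instr, Instr,
        Label("end"), Instr,                // strb
    ]
}
```

Running `resolve` on `example_block()` gives 24, 28, 24 and 12, matching the hand-computed offsets above.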

Cannot use RIP-relative addressing when an immediate is present.

This is currently due to a runtime limitation: the runtime assumes that a relocation always sits at the end of an instruction, which does not hold for RIP-relative addressing followed by an immediate.
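
A small sketch of the arithmetic that would be needed, assuming the relocation record also carried the number of immediate bytes that follow the displacement. RIP-relative displacements are measured from the start of the next instruction, so trailing bytes shift the base. The function and parameter names are illustrative only.

```rust
use std::convert::TryFrom;

/// Computes the disp32 value for a RIP-relative operand.
/// `disp_field_at`: offset of the 4-byte displacement field in the buffer.
/// `trailing_imm_len`: bytes of immediate that follow the displacement.
/// `target`: offset of the referenced location in the same buffer.
fn rip_disp(disp_field_at: i64, trailing_imm_len: i64, target: i64) -> Option<i32> {
    // RIP points at the next instruction: past the displacement and any immediate.
    let next_instr = disp_field_at + 4 + trailing_imm_len;
    i32::try_from(target - next_instr).ok()
}
```

With no immediate, the result matches the current end-of-instruction assumption; with a 4-byte immediate, the displacement is 4 smaller.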