quilt / etk
evm toolkit
License: Apache License 2.0
$ cat main.etk
push2 0x00
$ eas main.etk
thread 'main' panicked at 'source slice length (1) does not match destination slice length (2)', /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/slice/mod.rs:3058:13
This should be nicer.
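The panic above comes from copying a 1-byte immediate into a 2-byte slot. As a minimal sketch of what a friendlier failure could look like, the snippet below checks the length before encoding and returns an error value instead. The names `PushError` and `encode_push` are hypothetical and are not part of etk's API.

```rust
use std::fmt;

/// Hypothetical error type for this sketch; etk's real error types may differ.
#[derive(Debug, PartialEq)]
enum PushError {
    /// The requested push size is outside 1..=32.
    InvalidSize(usize),
    /// The immediate does not have the length the opcode requires.
    ImmediateLength { expected: usize, found: usize },
}

impl fmt::Display for PushError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PushError::InvalidSize(n) => write!(f, "push{} is not a valid opcode", n),
            PushError::ImmediateLength { expected, found } => write!(
                f,
                "push{} expects a {}-byte immediate, but {} byte(s) were given",
                expected, expected, found
            ),
        }
    }
}

/// Encode `pushN` followed by its immediate, validating the length up front
/// rather than letting a slice copy panic on a mismatch.
fn encode_push(n: usize, immediate: &[u8]) -> Result<Vec<u8>, PushError> {
    if !(1..=32).contains(&n) {
        return Err(PushError::InvalidSize(n));
    }
    if immediate.len() != n {
        return Err(PushError::ImmediateLength { expected: n, found: immediate.len() });
    }
    let mut out = Vec::with_capacity(n + 1);
    out.push(0x5f + n as u8); // push1 is 0x60, push32 is 0x7f
    out.extend_from_slice(immediate);
    Ok(out)
}
```

With this shape, `push2 0x00` would surface as a readable message rather than a slice-length panic. Whether the assembler should reject the input or left-pad it is a separate design question.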
Leaving myself a reminder to maybe restructure this into a tree the same way Expr works.
Originally posted by @SamWilsn in #69 (comment)
I'll probably just start a list of things I run into that could have custom errors:
Add a couple constant functions to the assembly to make it easier to write. For example (signatures to be bike-shedded):
selector(...)
include(...)
selector(...)
selector(...) is replaced by the function selector of the given Solidity function signature.
push4 selector("updatePool(address)") ; actually pushes 0x7b46c54f
eq
push :label
jumpi
include(...)
include(...) is replaced by the opcodes (and constants, etc.) from the given path. Only valid outside of an opcode.
main.evm
push :hello
include("other.evm")
other.evm
jumpdest :hello
stop
push :hello
jumpdest :hello
stop
There should be a way to notify users that they've made this error:
push4 .label
pop
jumpdest .labl
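One possible way to flag this kind of typo is to compare each undefined label against the defined ones and suggest the nearest match. The sketch below uses plain edit distance; the function names and the two-edit threshold are assumptions made for illustration, not anything etk implements.

```rust
/// Classic dynamic-programming edit distance between two strings.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // `prev[j]` is the distance between the first `i - 1` chars of `a`
    // and the first `j` chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        let mut cur = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let substitution = prev[j - 1] + usize::from(a[i - 1] != b[j - 1]);
            cur[j] = substitution.min(prev[j] + 1).min(cur[j - 1] + 1);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// For an undefined label, suggest the closest defined label if one is
/// within two edits. The threshold is an arbitrary choice for this sketch.
fn suggest_label<'a>(undefined: &str, defined: &[&'a str]) -> Option<&'a str> {
    defined
        .iter()
        .map(|d| (edit_distance(undefined, *d), *d))
        .filter(|(dist, _)| *dist <= 2)
        .min_by_key(|(dist, _)| *dist)
        .map(|(_, d)| d)
}
```

For the example above, `suggest_label("label", &["labl"])` yields `Some("labl")`, which could back a "did you mean" hint in the error message.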
I would like to be able to test extension opcodes, that are otherwise invalid, using the assembler.
Unfortunately the parser does not seem to provide any escape mechanism for directly embedding opcodes in the program.
For example, say I am defining semantics for the 0xb0 opcode in my EVM implementation. I would like to be able to assemble a program that just says 0xb0 or ins(0xb0), but the parser rejects it.
Sometimes it would be nice to compile things in based on a flag. For example, maybe you want overflow detection to be optional. Here are a couple of potential approaches:
%import_if(foo, "safemath.etk")
%import_if(!foo, "unsafemath.etk")
or
#if safe_overflow
...
#endif
I think I prefer the latter because I expect inline usage to be common for a couple quick ops.
Seems that we should come up with a template for macros.
At 0x44, DIFFICULTY was replaced with PREVRANDAO in EIP-4399. Though for etk I'd probably leave difficulty and add prevrandao as an alias to it.
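A small sketch of the alias idea: both mnemonics resolve to the same byte, so existing sources that say difficulty keep assembling. The `opcode_for` function is hypothetical and lists only a handful of block-information opcodes to stay short.

```rust
/// Map a mnemonic to its opcode byte. Only a few opcodes are listed here;
/// this is an illustrative lookup, not etk's real opcode table.
fn opcode_for(mnemonic: &str) -> Option<u8> {
    match mnemonic.to_ascii_lowercase().as_str() {
        "coinbase" => Some(0x41),
        "timestamp" => Some(0x42),
        "number" => Some(0x43),
        // Old and new names assemble to the same byte.
        "difficulty" | "prevrandao" => Some(0x44),
        "gaslimit" => Some(0x45),
        _ => None,
    }
}
```

The disassembler would still have to pick one of the two names when printing 0x44, which the alias alone does not decide.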
Hi @SamWilsn and @lightclient .
This is not really an issue but it is the easiest way I found to be in contact with both of you at the same time.
I took the liberty to start a project to have a small ETK web site (including the playground that Sam asked for).
Many links are broken, or point to temporary places (install for example leads directly to github and should be replaced by a real guide) and even the design is quite "simple" (you can notice it in the playground that is just two painted textarea haha), but I thought it's good that you are aware of it and to have a place where we can discuss ideas.
By the way, I also made a logo for ETK :).
The temporary link to the site is this: https://etk-web.vercel.app/
I hope you like it and do not hesitate to ask for all the features you consider necessary.
P.S.: If you have more resources (like https://github.com/quilt/vim-etk) or examples that you have written using ETK and that can be added to the resources section, feel free to mention them here and I will add them as soon as I write that section.
In #69 we added a random postfix to label names to create a scope. It would be nice to have better guarantees about the scope and not rely on the small possibility of label collisions.
A few ideas we had:
Ingest to support expanding macros. This is not trivial because the Independent scope is truly independent and needs no external data to fully assemble its ops. This is not the same assumption that instruction macros make, because they can use labels from the outer scope and when concretized they need to have an understanding of where they've been invoked in terms of concrete ops.
Scope enum in the assembler module and expand the Label type to be a name and scope. This seems messy and it's annoying to have another Scope type.
Would be nice to have a flag for the disassembler that cross references PUSH4 immediates against 4byte.directory.
Currently eas expects a file path as input. Is there any reason why we cannot pass the assembly directly? This would be helpful when writing small examples:
$ eas --asm "caller push1 00 sstore stop"
3360005500
Adding release binaries and also nightly releases using GitHub Actions would be great.
These binaries could then be used to create a GitHub action that builds repos containing ETK code.
I'd like to be able to write more nasm-like code, something like this:
.start
push1 1
pop
caller
jump .start
this would expand to:
jumpdest
push1 1
pop
caller
push1 0
jump
%hex("...")
An instruction macro that inserts the argument literally into the assembled output:
jumpdest
%hex("7FAB")
invalid
Would become:
5b7fabfe
We already have %include_hex, but this would be useful for smaller snippets.
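The expansion of the proposed %hex macro amounts to decoding the argument and splicing the bytes into the output. A minimal sketch, with hypothetical helper names `decode_hex` and `encode_hex`:

```rust
/// Decode a hex string such as "7FAB" into raw bytes.
fn decode_hex(s: &str) -> Result<Vec<u8>, String> {
    if !s.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(format!("`{}` contains a non-hex character", s));
    }
    if s.len() % 2 != 0 {
        return Err(format!("`{}` has an odd number of hex digits", s));
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).map_err(|e| e.to_string()))
        .collect()
}

/// Render bytes as lowercase hex, the way the assembled output is shown.
fn encode_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

Placing the decoded bytes between jumpdest (0x5b) and invalid (0xfe) reproduces the output shown above.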
%string("...")
An instruction macro that inserts the argument literally into the assembled output, encoded as UTF-8:
jumpdest
%string("hello world")
invalid
Would become:
5b68656c6c6f20776f726c64fe
This macro is useful for revert reasons.
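Since Rust strings are already UTF-8, the expansion of the proposed %string macro is just the string's bytes. The sketch below wraps them between jumpdest and invalid to mirror the example; `assemble_with_string` is a hypothetical helper, not an etk function.

```rust
/// Assemble `jumpdest`, the UTF-8 bytes of `text`, then `invalid`,
/// and return the result as lowercase hex.
fn assemble_with_string(text: &str) -> String {
    let mut out = vec![0x5bu8]; // jumpdest
    out.extend_from_slice(text.as_bytes()); // what %string("...") would emit
    out.push(0xfe); // invalid
    out.iter().map(|b| format!("{:02x}", b)).collect()
}
```

Non-ASCII text takes more than one byte per character, so the emitted length is `text.len()` in bytes, not the character count.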
The most simplified form of a complex expression currently looks like this:
// ((1+2)*3)-(4/2) = 7
let expr = Expression::Minus(
    Expression::Times(
        Expression::Plus(
            Terminal::Number(1.into()).into(),
            Terminal::Number(2.into()).into(),
        ).into(),
        Terminal::Number(3.into()).into(),
    ).into(),
    Expression::Divide(
        Terminal::Number(4.into()).into(),
        Terminal::Number(2.into()).into(),
    ).into(),
);
It would be great to write a macro so you can write it like this:
// ((1+2)*3)-(4/2) = 7
let expr = expr! {
    Minus(
        Times(
            Plus(Number(1), Number(2)),
            Number(3)
        ),
        Divide(Number(4), Number(2))
    )
};
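A rough sketch of how such a macro could be written with macro_rules!. The `Expression` enum here is a simplified stand-in (an `i64` leaf and boxed children) rather than etk's real `Expression` and `Terminal` types, and `eval` exists only to check the result.

```rust
/// Simplified stand-in for etk's expression tree; an assumption for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Expression {
    Number(i64),
    Plus(Box<Expression>, Box<Expression>),
    Minus(Box<Expression>, Box<Expression>),
    Times(Box<Expression>, Box<Expression>),
    Divide(Box<Expression>, Box<Expression>),
}

impl Expression {
    /// Evaluate the tree; division by zero panics, which is fine for a demo.
    fn eval(&self) -> i64 {
        match self {
            Expression::Number(n) => *n,
            Expression::Plus(a, b) => a.eval() + b.eval(),
            Expression::Minus(a, b) => a.eval() - b.eval(),
            Expression::Times(a, b) => a.eval() * b.eval(),
            Expression::Divide(a, b) => a.eval() / b.eval(),
        }
    }
}

macro_rules! expr {
    // Leaf: `Number(<value>)`.
    (Number($n:expr)) => {
        Expression::Number($n)
    };
    // Binary node: `Op(<lhs>, <rhs>)`, where each side is itself `Name(...)`.
    ($op:ident($l:ident $largs:tt, $r:ident $rargs:tt $(,)?)) => {
        Expression::$op(
            Box::new(expr!($l $largs)),
            Box::new(expr!($r $rargs)),
        )
    };
}
```

Each operand is matched as an identifier followed by one parenthesized token tree, which lets the macro recurse without having to split arguments on commas itself. Adapting it to the real types would mean replacing the `Box::new` and `Number` arms with the `.into()` conversions shown earlier.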
Reproduce with:
disease --code 0x6142
Expected output:
0: push2 0x4200
Actual output: nothing
Tested with etk-dasm 0.2.1
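One way to get the expected output is to zero-pad a push whose immediate runs past the end of the code, instead of dropping it. The sketch below is a standalone illustration, not the etk-dasm implementation: it only names push opcodes, prints every other byte raw, and assumes offsets are printed in hex.

```rust
/// Disassemble `code`, zero-padding any push immediate that is cut off
/// by the end of the input.
fn disassemble(code: &[u8]) -> Vec<String> {
    let mut lines = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        let op = code[pc];
        if (0x60..=0x7f).contains(&op) {
            let n = (op - 0x5f) as usize; // push1..push32
            // Copy the bytes that are present; the rest stay zero.
            let mut imm = vec![0u8; n];
            let available = &code[pc + 1..code.len().min(pc + 1 + n)];
            imm[..available.len()].copy_from_slice(available);
            let hex: String = imm.iter().map(|b| format!("{:02x}", b)).collect();
            lines.push(format!("{:x}: push{} 0x{}", pc, n, hex));
            pc += 1 + n;
        } else {
            // Other opcodes are printed as raw bytes to keep the sketch short.
            lines.push(format!("{:x}: 0x{:02x}", pc, op));
            pc += 1;
        }
    }
    lines
}
```

A real tool might also want to warn that the immediate was truncated, since the padded form does not round-trip to the original bytes.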
I think these two test cases (in etk-asm/src/asm.rs) should pass:
#[test]
fn assemble_instruction_macro_two_delayed_definitions_mirrored() -> Result<(), Error> {
    let ops = vec![
        AbstractOp::new(GetPc),
        AbstractOp::Macro(InstructionMacroInvocation {
            name: "macro1".into(),
            parameters: vec![],
        }),
        AbstractOp::Macro(InstructionMacroInvocation {
            name: "macro0".into(),
            parameters: vec![],
        }),
        InstructionMacroDefinition {
            name: "macro0".into(),
            parameters: vec![],
            contents: vec![
                AbstractOp::new(JumpDest),
            ],
        }
        .into(),
        InstructionMacroDefinition {
            name: "macro1".into(),
            parameters: vec![],
            contents: vec![
                AbstractOp::new(Caller),
            ],
        }
        .into(),
    ];
    let mut asm = Assembler::new();
    let sz = asm.push_all(ops)?;
    assert_eq!(sz, 3);
    let out = asm.take();
    assert_eq!(out, hex!("58335b"));
    Ok(())
}
#[test]
fn assemble_instruction_macro_two_delayed_definitions() -> Result<(), Error> {
    let ops = vec![
        AbstractOp::new(GetPc),
        AbstractOp::Macro(InstructionMacroInvocation {
            name: "macro0".into(),
            parameters: vec![],
        }),
        AbstractOp::Macro(InstructionMacroInvocation {
            name: "macro1".into(),
            parameters: vec![],
        }),
        InstructionMacroDefinition {
            name: "macro0".into(),
            parameters: vec![],
            contents: vec![
                AbstractOp::new(JumpDest),
            ],
        }
        .into(),
        InstructionMacroDefinition {
            name: "macro1".into(),
            parameters: vec![],
            contents: vec![
                AbstractOp::new(Caller),
            ],
        }
        .into(),
    ];
    let mut asm = Assembler::new();
    let sz = asm.push_all(ops)?;
    assert_eq!(sz, 3);
    let out = asm.take();
    assert_eq!(out, hex!("585b33"));
    Ok(())
}
Currently, if parsing fails in a child %import, the error only gives the line number, not the file in which it occurred.
foo.etk
%import("bar.etk")
bar.etk
push1 0x42
asdf
$ eas foo.etk
Error: parsing failed
Caused by: lexing failed
Caused by: --> 2:1
|
2 | asdf␊
| ^---
|
= expected EOI, op, push, local_macro, builtin, or label_definition
I think etk will benefit from the ability to write simple one-liners, particularly in retesteth. They would look something like this:
push1 42; push1 13; sstore;
Since ; is currently the comment marker, we'll begin using # for comments.
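To illustrate the proposed syntax, here is a minimal sketch of a pre-pass that strips # comments and splits the remainder on ; before handing each statement to the parser. `split_statements` is a hypothetical helper, and it deliberately ignores the case of # or ; appearing inside a string literal.

```rust
/// Split source text into individual statements: `#` starts a comment that
/// runs to the end of the line, and `;` separates statements on one line.
fn split_statements(source: &str) -> Vec<String> {
    source
        .lines()
        // Drop everything after a `#` comment marker.
        .map(|line| line.split('#').next().unwrap_or(""))
        .flat_map(|line| line.split(';'))
        .map(str::trim)
        .filter(|stmt| !stmt.is_empty())
        .map(String::from)
        .collect()
}
```

Empty statements are dropped, so a trailing ; as in the one-liner above is harmless.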
It'd be nice to make it easier to insert constants into code while constructing a contract.
These are pretty much off the top of my head, so bear with me.
I think I'm partial to Solution 1 because it seems the most general / least opinionated, but Solution 2 probably has fewer footguns.
We modify %include to expose the labels of the included file.
main.etk
foo:
push32 0x0000000000000000000000000000000000000000000000000000000000000000
bar:
push32 0x0000000000000000000000000000000000000000000000000000000000000000
ctor.etk
# Copy the runtime code.
%push(end-start)
dup1
%push(start)
%push(0)
codecopy
# Set the constants.
caller
%push(start.foo - start + 1)
mstore
caller
%push(start.bar - start + 1)
mstore
# Return the adjusted code.
%push(0)
return
start:
%include("main.etk")
end:
%const(...)
We add a new built-in instruction macro %const(n). It errors if compiled directly, but expands to a push32 with some sentinel value (0xdeadc0de...) when included.
Then we modify %include to take extra parameters. Each parameter is the name of a label pointing to where the constant should be written.
Then your initcode can reference those labels to write the constants:
main.etk
%const(0)
%const(1)
ctor.etk
# Copy the runtime code.
%push(end-start)
dup1
%push(start)
%push(0)
codecopy
# Set the constants.
caller
%push(foo - start)
mstore
caller
%push(bar - start)
mstore
# Return the adjusted code.
%push(0)
return
start:
%include("main.etk", foo, bar)
end:
Make CI fail if cargo check or cargo clippy produces warnings.
Don't just add #![deny(warnings)] to the crates, because: https://github.com/rust-unofficial/patterns/blob/master/anti_patterns/deny-warnings.md
ecfg currently only considers each basic block in isolation when determining possible jump targets. It can handle constants, arithmetic, and cases where z3 can prove only certain addresses are possible (e.g. pop() * 0 will always be zero). In all other cases, ecfg is forced to assume that all jump targets are reachable.
For small handwritten programs, this naive approach is sufficient, but even the simplest Solidity contract turns into an unreadable mess of paths.
The next step for ecfg is to consider the program flow as a whole, and trace execution paths starting from offset zero, building a more complete picture. In other words, to further improve the control flow graph, the inputs to each block have to come from the preceding blocks.
I think we could support something like:
root:
push1 .nested
jump
.nested:
jumpdest
other:
.nested:
push1 root.nested
This example would define the following labels:
root at 0
root.nested at 3
other at 4
other.nested at 4
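The naming rule in this example can be sketched as follows: a label starting with a dot is qualified with the most recent top-level label. `qualify_labels` is a hypothetical helper that takes label definitions with already-known offsets; it is not how etk tracks labels internally.

```rust
use std::collections::BTreeMap;

/// Build a table of fully qualified label names. Each input item is the
/// label as written in the source and its byte offset, in source order.
fn qualify_labels(definitions: &[(&str, usize)]) -> Result<BTreeMap<String, usize>, String> {
    let mut parent: Option<&str> = None;
    let mut table = BTreeMap::new();
    for &(name, offset) in definitions {
        let full = match name.strip_prefix('.') {
            // A local label is scoped under the last top-level label.
            Some(local) => match parent {
                Some(p) => format!("{}.{}", p, local),
                None => return Err(format!("local label `{}` has no parent", name)),
            },
            // A top-level label opens a new scope.
            None => {
                parent = Some(name);
                name.to_string()
            }
        };
        if table.insert(full.clone(), offset).is_some() {
            return Err(format!("label `{}` defined twice", full));
        }
    }
    Ok(table)
}
```

Feeding it the four definitions from the example yields exactly the four qualified names listed above, and the two `.nested` labels no longer collide.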
root.nested.super_nested
..nested or super.nested or something else entirely)
https://www.tortall.net/projects/yasm/manual/html/nasm-local-label.html
include becomes import
include_asm becomes include
The output from disease is not immediately importable into eas, which is pretty annoying. I also have a weird function selector jump table implementation that requires precisely positioning instructions. I propose the following syntax to exactly specify an instruction's offset:
invalid
<a> jumpdest
pc
# Leading zeros are ignored.
<0d> gas
# Leading and trailing whitespace inside the brackets is ignored.
< f > caller
Gaps between instructions are filled with zeros, and instructions must still be written in increasing order. Labels refer to the next written instruction, and not the fill bytes. A label after a gap refers to the location after the last fill byte. A label before a gap without a subsequent instruction is an error.
The example above would assemble to:
fe0000000000000000005b58005a000033
Would be nice to be able to push a string, for revert messages and such.
Would be pretty sweet to be able to write code that looks like this:
push4 (0x88 + 1) * 0o5
Would be even sweeter if we can write code like:
push :label - 33
Two new builtin macros:
%push(addr) -> replaced by the smallest push opcode that can represent addr.
%jump(addr) -> replaced by a push (as %push(addr)) and a jump opcode.
These macros are somewhat difficult because the size of a label depends on the instructions preceding it.
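For a value that is already known, picking the smallest push is straightforward. The sketch below shows only that part: `smallest_push` is a hypothetical helper, it is limited to `u128` rather than a full 256-bit word, and it encodes zero as push1 0x00. It does not address the harder problem of labels whose value depends on the sizes of earlier pushes.

```rust
/// Encode `value` using the smallest push opcode that can hold it.
fn smallest_push(value: u128) -> Vec<u8> {
    let bytes = value.to_be_bytes();
    // Skip leading zero bytes, but always keep at least one byte so that
    // zero is still encoded with a one-byte immediate.
    let first = bytes
        .iter()
        .position(|b| *b != 0)
        .unwrap_or(bytes.len() - 1);
    let immediate = &bytes[first..];
    let mut out = vec![0x5f + immediate.len() as u8]; // push1 is 0x60
    out.extend_from_slice(immediate);
    out
}
```

With labels, the assembler would likely need to iterate: guess a size for each push, lay out the code, and grow any push whose label no longer fits until the layout stops changing.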
$ ls
contract.etk
$ eas contract.etk out.hex
Error: Io { source: Os { code: 2, kind: NotFound, message: "No such file or directory" }, backtrace: Backtrace(()), path: Some("") }
$ eas ./contract.etk out.hex
$ ls
contract.etk out.hex
I don't think I should need to prepend the target file with ./.
ecfg requires z3 to build, so that should be mentioned in the README and Book.
%macro bad()
%push(0x01)
%end
%bad()
$ eas test.etk
thread 'main' panicked at 'internal error: entered unreachable code', etk-asm/src/parse/mod.rs:53:14
stack backtrace:
0: rust_begin_unwind
at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/std/src/panicking.rs:498:5
1: core::panicking::panic_fmt
at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/panicking.rs:107:14
2: core::panicking::panic
at /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/panicking.rs:48:5
3: etk_asm::parse::parse_abstract_op
4: etk_asm::parse::macros::parse
5: etk_asm::parse::parse_abstract_op
6: etk_asm::parse::parse_asm
7: etk_asm::ingest::Ingest<W>::ingest_file
8: eas::main
I am trying to compose a contract as follows:
%push(body_end - body_begin)
dup1
%push(body_begin)
push1 0x00
codecopy
push1 0x00
return
body_begin:
%include("A.eas")
%include("B.eas")
...
body_end:
Unfortunately the assembly is broken, as eas generates illegal jumps in the body.
If I lump all the includes together and do a single file, everything works fine.
So this is a bug in the assembler.
I like how we've been using the rust-style github labels for the spec work - should we also use them here? Is there a more correct term for that style of organization?
It would be ideal to support different data formats for push immediates:
12345
0b10101
0o77771
0x555
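A minimal sketch of parsing these four notations by prefix. `parse_immediate` is a hypothetical helper, and it returns a `u128` for brevity even though push immediates can be up to 256 bits wide.

```rust
use std::num::ParseIntError;

/// Parse a decimal, binary (`0b`), octal (`0o`), or hexadecimal (`0x`) literal.
fn parse_immediate(text: &str) -> Result<u128, ParseIntError> {
    let (digits, radix) = if let Some(rest) = text.strip_prefix("0x") {
        (rest, 16)
    } else if let Some(rest) = text.strip_prefix("0o") {
        (rest, 8)
    } else if let Some(rest) = text.strip_prefix("0b") {
        (rest, 2)
    } else {
        (text, 10)
    };
    u128::from_str_radix(digits, radix)
}
```

The notation also affects how wide the immediate looks: 0x0555 and 0x555 parse to the same number, so an assembler that sizes pushes from the literal would need to keep the digit count as well.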
Pretty self explanatory: have some sort of syntax to declare a constant value, and then be able to refer to that value elsewhere in the source.
Min example:
%macro revert_if_neq()
push1 revert # [revert, a != b]
%end
%revert_if_neq()
%revert()
Note: revert is used as macro and as label but is not defined.
Panic:
thread 'main' panicked at 'assertion failed: `(left == right)`
left: `Some(2)`,
right: `Some(0)`', /home/jochem/.cargo/registry/src/github.com-1ecc6299db9ec823/etk-asm-0.2.1/src/asm.rs:691:17
stack backtrace:
0: rust_begin_unwind
at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/std/src/panicking.rs:575:5
1: core::panicking::panic_fmt
at /rustc/90743e7298aca107ddaa0c202a4d3604e29bfeb6/library/core/src/panicking.rs:65:14
2: core::panicking::assert_failed_inner
3: core::panicking::assert_failed
4: etk_asm::asm::Assembler::expand_macro
5: etk_asm::asm::Assembler::push
6: etk_asm::ingest::SourceStack<W>::write
7: etk_asm::ingest::Ingest<W>::ingest_file
8: eas::main
A macro, name to be bikeshedded, that inserts swap, dup, and pop to rearrange the stack:
%stack(
"a b c",
"b b c a"
)
With London out, we should add support.
I noticed that ecfg fails on most of the nontrivial contracts that I tried.
Here is an example contract. AFAIR it is just a trivial "get uint from slot - put uint to slot" contract compiled with Solidity (optimizer enabled).
0x6080604052348015600f57600080fd5b506004361060325760003560e01c8063b2010978146037578063cfae3217146049575b600080fd5b60476042366004605e565b600055565b005b60005460405190815260200160405180910390f35b600060208284031215606e578081fd5b503591905056fea2646970667358221220158feba571c05db2dfbfcf6d4bfd06d8ff6d697ef52c8e1fbba805a33a17720764736f6c63430008040033
$ RUST_BACKTRACE=1 ./target/release/ecfg -x code.txt
thread 'main' panicked at 'assertion failed: `(left == right)`
left: `4`,
right: `0`', etk-dasm/src/blocks/annotated.rs:172:13
stack backtrace:
0: rust_begin_unwind
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/std/src/panicking.rs:584:5
1: core::panicking::panic_fmt
at /rustc/7737e0b5c4103216d6fd8cf941b7ab9bdbaace7c/library/core/src/panicking.rs:143:14
2: core::panicking::assert_failed_inner
3: core::panicking::assert_failed
4: etk_dasm::blocks::annotated::AnnotatedBlock::annotate
5: etk_analyze::cfg::ControlFlowGraph::new
6: ecfg::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Maybe this could be useful for anyone debugging this. Here is a "naive cfg" of this contract. Each block is symbolically executed on its own: if the jump location is static, the nodes are connected on the graph; if the jump location comes from the stack, I draw (borrowed n), which means that the block ends with a jump to the stack[-n] value at the beginning of the block, where stack[-1] is the top.
I wonder if the panic occurs precisely because of this kind of cfg node that jumps to non-static locations, namely nodes "6e" and "42".
Trying disease --hex-file my_hex_file.hex where the file contains 0x6000 takes a while on my machine.
I have isolated it to: https://github.com/quilt/etk/blob/master/etk-4byte/src/lib.rs#L44-L52 .
It would be really helpful to have a macro that builds the opcode matching structure since there is a lot of duplicated code.
Similar to the solidity feature:
uint256 foo = uint256(-1);
With the eventual goal of writing a Remix plugin, we should look into how much effort it would be to support WASM, at least in etk-asm.