
tenderjit's Introduction

TenderJIT

TenderJIT is an experimental JIT compiler for Ruby written in Ruby. Its design is mostly based off YJIT.

Getting Started with TenderJIT

TenderJIT isn't available as a gem (yet). To start using it, clone the repository and run the following commands:

$ bundle install
$ bundle exec rake test

If the tests pass, then you're ready to go!

TenderJIT currently requires Ruby 3.0.2 or the edge version of Ruby. It may work on 3.0.X, but I haven't tested older versions.

Running JIT code

Right now, TenderJIT doesn't automatically compile methods. You must manually tell TenderJIT to compile a method.

Let's look at an example:

require "tenderjit"

def fib n
  if n < 3
    1
  else
    fib(n - 1) + fib(n - 2)
  end
end

jit = TenderJIT.new
jit.compile(method(:fib)) # Compile the `fib` method

# Run the `fib` method with the JIT enabled
jit.enable!
fib 8
jit.disable!

Eventually TenderJIT will compile code automatically, but today it doesn't.

TenderJIT only supports Ruby 3.0.2 and up!

How does TenderJIT work?

TenderJIT reads each YARV instruction in the target method, then converts that instruction to machine code.

Let's look at an example of this in action. Say we have a function like this:

def add a, b
  a + b
end

If we disassemble the file with ruby --dump=insns (the same listing is available from inside Ruby through RubyVM::InstructionSequence), we can see the instructions that YARV uses to implement the add method:

$ cat x.rb
def add a, b
  a + b
end

$ ruby --dump=insns x.rb
== disasm: #<ISeq:<main>@x.rb:1 (1,0)-(3,3)> (catch: FALSE)
0000 definemethod                           :add, add                 (   1)[Li]
0003 putobject                              :add
0005 leave

== disasm: #<ISeq:add@x.rb:1 (1,0)-(3,3)> (catch: FALSE)
local table (size: 2, argc: 2 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] a@0<Arg>   [ 1] b@1<Arg>
0000 getlocal_WC_0                          a@0                       (   2)[LiCa]
0002 getlocal_WC_0                          b@1
0004 opt_plus                               <calldata!mid:+, argc:1, ARGS_SIMPLE>[CcCr]
0006 leave                                                            (   3)[Re]
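The same kind of listing can be produced from inside a running Ruby process. This is a minimal sketch, assuming a CRuby 3.x interpreter; the exact instruction names can differ between Ruby versions.

```ruby
def add a, b
  a + b
end

# Look up the instruction sequence YARV compiled for the method
iseq = RubyVM::InstructionSequence.of(method(:add))

# `disasm` returns a listing in the same format as `ruby --dump=insns`
puts iseq.disasm
```

Only the add method itself is listed here, since the iseq is looked up from the method object rather than from the whole file.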

The add method uses 4 instructions, 3 of which are unique:

  • getlocal_WC_0
  • opt_plus
  • leave

The YARV virtual machine works by pushing and popping values on a stack. The two getlocal_WC_0 instructions each take one parameter, 0 and 1 respectively. This means "get the local at index 0 and push it on the stack", and "get the local at index 1 and push it on the stack".

After these two instructions have executed, the stack should have two values on it. The opt_plus instruction pops two values from the stack, adds them, then pushes the sum on the stack. This leaves one value on the stack.

Finally the leave instruction pops one value from the stack and returns that value to the calling method.
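The push and pop behaviour described above can be modelled in a few lines of Ruby. This is a toy stack machine, not YARV: the `run` method and the `[:name, operand]` instruction format are made up for illustration, and locals are looked up by plain array index.

```ruby
# Toy model of a stack-based VM; it only knows the three instructions
# used by the `add` example.
def run(instructions, locals)
  stack = []
  instructions.each do |name, operand|
    case name
    when :getlocal  # push the local at the given index
      stack.push(locals.fetch(operand))
    when :opt_plus  # pop two values, push their sum
      right = stack.pop
      left = stack.pop
      stack.push(left + right)
    when :leave     # pop the return value and hand it to the caller
      return stack.pop
    else
      raise NotImplementedError, "unknown instruction #{name}"
    end
  end
end

ADD_INSNS = [[:getlocal, 0], [:getlocal, 1], [:opt_plus], [:leave]].freeze
puts run(ADD_INSNS, [3, 4]) # prints 7
```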

TenderJIT works by examining each of these instructions and converting them to machine code at runtime. If a machine code version of the method is available at runtime, then YARV will call the machine code version rather than the YARV bytecode version.

Hacking on TenderJIT

You should only need Ruby 3.0.2 or up to get started hacking on TenderJIT. However, I highly recommend installing a debugger like lldb or gdb as well.

The main compiler object is the TenderJIT::ISEQCompiler class which can be found in lib/tenderjit/iseq_compiler.rb.

Each instruction sequence object (method, block, etc) gets its own instance of an ISEQCompiler object.

Each YARV instruction has a corresponding handle_* method in the ISEQCompiler class. The example above used getlocal_WC_0, opt_plus, and leave. Each of these instructions has a corresponding method in the ISEQCompiler class: handle_getlocal_WC_0, handle_opt_plus, and handle_leave.

When a request is made to compile an instruction sequence (iseq), the compiler checks to see if there is already an ISEQCompiler object associated with the iseq. If not, it allocates one, then calls compile on the object.

The compiler will compile as many instructions in a row as it can, then will quit compilation. Depending on the instructions that were compiled, it may resume later on.

Not all instructions have corresponding handle_* methods. This just means they are not implemented yet! If you find an instruction you'd like to implement, please do it!

When no corresponding handler function is found, the compiler will generate an "exit" and the machine code will pass control back to YARV. YARV will resume where the compiler left off, so even partially compiled instruction sequences will work.
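The handler lookup and the "exit" fallback can be sketched roughly like this. ToyCompiler is a hypothetical miniature, not the real ISEQCompiler: it records what it would do instead of emitting machine code, and its handlers take no operands.

```ruby
# Hypothetical miniature of the handle_* lookup.
class ToyCompiler
  attr_reader :log

  def initialize
    @log = []
  end

  # Compile instructions in order until one has no handler.
  # Returns the index at which compilation stopped.
  def compile(insns)
    insns.each_with_index do |name, index|
      handler = "handle_#{name}"
      unless respond_to?(handler, true)
        # No handler: generate an "exit" so the interpreter takes over here
        @log << "exit to interpreter at #{index} (#{name})"
        return index
      end
      send(handler)
    end
    insns.length
  end

  private

  def handle_getlocal_WC_0
    @log << "compiled getlocal_WC_0"
  end

  def handle_opt_plus
    @log << "compiled opt_plus"
  end

  def handle_leave
    @log << "compiled leave"
  end
end

compiler = ToyCompiler.new
stopped_at = compiler.compile(%w[getlocal_WC_0 getlocal_WC_0 expandarray leave])
puts compiler.log
puts "stopped at instruction #{stopped_at}"
```

In this sketch expandarray stands in for any instruction without a handler; everything before it is still compiled.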

YARV has a few data structures that you need to be aware of when hacking on TenderJIT. First is the "control frame pointer" or CFP. The CFP represents a stack frame. Each time we call a method, a new stack frame is created.

The CFP points to the iseq it's executing. It also points to the Program Counter, or PC. The PC indicates which instruction is going to execute next. The other crucial thing the CFP points to is the Stack Pointer, or SP. The SP indicates where the top of the stack is, and it points at the "next empty slot" in the stack.

When a function is called, a new CFP is created. The CFP is initialized with the first instruction in the iseq set as the PC, and an empty slot in the SP. When getlocal_WC_0 executes, first it advances the PC to point at the next instruction. Then getlocal_WC_0 fetches the local value, writes it to the empty SP slot, then pushes the SP slot up by one.

TenderJIT gains speed by eliminating PC and SP advancement. This means that as TenderJIT machine code executes, the values on the CFP may not reflect reality! In order to hand control back to YARV, TenderJIT must write accurate values back to the CFP before returning control.
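Here is a toy model of the difference, assuming a frame with just a PC and an SP. The Frame struct and both methods are invented for illustration; real frames live in C structs and the real JIT emits machine code.

```ruby
# Toy frame: `pc` is an instruction offset, `sp` an index into the stack.
Frame = Struct.new(:pc, :sp)

# Interpreter style: every instruction advances the PC and the SP.
def interpret(frame, stack, locals)
  frame.pc += 2                 # getlocal_WC_0 a
  stack[frame.sp] = locals[0]
  frame.sp += 1
  frame.pc += 2                 # getlocal_WC_0 b
  stack[frame.sp] = locals[1]
  frame.sp += 1
  frame
end

# JIT style: stack offsets are known at compile time, so the frame is
# left untouched until control goes back to the interpreter.
def jitted(frame, stack, locals)
  base = frame.sp
  stack[base]     = locals[0]
  stack[base + 1] = locals[1]
  yield frame if block_given?   # the frame is stale at this point
  frame.pc += 4                 # write accurate values back before exiting
  frame.sp = base + 2
  frame
end

stale = nil
p interpret(Frame.new(0, 0), [], [3, 4]).to_a                     # [4, 2]
p jitted(Frame.new(0, 0), [], [3, 4]) { |f| stale = f.to_a }.to_a # [4, 2]
p stale                                                           # [0, 0]
```

Both versions end with the same frame, but the second one only touches it once, at the exit.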

Lazy compilation

TenderJIT is a lazy compiler. It (very poorly) implements a version of Lazy Basic Block Versioning. TenderJIT will only compile one basic block at a time. This means that TenderJIT will stop compiling any time it finds an instruction that might jump somewhere else.

For example:

def add a, b
  puts "hi"

  if a > 0
    b - a
  else
    a + b
  end
end

TenderJIT will compile the method calls as well as the comparison, but when it sees there is a conditional, it will stop compiling. At that point, it inserts a "stub" which is just a way to resume compilation at that point. These "stubs" call back in to the compiler and ask it to resume compilation from that point.
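As a rough sketch of where compilation would stop, the instruction list for a method like this can be cut after every instruction that may jump. The BRANCHING list and the simplified instruction names are assumptions made for the example; a full basic-block split would also start a new block at every jump target.

```ruby
# Hypothetical set of instructions that can transfer control
BRANCHING = %i[branchif branchunless jump opt_getinlinecache leave].freeze

# Close a block after each instruction that may jump somewhere else.
def basic_blocks(insns)
  insns.slice_after { |insn| BRANCHING.include?(insn) }.to_a
end

# Simplified instruction stream for the first half of the method above
INSNS = %i[
  putself putstring opt_send_without_block pop
  getlocal_WC_0 putobject opt_gt branchunless
  getlocal_WC_0 getlocal_WC_0 opt_minus leave
].freeze

basic_blocks(INSNS).each_with_index do |block, i|
  puts "block #{i}: #{block.join(' ')}"
end
```

The first block ends at branchunless, which is the point where a stub would be inserted.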

Runtime compilation methods start with compile_* rather than handle_*.

As a practical example, let's look at how the compiler handles the following code:

def get_a_const
  Foo
end

The instructions for this method are as follows:

== disasm: #<ISeq:get_a_const@x.rb:1 (1,0)-(3,3)> (catch: FALSE)
0000 opt_getinlinecache                     9, <is:0>                 (   2)[LiCa]
0003 putobject                              true
0005 getconstant                            :Foo
0007 opt_setinlinecache                     <is:0>
0009 leave                                                            (   3)[Re]

If we check the implementation of opt_getinlinecache in YARV, we see that it will check a cache. If the cache is valid it will jump to the destination instruction, in this case the instruction at position 9 (you can see that 9 is the first operand of opt_getinlinecache in the listing). Since this instruction can jump, we consider it the end of a basic block. At compile time, TenderJIT doesn't know the machine address where it would have to jump, so it inserts a "stub" which calls the method compile_opt_getinlinecache, but at runtime rather than compile time.

The runtime function will examine the cache. If the cache is valid, it patches the calling jump instruction in the generated machine code to just jump to the destination.

The next time the machine code is run, it no longer calls in to the lazy compile method, but jumps directly where it needs to go.
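The patching idea can be imitated in plain Ruby with a jump whose target is replaced after the first call. JumpSite is a made-up class for illustration: the real thing rewrites the operand of a machine-code jump, and the cache check is left out here.

```ruby
# Toy model of a patchable jump. `@target` plays the role of the
# jump operand in generated machine code.
class JumpSite
  attr_reader :lazy_calls

  def initialize(&lazy_compile)
    @lazy_calls = 0
    # Initially the jump goes to a stub that calls back into the compiler
    @target = lambda do
      @lazy_calls += 1
      destination = lazy_compile.call # "compile" the destination code
      @target = destination           # patch the jump to go there directly
      destination.call
    end
  end

  def call
    @target.call
  end
end

site = JumpSite.new { -> { :value_of_foo } }
p site.call       # first run goes through the stub
p site.call       # later runs jump straight to the destination
p site.lazy_calls # the lazy compile step ran only once
```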

Why TenderJIT?

I built this JIT for several reasons. The first and main reason is that I'm helping to build a more production-ready, actually-fast-and-good JIT at work called YJIT. I was not at all confident in my ability to build a JIT, so I wanted to try my hand at building one, but in pure Ruby.

The second reason is that I wanted to see if it was possible to write a JIT for Ruby in pure Ruby (apparently it is).

My ultimate goal is to be able to ship a gem, and people can just require the gem and their code is suddenly faster.

I picked the name "TenderJIT" because I thought it was silly. If this project can become a serious JIT contender then I'll probably consider renaming it to something that sounds more serious like "SeriousJIT" or "AdequateCodeGenerator".

How can I help?

If you'd like a low friction way to mess around with a JIT compiler, please help contribute!

You can contribute by adding missing instructions or adding tests, or whatever you want to do!

Lots of TenderJIT internals just look like x86-64 assembly, and I'd like to get away from that. So I've been working on a DSL to hide the assembly language away from developers. I need help developing that and converting the existing "assembly-like" code to use the runtime class.

You can find the DSL in lib/tenderjit/runtime.rb.

Thanks for reading! If you want to help out, please ping me on Twitter or open an issue!

tenderjit's People

Contributors

64kramsystem, alissonbrunosa, dblock, edipofederle, eileencodes, iancanderson, mohsen-alizadeh, nvasilevski, tenderlove

tenderjit's Issues

Test suite occasionally fails, with a segfault at 0x0

I've noticed, while running the test suite many times in a row, that TJ sometimes segfaults.

The failure is very nondeterministic. Sometimes it takes a few runs, sometimes it doesn't happen in a hundred runs. It does not depend on the test seed.

Since my last two PRs were quite sensitive, I've checked if the issue was present before they were merged, and I can confirm that the issue was already present.

Below are some sample failures; I think the only pointer (haha) they give is that, since the segfault address is 0, this is either a null pointer dereference or, I think more likely, a misaligned stack.

A very long run on Mac may confirm whether this is Linux-only or cross-platform. Unfortunately, since even 100 runs don't guarantee a failure, it may be hard to reproduce.
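Repeated runs like the ones described can be automated with a small loop. This is only a sketch: first_failing_run is a hypothetical helper, and in the repository the command would be something like %w[bundle exec rake test]. A trivial command is used below so the snippet runs anywhere.

```ruby
require "rbconfig"

# Run `command` up to `max_runs` times. Returns the 1-based number of
# the first run that did not exit successfully, or nil if all passed.
def first_failing_run(command, max_runs:)
  (1..max_runs).each do |run|
    ok = system(*command) # true only when the exit status is zero
    return run unless ok
  end
  nil
end

always_ok = [RbConfig.ruby, "-e", "exit 0"]
p first_failing_run(always_ok, max_runs: 3)
```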

Sample 1:

......S...........SS......SSS.S....S...S.SS....S.S..S../home/saverio/code/fisk-dev/lib/fisk.rb:847: [BUG] Segmentation fault at 0x0000000000000000
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0040 p:---- s:0223 e:000222 CFUNC  :zip
c:0039 p:0023 s:0218 e:000217 BLOCK  /home/saverio/code/fisk-dev/lib/fisk.rb:847 [FINISH]
c:0038 p:---- s:0214 e:000213 IFUNC 
c:0037 p:---- s:0211 e:000210 CFUNC  :each
c:0036 p:---- s:0208 e:000207 CFUNC  :find_all
c:0035 p:0007 s:0204 e:000203 METHOD /home/saverio/code/fisk-dev/lib/fisk.rb:845
c:0034 p:0022 s:0192 e:000191 METHOD /home/saverio/code/fisk-dev/lib/fisk/instructions.rb:1400
c:0033 p:0014 s:0187 e:000186 METHOD /home/saverio/code/tenderjit-dev/lib/tenderjit/runtime.rb:198
c:0032 p:0049 s:0181 e:000180 BLOCK  /home/saverio/code/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:2312
c:0031 p:0038 s:0177 e:000176 METHOD /home/saverio/code/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:3001
c:0030 p:0011 s:0171 e:000170 METHOD /home/saverio/code/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:2307
c:0029 p:0360 s:0166 e:000165 METHOD /home/saverio/code/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:130
c:0028 p:0137 s:0154 e:000153 METHOD /home/saverio/code/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:84
c:0027 p:0148 s:0149 e:000148 METHOD /home/saverio/code/tenderjit-dev/lib/tenderjit.rb:558
c:0026 p:0009 s:0140 e:000139 METHOD /home/saverio/code/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:557 [FINISH]
Segmentation fault

Sample 2:

.S..S.............S.SS.S.S....SSS...SS...................SS.........SS......./home/saverio/code/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:45: [BUG] Segmentation fault at 0x0000000000000000
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0029 p:0294 s:0163 e:000162 METHOD /home/saverio/code/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:45 [FINISH]
c:0028 p:---- s:0155 e:000154 CFUNC  :new
c:0027 p:0125 s:0149 e:000148 METHOD /home/saverio/code/tenderjit-dev/lib/tenderjit.rb:552
c:0026 p:0009 s:0140 e:000139 METHOD /home/saverio/code/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:557 [FINISH]
Segmentation fault

Fisk error when running the test suite, on a Clang-compiled Ruby

I've compiled Ruby with Clang in order to investigate the two todos in ruby_internals.rb. However, I get unexpected failures in the test suite.

Specifically, I get 16 failures like the following:

  1) Error:
TenderJIT::MethodRecursion#test_fib:
NotImplementedError: Couldn't find instruction CMP imm32, m64
Valid forms:
    CMP al, imm8
    CMP r8, imm8
    CMP r8, r8
    CMP r8, m8
    CMP ax, imm16
    CMP r16, imm8
    CMP r16, imm16
    CMP r16, r16
    CMP r16, m16
    CMP eax, imm32
    CMP r32, imm8
    CMP r32, imm32
    CMP r32, r32
    CMP r32, m32
    CMP rax, imm32
    CMP r64, imm8
    CMP r64, imm32
    CMP r64, r64
    CMP r64, m64
    CMP m8, imm8
    CMP m8, r8
    CMP m16, imm8
    CMP m16, imm16
    CMP m16, r16
    CMP m32, imm8
    CMP m32, imm32
    CMP m32, r32
    CMP m64, imm8
    CMP m64, imm32
    CMP m64, r64

    /path/to/fisk-e3d21ac5df10/lib/fisk.rb:829:in `gen_with_insn'
    /path/to/fisk-e3d21ac5df10/lib/fisk/instructions.rb:1643:in `cmp'
    /path/to/tenderjit-dev/lib/tenderjit/runtime.rb:270:in `block (2 levels) in if_eq'
    /path/to/tenderjit-dev/lib/tenderjit/runtime.rb:423:in `maybe_reg'
    /path/to/tenderjit-dev/lib/tenderjit/runtime.rb:269:in `block in if_eq'
    /path/to/tenderjit-dev/lib/tenderjit/runtime.rb:423:in `maybe_reg'
    /path/to/tenderjit-dev/lib/tenderjit/runtime.rb:268:in `if_eq'
    /path/to/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:1085:in `block (2 levels) in compile_opt_send_without_block'
    /path/to/tenderjit-dev/lib/tenderjit/runtime.rb:284:in `else'
    /path/to/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:1078:in `block in compile_opt_send_without_block'
    /path/to/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:1831:in `with_runtime'
    /path/to/tenderjit-dev/lib/tenderjit/iseq_compiler.rb:1041:in `compile_opt_send_without_block'
    /path/to/tenderjit-dev/test/tenderjit_test.rb:33:in `fib'
    /path/to/tenderjit-dev/test/tenderjit_test.rb:47:in `test_fib'

Ruby configuration:

$ ruby -rrbconfig -e'puts File.join RbConfig::CONFIG["prefix"], "lib", RbConfig::CONFIG["LIBRUBY"]'
/path/to/lib/libruby.so.3.0.2

$ ruby -rrbconfig -e'p RbConfig::CONFIG["CFLAGS"]'
"-O3 -ggdb3 -Wall -Wextra -Wdeprecated-declarations -Wdivision-by-zero -Wimplicit-function-declaration -Wimplicit-int -Wmisleading-indentation -Wpointer-arith -Wshorten-64-to-32 -Wwrite-strings -Wmissing-noreturn -Wno-constant-logical-operand -Wno-long-long -Wno-missing-field-initializers -Wno-overlength-strings -Wno-parentheses-equality -Wno-self-assign -Wno-tautological-compare -Wno-unused-parameter -Wno-unused-value -Wunused-variable -Wextra-tokens  -fPIC"

$ ruby -rrbconfig -e'p RbConfig::CONFIG["CC"]'
"clang -fdeclspec"

bundle install fails since worf repository's default branch is main

$ bundle install
Fetching https://github.com/ruby/fiddle.git
Fetching https://github.com/tenderlove/worf.git
fatal: Needed a single revision
Revision master does not exist in the repository https://github.com/tenderlove/worf.git. Maybe you misspelled it?

The Gemfile needs to specify the main branch for worf.

The JIT may be generating no-op `mov`s

(I may be wrong on this, as I'm still not clear on how register allocation works)

While trying some manual register allocation (specifically, directly popping a value into RDI (rt.pop_reg @fisk.rdi), so that it could be passed as first parameter of a cfunc (rt.call_cfunc_without_alignment <addr>, [@fisk.rdi, ...])), I was curious if TJ/FISK would avoid generating instructions for copying the operand to RDI.

Specifically, I was looking around here:

    def call_cfunc_without_alignment func_loc, params
      raise NotImplementedError, "too many parameters" if params.length > 6
      raise "No function location" unless func_loc > 0

      params.each_with_index do |param, i|
        case param
        when Integer
          @fisk.mov(Fisk::Registers::CALLER_SAVED[i], @fisk.uimm(param))
        when Fisk::Operand
          @fisk.mov(Fisk::Registers::CALLER_SAVED[i], param)
        when TemporaryVariable
          @fisk.mov(Fisk::Registers::CALLER_SAVED[i], param.to_register)
        else
          raise NotImplementedError
        end
      end
      @fisk.mov(@fisk.rax, @fisk.uimm(func_loc))
        .call(@fisk.rax)
      @fisk.rax
    end

Now, I couldn't actually find any mov rdi, rdi. However, I found that other no-op movs are generated (at least, I believe so), specifically a bunch of mov r9, r9.

I detected this by hacking fisk.rb -> write_to() with this (horrendous) code:

    while insn = instructions.shift
      z_insn_name = insn.instance_variable_get(:@insn).name rescue nil
      z_op0_reg_name = insn.instance_variable_get(:@operands)[0].register.name rescue nil
      z_op1_reg_name = insn.instance_variable_get(:@operands)[1].register.register.name rescue :phony # avoid nil == nil

      if z_insn_name == 'MOV' && z_op0_reg_name == z_op1_reg_name
        puts "#{z_insn_name} #{z_op0_reg_name}, #{z_op1_reg_name}"
      end

By running rake test you'll see many occurrences. Am I correct, or am I missing something?

I dug a little bit in Fisk, and the mov ultimately ends here:

in /path/to/fisk/instructions/mov.rb
        Class.new(Fisk::Encoding) {
          def encode buffer, operands
            add_rex(buffer, operands,
              true,
              1,
              operands[0].rex_value,
              operands[1].rex_value,
              operands[1].rex_value) +
            add_opcode(buffer, 0x8B, 0) +
            add_modrm_reg_mem(buffer,
              0,
              operands[0].op_value,
              operands[1].op_value, operands) +
            0
          end

which, it seems to me, encodes the instruction as mov r9, r9.

ps. On Error Resume Next FTW

`tostring` instruction fails on complex examples

The current implementation of tostring seems correct:

def handle_tostring
  rb_obj_as_string_result = Fiddle::Handle::DEFAULT["rb_obj_as_string_result"]
  str = @temp_stack.pop
  val = @temp_stack.pop
  with_runtime do |rt|
    rt.call_cfunc rb_obj_as_string_result, [str, val]
    rt.push rt.return_value, name: RUBY_T_STRING
  end
end

However the following test fails

    def complex_tostring
      "#{1234}#{5678}"
    end

    def test_complex_tostring
      meth = method(:complex_tostring)

      assert_has_insn meth, insn: :tostring

      jit.compile(meth)
      jit.enable!
      v = meth.call
      jit.disable!

      assert_equal 1, jit.compiled_methods
      assert_equal 0, jit.exits
      assert_equal "12345678", v
    end

with

--- expected
+++ actual
@@ -1 +1,3 @@
-"12345678"
+# encoding: US-ASCII
+#    valid: true
+"123456785678"

Instructions:

[ruby-3.0.2p107] ruby --dump=insns -e "\"#{1234}#{5678}\""
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,16)> (catch: FALSE)
0000 putobject                              ""                        (   1)[Li]
0002 putobject                              1234
0004 dup
0005 opt_send_without_block                 <calldata!mid:to_s, argc:0, FCALL|ARGS_SIMPLE>
0007 tostring
0008 putobject                              5678
0010 dup
0011 opt_send_without_block                 <calldata!mid:to_s, argc:0, FCALL|ARGS_SIMPLE>
0013 tostring
0014 concatstrings                          3
0016 leave

I'll try to poke around by myself, but the issue seems a bit complicated.
My first guess is that something is messing up the stack, so the second tostring instruction pops the wrong values. Presumably opt_send_without_block is the cause, as it seems to be the most complex piece haha

Error `fetch': key not found: nil (KeyError)

Hi,

I'm exploring this project and just trying to run the tests with rake test, and I get this:

/Users/edipo/workspace/tenderjit/misc/ruby_internals.rb:54:in `fetch': key not found: nil (KeyError)
	from /Users/edipo/workspace/tenderjit/misc/ruby_internals.rb:54:in `read_instruction_op_types'
	from /Users/edipo/workspace/tenderjit/misc/ruby_internals.rb:121:in `process'
	from /Users/edipo/workspace/tenderjit/misc/ruby_internals.rb:369:in `get_internals'
	from misc/build-ruby-internals.rb:431:in `<main>'
rake aborted!
Command failed with status (1): [/Users/edipo/.rvm/rubies/ruby-3.0.0/bin/ru...]
/Users/edipo/workspace/tenderjit/Rakefile:43:in `block in <top (required)>'
/Users/edipo/.rvm/gems/ruby-3.0.0/gems/rake-13.0.6/exe/rake:27:in `<top (required)>'
/Users/edipo/.rvm/gems/ruby-3.0.0/bin/ruby_executable_hooks:22:in `eval'
/Users/edipo/.rvm/gems/ruby-3.0.0/bin/ruby_executable_hooks:22:in `<main>'
Tasks: TOP => test => compile => lib/tenderjit/ruby/edeb1/constants.rb
(See full trace by running task with --trace)

Looking at the code, it seems the regex doesn't match anything in the symbol_addresses keys. We are trying to match keys like "x.", is that right? While debugging I only see keys like rb_locale_charmap_index, rb_ary_to_ary_m, etc.
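The failure mode described here can be reproduced in isolation. The symbol table below is made up and only contains the kinds of keys seen while debugging; the regex is the one that appears in read_instruction_op_types.

```ruby
# Hypothetical symbol table without any "x.<digits>" key
SYMBOLS = {
  "rb_locale_charmap_index" => 0x1000,
  "rb_ary_to_ary_m"         => 0x2000,
}.freeze

# Returns the first key that looks like "x.<digits>", or nil
def first_x_key(symbol_addresses)
  symbol_addresses.keys.grep(/^x\.\d+/).first
end

key = first_x_key(SYMBOLS)
p key # nil, nothing matched

begin
  SYMBOLS.fetch(key)
rescue KeyError => e
  puts e.message # key not found: nil
end
```

So the KeyError itself only says that no symbol matched the pattern; it doesn't say why the symbol is missing from this particular Ruby build.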

Maybe it's a very small thing I'm missing here, so I appreciate you taking the time to help.

System details:

  • ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-darwin20]

Thx

Getting "cannot load such file -- tenderjit/ruby/e6c9f/structs"

Successfully installed the gem, but getting the following when I issue "require 'tenderjit'"

irb 3.0.2 :001 > require 'tenderjit'
<internal:/home/jenko/.rvm/rubies/ruby-3.0.2/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require': cannot load such file -- tenderjit/ruby/e6c9f/structs (LoadError)
	from <internal:/home/jenko/.rvm/rubies/ruby-3.0.2/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from /home/jenko/.rvm/gems/ruby-3.0.2/gems/tenderjit-1.0.0/lib/tenderjit/ruby.rb:8:in `<top (required)>'
	from <internal:/home/jenko/.rvm/rubies/ruby-3.0.2/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from <internal:/home/jenko/.rvm/rubies/ruby-3.0.2/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from /home/jenko/.rvm/gems/ruby-3.0.2/gems/tenderjit-1.0.0/lib/tenderjit.rb:3:in `<top (required)>'
	from <internal:/home/jenko/.rvm/rubies/ruby-3.0.2/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:160:in `require'
	from <internal:/home/jenko/.rvm/rubies/ruby-3.0.2/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:160:in `rescue in require'
	from <internal:/home/jenko/.rvm/rubies/ruby-3.0.2/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:149:in `require'
	from (irb):1:in `<main>'
	from /home/jenko/.rvm/rubies/ruby-3.0.2/lib/ruby/gems/3.0.0/gems/irb-1.3.5/exe/irb:11:in `<top (required)>'
	from /home/jenko/.rvm/rubies/ruby-3.0.2/bin/irb:23:in `load'
	from /home/jenko/.rvm/rubies/ruby-3.0.2/bin/irb:23:in `<main>'
<internal:/home/jenko/.rvm/rubies/ruby-3.0.2/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require': cannot load such file -- tenderjit (LoadError)
	from <internal:/home/jenko/.rvm/rubies/ruby-3.0.2/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from (irb):1:in `<main>'
	from /home/jenko/.rvm/rubies/ruby-3.0.2/lib/ruby/gems/3.0.0/gems/irb-1.3.5/exe/irb:11:in `<top (required)>'
	from /home/jenko/.rvm/rubies/ruby-3.0.2/bin/irb:23:in `load'
	from /home/jenko/.rvm/rubies/ruby-3.0.2/bin/irb:23:in `<main>'

[Good first issue] Cover `setlocal_WC_1` instruction with tests

It's a good issue to start contributing!
setlocal_WC_1 has been implemented but we are missing a test:

def handle_setlocal_WC_1 idx

def test_setlocal_WC_1
  skip "setlocal_WC_1 has been implemented. Please add a test."
end

Here is an example of code that is supposed to generate the setlocal_WC_1 instruction:
https://github.com/ruby/ruby/blob/9eae8cdefba61e9e51feb30a4b98525593169666/bootstraptest/test_insns.rb#L25

Temporary Stack allows negative values

In #55 we found a bug where an offset was passed to the temporary stack that ended up calculating a negative offset in to the underlying array.

  1. When we do calculations like this, should we consider it a bug if the result of the calculation is < 0?
  2. If so, we should raise an exception.
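A minimal sketch of what such a check could look like. CheckedStack is a hypothetical stand-in, not the actual temporary stack class. Without the guard, a negative index into a Ruby array silently wraps around to the other end, which is how a bug like this can go unnoticed.

```ruby
# Hypothetical stand-in for the temporary stack
class CheckedStack
  def initialize
    @items = []
  end

  def push(item)
    @items.push(item)
    self
  end

  # Fetch the item `depth` slots below the top (0 is the top).
  def peek(depth)
    index = @items.length - 1 - depth
    if depth < 0 || index < 0
      raise IndexError, "depth #{depth} is outside a stack of #{@items.length} items"
    end
    @items[index]
  end
end

stack = CheckedStack.new.push(:a).push(:b)
p stack.peek(0) # :b
p stack.peek(1) # :a

begin
  stack.peek(2) # would compute index -1 and wrap around to :b
rescue IndexError => e
  puts e.message
end
```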

Runtime#call_cfunc etc should be smarter

Right now we're using R9 as a scratch register, but R9 is a caller saved register meaning that the callee is free to clobber the value in R9. If we want to save the value in R9 then we have to push the register before calling the C function.

However, our register allocator has us check out and check back in temp registers meaning that we should be able to know that the "lifespan" of a temp register spans a C function call.

We should make the runtime smart enough to detect that R9 (or any other caller saved register) "lives" past a C function call and have it automatically push / pop the register.
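One way to picture the detection is as a pass over an event log of register check-outs, releases, and C calls. The event format and the registers_to_save helper are assumptions for illustration; the register list is the usual set of caller-saved registers in the System V x86-64 calling convention.

```ruby
# Caller-saved registers in the System V x86-64 ABI
CALLER_SAVED = %i[rax rcx rdx rsi rdi r8 r9 r10 r11].freeze

# Given [kind, register] events, return the caller-saved registers
# that are still checked out while a C function call happens.
def registers_to_save(events)
  live = []
  needs_save = []
  events.each do |kind, reg|
    case kind
    when :checkout then live << reg
    when :release  then live.delete(reg)
    when :call     then needs_save |= (live & CALLER_SAVED)
    end
  end
  needs_save
end

EVENTS = [
  [:checkout, :r9],
  [:checkout, :rbx],
  [:call, nil],      # r9 and rbx are both live here
  [:release, :r9],
  [:release, :rbx],
  [:checkout, :r10],
  [:release, :r10],
  [:call, nil],      # nothing is live here
].freeze

p registers_to_save(EVENTS) # [:r9]
```

Only r9 is reported: rbx is callee-saved, and r10 is released before the second call.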

The Rakefile fingerprinting is not unique across different Ruby compiles

In order to know when to (re)generate the Ruby internals data (lib/tenderjit/ruby), the Rakefile uses a (shortened) hash of RUBY_DESCRIPTION.

When switching between different compilers though (or different compiling options, e.g. adding debug symbols), the internals data changes, while the RUBY_DESCRIPTION doesn't.

While switching builds is not very common, there are some use cases (different optimization levels, different compilers...) that it would be useful to cover.

There are different ways of identifying a Ruby build. I think fingerprinting the binary itself (IO.read(Gem.ruby)) should do the job precisely and simply. If that's a good enough solution, I don't think there's any need to go to great lengths with File.expand_path, as that's a far-fetched case (and false positives would just trigger a rebuild).
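The two approaches could be compared with something like the following. The choice of SHA256 and the five-character prefix are assumptions (the prefix length just mirrors directory names such as ca092), and the helper names are hypothetical.

```ruby
require "digest"
require "rubygems"

# Fingerprint derived from the description string only
def description_fingerprint
  Digest::SHA256.hexdigest(RUBY_DESCRIPTION)[0, 5]
end

# Fingerprint derived from the contents of the interpreter binary
def binary_fingerprint(path = Gem.ruby)
  Digest::SHA256.file(path).hexdigest[0, 5]
end

puts "description: #{description_fingerprint}"
puts "binary:      #{binary_fingerprint}"
```

Two builds of the same revision made with different compilers share the first value but would normally differ in the second.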

I can take care of this, as there are only a couple of places to change.

Easy/ish missing instructions

A (sub)list of the possibly easy/ish missing instructions:

  • intern: 64kramsystem (#87)
  • swap: nvasilevski (#64)
  • bitblt: easter egg for the masses!
  • answer: The Answer to Life, the Universe, and Everything
  • branchnil: difficult, but should be easy to copypasta from branchif
  • setn: unsure if easy
  • opt_and: unsure if easy
  • opt_or: unsure if easy
  • opt_not: unsure if easy
  • opt_str_freeze: unsure if easy
  • opt_nil_p: unsure if easy
  • opt_str_uminus: unsure if easy

Notes about others:

  • newarraykwsplat: not easy but seemingly feasible; requires implementing (or inlining) many small functions
  • expandarray: not easy
  • topn: implemented live

If anybody wants to take any, leave a note/comment so that we don't accidentally work on the same one!

Undefined method 'header' for nil:NilClass

I tried to install and run TenderJIT, but it won't start and fails with the following error on rake compile:

/usr/bin/ruby -I lib misc/build-ruby-internals.rb 65822
/root/GitHub/tenderjit/misc/ruby_internals.rb:310:in `read_dwarf': undefined method `header' for nil:NilClass (NoMethodError)
	from /root/GitHub/tenderjit/misc/ruby_internals.rb:321:in `block in each_compile_unit'
	from /root/GitHub/tenderjit/misc/ruby_internals.rb:318:in `open'
	from /root/GitHub/tenderjit/misc/ruby_internals.rb:318:in `each_compile_unit'
	from /root/GitHub/tenderjit/misc/ruby_internals.rb:114:in `process'
	from /root/GitHub/tenderjit/misc/ruby_internals.rb:381:in `get_internals'
	from misc/build-ruby-internals.rb:435:in `<main>'
rake aborted!

When commenting that line out, the same error occurs for line 276.
I'd really like to try this project out, so any help is appreciated!

Handle Gemfile.lock

Right now, the Gemfile.lock is neither in the SCM nor in the .gitignore file, so it's somewhat easy to accidentally add it (as it's required, for example, when running the test suite).

Due to the library nature of the project, I think that it should be put in the .gitignore. As an alternative, devs can add it to the local repository ignore list, but I think the former option makes more sense.

I don't have experience with a large number of projects though, so feel free to close straight away if this doesn't make sense.

Question: Is there any way to cleanly generate bytecode?

Is there any way to cleanly generate bytecode (along with the required data structures, e.g. local table etc)?

Currently, the unit testing of the instructions is not unit testing in a strict sense. It's actually more functional (IMO), as each unit test performs more work than just creating and executing an instruction.

If bytecode could be cleanly/easily generated, there would be several benefits:

  • tests would be targeted and evident; right now, the actual instructions are hidden behind an abstraction
  • tested behavior would actually be guaranteed; due to the abstraction, it's technically possible that the compiler, at some point, compiles to different instructions, causing the test to actually test something different
  • debugging could be easier, due to the lower amount of instructions.

On the other hand, the instruction testing works fine as it is; this is more of a generic discussion :)

Also, what do you think of enabling discussions on this repository? It would allow users to ask generic questions (e.g. "which resources to study 😬") separately from issues in a strict sense.

That's my 2 bits, anyway 😆

[Good first issue] Cover `adjuststack` instruction with tests

It's a good issue to start contributing!
adjuststack has been implemented but we are missing a test:

def handle_adjuststack n
  n.times { @temp_stack.pop }
end

def test_adjuststack
  skip "adjuststack has been implemented. Please add a test."
end

Here is an example of code that is supposed to generate the adjuststack instruction:
https://github.com/ruby/ruby/blob/9eae8cdefba61e9e51feb30a4b98525593169666/bootstraptest/test_insns.rb#L133

String can't be coerced into Integer (TypeError)

I'm trying to get my feet wet with tenderjit using my Linux environment with GCC, but I'm getting this TypeError from misc/ruby_internals.rb. After some investigation, I'm guessing it's something related to Clang vs GCC, but honestly I'm not sure.

Configuration

Ruby 3.1.0-dev
Architecture x86_64 GNU/Linux

Error

$ bundle exec rake test
/home/<user>/.asdf/installs/ruby/3.1.0-dev/bin/ruby -I lib misc/build-ruby-internals.rb ca092
/home/<user>/tenderjit/misc/ruby_internals.rb:56:in `+': String can't be coerced into Integer (TypeError)
	from /home/<user>/tenderjit/misc/ruby_internals.rb:56:in `read_instruction_op_types'
	from /home/<user>/tenderjit/misc/ruby_internals.rb:119:in `process'
	from /home/<user>/tenderjit/misc/ruby_internals.rb:367:in `get_internals'
	from misc/build-ruby-internals.rb:431:in `<main>'
rake aborted!
Command failed with status (1): [/home/<user>/.asdf/installs/ruby/3.1...]
/home/<user>/tenderjit/Rakefile:43:in `block in <top (required)>'
/home/<user>/.asdf/installs/ruby/3.1.0-dev/bin/bundle:23:in `load'
/home/<user>/.asdf/installs/ruby/3.1.0-dev/bin/bundle:23:in `<main>'
Tasks: TOP => test => compile => lib/tenderjit/ruby/ca092/constants.rb
(See full trace by running task with --trace)

Debugging

def self.read_instruction_op_types symbol_addresses
  len  = RubyVM::INSTRUCTION_NAMES.length

  map = symbol_addresses.keys.grep(/^y\.\d+/).each do |key|
    insn_map = symbol_addresses.fetch(key)
    l = Fiddle::Pointer.new(insn_map)[0, len * Fiddle::SIZEOF_SHORT].unpack("S#{len}")
    break l if l.first(4) == [0, 1, 4, 7] # probably the right one
  end

  key = symbol_addresses.keys.grep(/^x\.\d+/).first
  op_types = symbol_addresses.fetch(key)

  str_buffer_end = map.last

  # op_types       => 123456789 (Integer)
  # str_buffer_end => y.5       (String)
  while Fiddle::Pointer.new(op_types + str_buffer_end)[0] != 0
    str_buffer_end += 1
  end
  Fiddle::Pointer.new(op_types)[0, str_buffer_end].unpack("Z*" * len)
end

@tenderlove Do you have any idea what could be wrong?
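
One possible reading of the debugging notes above, not confirmed in the thread: `each` returns its receiver when the block never hits `break`. If no `y.N` table matches, `map` ends up as the array of key strings and `map.last` is a String such as "y.5". The sketch below reproduces that behaviour with made-up data; the `tables` hash and its values are hypothetical stand-ins for the symbol table.

```ruby
# Hypothetical stand-in for the symbol table: no entry starts with [0, 1, 4, 7].
tables = { "y.1" => [9, 9, 9, 9], "y.5" => [3, 3, 3, 3] }

# Same shape as the loop in read_instruction_op_types.
map = tables.keys.each do |key|
  l = tables.fetch(key)
  break l if l.first(4) == [0, 1, 4, 7]
end

# No break happened, so `map` is the key array and `map.last` is a String.
begin
  123456789 + map.last
rescue TypeError => e
  error_message = e.message
end
puts error_message

# A variant that makes the "nothing matched" case explicit: find returns nil.
guarded = tables.values.find { |l| l.first(4) == [0, 1, 4, 7] }
puts "no matching table found" if guarded.nil?
```

If this is what happens with GCC, the underlying question is why none of the candidate tables start with the expected offsets on that build.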

misc/ruby_internals.rb:225:in `initialize': No such file or directory

I'm getting this error after installing Ruby 3.0.2 and running bundle exec rake test. It seems a temporary file is not being found:

> bundle exec rake test
/Users/pere/.rbenv/versions/3.0.2/bin/ruby -I lib misc/build-ruby-internals.rb 9b8c3
/Users/pere/code/tenderjit/misc/ruby_internals.rb:225:in `initialize': No such file or directory @ rb_sysopen - /private/var/folders/f7/sm33z6_s2hn5m_2wv8034ws40000gn/T/ruby-build.20210930231444.90464.AqRNEa/ruby-3.0.2/ast.o (Errno::ENOENT)
	from /Users/pere/code/tenderjit/misc/ruby_internals.rb:225:in `open'
	from /Users/pere/code/tenderjit/misc/ruby_internals.rb:225:in `block in each_object_file'
	from /Users/pere/code/tenderjit/misc/ruby_internals.rb:224:in `each'
	from /Users/pere/code/tenderjit/misc/ruby_internals.rb:224:in `each_object_file'
	from /Users/pere/code/tenderjit/misc/ruby_internals.rb:154:in `each_compile_unit'
	from /Users/pere/code/tenderjit/misc/ruby_internals.rb:100:in `process'
	from /Users/pere/code/tenderjit/misc/ruby_internals.rb:367:in `get_internals'
	from misc/build-ruby-internals.rb:431:in `<main>'
rake aborted!
Command failed with status (1): [/Users/pere/.rbenv/versions/3.0.2/bin/ruby...]
/Users/pere/code/tenderjit/Rakefile:43:in `block in <top (required)>'
/Users/pere/.rbenv/versions/3.0.2/bin/bundle:23:in `load'
/Users/pere/.rbenv/versions/3.0.2/bin/bundle:23:in `<main>'
Tasks: TOP => test => compile => lib/tenderjit/ruby/9b8c3/constants.rb
(See full trace by running task with --trace)

Make TenderJIT installable as a Gem

Right now TenderJIT needs to read DWARF information from Ruby and generate a cache of structs. This is the compile step in the Rakefile.

I would like to move this generation to ext/tenderjit/extconf.rb so that when TenderJIT is installed as a gem, it will automatically generate the struct information it needs to run.
