Comments (19)

eholk commented on May 5, 2024

I like the simplicity of that one, although I think it'd be better to stick closer to the R6RS specification of byte vectors: http://www.r6rs.org/final/html/r6rs-lib/r6rs-lib-Z-H-3.html

I'd start by just implementing the functions needed by the compiler itself, which I think is basically just creating bytevectors, plus bytevector-u8-ref and bytevector-u8-set!. Strictly speaking, we probably don't even need bytevector-u8-ref.
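
For concreteness, here's a rough sketch of that minimal surface on the JS side, using a plain Uint8Array as a stand-in for Wasm linear memory (the function names mirror R6RS; none of this is Schism's actual runtime):

```javascript
// Minimal R6RS-style bytevector operations, sketched over a Uint8Array.
// In Schism these would operate on Wasm linear memory instead.
function makeBytevector(k, fill = 0) {
  return new Uint8Array(k).fill(fill);        // (make-bytevector k fill)
}

function bytevectorU8Ref(bv, k) {
  return bv[k];                               // (bytevector-u8-ref bv k)
}

function bytevectorU8Set(bv, k, byte) {
  bv[k] = byte & 0xff;                        // (bytevector-u8-set! bv k byte)
}
```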

from schism.

eholk commented on May 5, 2024

I've been using % generally to signify that something is a compiler intrinsic: not part of any Scheme standard, and frequently unsafe. Basically, these are things that users shouldn't be using directly, but rather things that the compiler uses to implement the user-facing features. I'd probably do %time-start in this case, although I'm fine either way.


eholk commented on May 5, 2024

Bootstrapping the compiler takes less than 10 seconds now, so I'd consider this fixed.

We'll want to keep an eye on this in case it gets out of control again, but I think we're good for now.


eholk commented on May 5, 2024

I've been sad about compile time lately too. This is something that should be fixable in Schism. I have a few ideas.

I think one of the easier options that will help most is changing how Schism writes its output. Right now this happens byte by byte, by calling write-char in the runtime over and over. This means we're paying the cost of transitioning from Wasm to JS for every single byte of output, which I expect is significant. A better option would be to add support for bytevectors and let Schism write out the whole bytevector at once, or at least write out larger chunks at a time.
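
The two strategies can be sketched on the host side like this (the sink interface and names here are hypothetical, not Schism's runtime API):

```javascript
// Per-byte output: one Wasm->JS boundary crossing per byte written.
function writePerByte(bytes, sink) {
  for (const b of bytes) sink.writeChar(b);
}

// Chunked output: hand the host a whole bytevector in a single call.
function writeChunked(bytes, sink) {
  sink.writeBytes(Uint8Array.from(bytes));
}
```

The generated output is identical either way; only the number of boundary crossings changes.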

The other thing I'd like to see is to have Schism start doing some optimizations. For example, inlining, constant folding and constant propagation will probably have a pretty strong effect.


matthewp commented on May 5, 2024

Is there a specification of byte vectors that you have in mind? Otherwise this API seems good to me: https://srfi.schemers.org/srfi-66/srfi-66-1.3.html


matthewp commented on May 5, 2024

I tested how long the writing takes. I measured from the first byte written until compilation was complete, then compared that to the total compilation time. This is what I got running a test:

write: 1.069ms
compile: 64749.019ms
  stage1 succeeded
write: 0.913ms
compile: 67400.288ms
  stage2 succeeded

I pushed the changes to a commit here: matthewp@7e91d22 if you want to verify I didn't do something wrong.

I then tested against a pet project using the stage0 snapshot:

write: 4.725ms
compile: 3063.516ms

So from this I take it that writing is not a significant part of the problem.


Side note, 3 seconds is not bad! So that's good news. Is there a reason run-schism.mjs uses the stage1 compiler? Depending on why that is, I say we either make it use stage0 or provide a CLI flag to select a specific stage. Happy to submit a PR for any of the above changes :)


eholk commented on May 5, 2024

Thanks for gathering this data! I guess my theory that the time to write everything out was the problem is wrong. Good to know :)

I had run-schism.mjs default to stage1 because during development stage1 tends to support more features than stage0 (basically, anything that's been added since the last snapshot). I'm fine with changing the default though. The compiler is more fully featured now, and I tend to update the snapshot as soon as I add a feature. Having command line options would be awesome!

Do you think you'd be interested in breaking down the timing for the rest of the compiler passes? I'd suggest exposing some version of console.time and console.timeEnd to Schism as a runtime function, and then add calls to it around each of the compiler passes. Some more information about what's going on would be very helpful!
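
One possible shape for this on the JS side is a pair of timer imports backed by performance.now; the import namespace and the %time-start/%time-end names below are assumptions, not Schism's actual API:

```javascript
// Hypothetical timer imports for the Wasm instance. In the real runtime the
// label would arrive as a pointer into linear memory rather than a JS string.
const timers = new Map();

const importObject = {
  rt: {
    '%time-start': (label) => timers.set(label, performance.now()),
    '%time-end': (label) => {
      const ms = performance.now() - timers.get(label);
      timers.delete(label);
      console.log(`${label}: ${ms.toFixed(3)}ms`);
      return ms;
    },
  },
};
// WebAssembly.instantiate(moduleBytes, importObject) would then make these
// callable from compiled Schism code.
```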


matthewp commented on May 5, 2024

Ok, I'll submit a PR later that changes the default to stage 0 and adds a --stage flag.

Exactly, adding a time-start and time-end so we can break down the compiler passes was going to be the next thing I did here. By the way, should it be %time-start? Does the % signify that it is side-effectual?


matthewp commented on May 5, 2024

Need to do some more investigation, but just as a start these numbers are for compiling the stage 2 compiler:

parse-library: 13252.056ms
convert-closures: 24765.116ms
functions->names: 8.949ms
functions->types: 26.728ms
functions->type-ids: 34.074ms
compile-functions: 14048.685ms
compile-library->module-package: 52260.581ms

matthewp@3bd9037


matthewp commented on May 5, 2024

Here are some slightly better numbers:

parse-library: 14168.960ms
convert-closures: 26090.513ms
functions->names: 9.552ms
functions->types: 25.474ms
functions->type-ids: 33.270ms
compile-functions: 15084.660ms
build-exports: 9.574ms
gather-imports: 10.181ms
compile-library->module-package: 55539.606ms

matthewp@b3b79bc

Probably not a surprise that parsing, creating closures, and compiling are the most expensive parts.

@eholk Anything particularly interesting in this data? Am I timing the right things? Are there any of these you'd like me to drill down into and get numbers on?


eholk commented on May 5, 2024

Awesome! Thanks for getting these numbers. It's nice to see that some of the simpler passes are in fact quite cheap to run.

I was thinking some more about the cause of the slowness, and I think a lot of it has to do with how symbols are compiled. Basically, if Schism sees 'foo, this gets translated in place to (string->symbol "foo"), which gets further translated to (string->symbol (list->string '(#\f #\o #\o))), which then becomes (string->symbol (list->string (cons #\f (cons #\o (cons #\o '()))))).

The most expensive part is the string->symbol step, because it involves an O(n) traversal of the symbol table. The worst part is that none of this can be cached or precomputed: every check of a tag throughout the compiler involves creating a brand new symbol from a string.
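
The cost difference between that linear scan and hash-based interning can be illustrated like this (plain JS for illustration; not Schism's code):

```javascript
// Linear-scan interning: every string->symbol pays an O(n) walk of the table.
const symbolTable = [];
function internLinear(name) {
  for (let i = 0; i < symbolTable.length; i++) {
    if (symbolTable[i] === name) return i;
  }
  symbolTable.push(name);
  return symbolTable.length - 1;
}

// Hash-based interning: amortized O(1) per lookup.
const symbolIndex = new Map();
function internHashed(name) {
  if (!symbolIndex.has(name)) symbolIndex.set(name, symbolIndex.size);
  return symbolIndex.get(name);
}
```

Either way, interning the same tag over and over still does work on every call, which is why replacing quoted symbols with precomputed values helps even more.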

There are a couple of ways to address this.

The proper solution is to generate a data section with all the symbol data pre-computed and then replace 'foo with the actual symbol pointer value. This is a fair amount of work though.

A shorter-term solution, which would also help validate the hypothesis, is a manual tagging system. Basically, instead of having a bunch of cond clauses like (eq? tag 'if), we'd replace those with (eq? tag (if-tag)), where if-tag is defined somewhere like (define (if-tag) 0), and we have similar definitions like (define (call-tag) 1), and so on. I think this approach will be less maintainable in the long run, but it should be enough to validate the hypothesis that tag construction is taking the bulk of the time.

Actually, an even simpler option would be to track the cumulative time spent in string->symbol.
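
A sketch of that cumulative measurement on the host side (names are hypothetical, and Symbol.for stands in for the real string->symbol):

```javascript
// Wrap a runtime function so every call adds its elapsed time to a total.
let stringToSymbolTotal = 0;

function timed(fn) {
  return (...args) => {
    const t0 = performance.now();
    try {
      return fn(...args);
    } finally {
      stringToSymbolTotal += performance.now() - t0;
    }
  };
}

const stringToSymbol = timed((s) => Symbol.for(s));
// After a full compile, log the accumulated cost:
//   console.log(`[string->symbol] total: ${stringToSymbolTotal}ms`);
```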


matthewp commented on May 5, 2024

Thanks @eholk! I don't think I'm capable of implementing the proper solution at this time, but I'm happy to try the other approaches. I'll start by measuring the cumulative time spent in string->symbol and report back to you.

By the way, have you ever tried converting the compiler's wasm to wat format and inspecting it? I looked at it briefly; what I was able to see is that there are indeed a lot of calls to cons, but I guess that is to be expected for any Lisp language. Worth looking at if you have some time.


eholk commented on May 5, 2024

I've spent a little time looking at the disassembly, but usually just to fix issues with generating correct Wasm bytes. I'm not surprised about the large number of calls to cons, especially because the compiler inserts quite a few of its own.

I'd like to add a pass that does a combination of inlining, constant folding, constant propagation, and dead code elimination. I expect this will inline some of the calls to cons, although it'll probably just move the work around. In some cases, it may be able to entirely remove an allocation.


matthewp commented on May 5, 2024

@eholk Can confirm your theory, see below:

parse-library: 13714.420ms
convert-closures: 25603.515ms
functions->names: 9.427ms
functions->types: 26.791ms
functions->type-ids: 35.433ms
compile-functions: 14860.114ms
build-exports: 9.707ms
gather-imports: 11.063ms
compile-library->module-package: 54382.075ms
[string->symbol] total: 52388.65443624556ms

So the majority of time spent is in string->symbol. If you want to check my work: matthewp@e9bbfc7


eholk commented on May 5, 2024

Awesome, this data is really useful!

I think the easiest thing in the short term is to stop comparing against quoted symbols and just use integers. That'll make the compiler faster until we have proper support for static data.


matthewp commented on May 5, 2024

About that idea, I admit that I don't quite get how it works. Are you saying to replace the quoted symbols in parse-expr and compile-expr with static integers? Anywhere else?


eholk commented on May 5, 2024

Yeah, in parse-expr, compile-expr, and pretty much everywhere else.

Basically, before parse-expr (I think all that's before parse-expr is the reader, macro expander, and quote/quasiquote expander), we'd still use symbols, since we're processing human-readable input. After parse-expr, the idea is that the program is converted to a form that's easier for the compiler to handle.

So for example, take the number line in parse-expr. This would get rewritten from:

((number? expr) `(number ,expr))

to:

((number? expr) `(,(number-tag) ,expr))

number-tag would be a function that just returns a number that's different from all the other tags, so something like:

(define (number-tag) 42)

Then anywhere we compare against 'number, we'd instead compare against (number-tag). Take line 807 for example, in apply-representation-expr:

((eq? tag 'number) `(ptr ,(bitwise-arithmetic-shift-left (cadr expr) (tag-size))))

This would get rewritten as:

((eq? tag (number-tag)) `(,(ptr-tag) ,(bitwise-arithmetic-shift-left (cadr expr) (tag-size))))

Does that make more sense? It's unfortunately a pretty big change, but it's mostly find-and-replace.


matthewp commented on May 5, 2024

Yeah, that does make sense. I think I was more confused about where not to make the change because, as you said, before parse-expr it's human-readable code.

I think I can try my hand at this. My only other question is: will there be a problem with making all of the changes at once? When adding functions you need to write the function before actually using it in the compiler (iirc, I tried this once), but inside the parser I'm not sure... I can't figure out if I need to work on one function first and then do the others or not.


eholk commented on May 5, 2024

If you want to try doing this, that would be awesome!

Making the changes all at once shouldn't be a problem. In fact, you pretty much have to, or else you'd have to add a step to convert between the new representation and the old one for the places that haven't been updated. I wish it were possible to do this a little more piecemeal.

You shouldn't have to make any changes to the primitives list, and this shouldn't change the syntax the compiler supports, so staging shouldn't be a problem. Basically, any functions you add as top-level functions in compiler.ss are immediately available to the compiler. With primitives (i.e. these functions), you have to wait for a new snapshot to be able to use those in the compiler.

