Git Product home page Git Product logo

essentials-of-compilation's People

Contributors

akuhlens avatar andportnoy avatar arsdragonfly avatar arthurgleckler avatar capfredf avatar chuanwen avatar darshalshetty avatar destinyson7 avatar efanzh avatar ivanthetricourne avatar jbclements avatar jsiek avatar ksromanov avatar ncihnegn avatar noahstorym avatar onelharrison avatar peterthiemann avatar proglang avatar rrnewton avatar ryanglscott avatar stefanos82 avatar temyurchenko avatar txyyss avatar vikraman avatar vollmerm avatar vu3rdd avatar waynee95 avatar xc42 avatar zenspider avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

essentials-of-compilation's Issues

Section 4.8, `label->block` rather than CFG?

Section 4.8 uses the term "control flow graph", when really what we are building is a map from labels to basic blocks, which makes it confusing. Furthermore, the actual CFG is built in the register allocation section 4.10

Wouldn't label->block be a better phrasing here?

Missing documentation on running passes in chapter 5

It would be nice to include information on running the passes in chapter 5. For example, I had to figure out the way to run expose allocation is to use

("expose allocation" ,expose-allocation ,interp-Rvec-prime ,type-check-Rvec)

by looking around the codebase.

Inconsistent variable names across passes for running example in chapter 5

The variable names in different passes for the running example are different in chapter 5 (e.g. what is called collectret7974 in output of the expose allocation pass is called collectret9023 in the output of the select instruction pass). This makes it hard to follow what is going on, using consistent variable names would be better.

Document grammar "side conditions"

The grammar figures in the book don't currently explicitly mention side conditions like having a start label, or the characters allowable in identifiers (or even what is allowed in the "info" field). It would be good if each figure was a self-contained and detailed description of the grammar, including some of these extra constraints and definitions.

Backticks don't look like grave accent

Latex translates ` and ' into curved open and close single quotes. It took me a while to figure out in the code listings that the open single quotes were backticks. While I'm not sure if it covers all the cases, adding these two lines to the included packages in book.tex should make most of the backticks look like backticks, and apostrophes like apostrophes:
\usepackage{upquote}
\usepackage[straightquotes]{newtxtt}

Book seems to assume 8-byte aligned functions.

This isn't going to hold on all compilers, OS's, architectures (or even most).

Students ran into this with some difficult debugging:

Solution: just shift the darn procedure pointers when doing R7.

Suggestion for readable `make clean` command

Currently we have the following command

clean:
	$(LATEXMK) -C book
	rm -f book.log book.aux book.bbl book.lof book.out book.toc book.blg book.pdf book.ilg book.ind book.lot book.run.xml book.bcf book.fls book.fdb_latexmk authors.idx authors.ilg authors.ind subject.idx subject.ilg subject.ind

which is massive.

Is it OK with you if we could reduce it to

clean:
	$(LATEXMK) -C book
	$(RM) \
		*.idx *.ilg *.ind *.aux *.bbl *.bcf \
		*.blg *.fdb_latexmk *.fls *.lof *.log \
		*.lot *.out *.run.xml *.toc book.pdf

which looks cleaner and much more tight?

Is the `R_{Var}^{ANF}` language defined in Figure 2.12 too broad?

Consider the following example in R_{Var}

(+ 1 (+ 2 (+ 3 4)))

From the description of rco-exp and rco-atom, it appears that the book wanted the following translation to R_{Var}^{ANF}:

;; Code 1
(let ([tmp.1 (+ 3 4)])
  (let ([tmp.2 (+ 2 tmp.1)])
    (+ 1 tmp.2)))

However, there is another way to do this:

;; Code 2
(let ([tmp.1
       (let ([tmp.2 (+ 3 4)])
         (+ 2 tmp.2))])
  (+ 1 tmp.1))

which I believe is also a valid R_{Var}^{ANF} program. But I feel Code 2 is harder to evaluate, which would need a stack, while Code 1 can be evaluated without a stack.

Should we restrict the R_{Var}^{ANF} such that Code 1 is valid but Code 2 is invalid, for example, by not allowing e in (Let x e body) to contain any Let expression?

Figure 6.4 does not show callee save registers

This makes it mysterious that the compiler will produce rbp-offsets like "-40" for the first local variable, when the student is expecting a smaller offset (not taking callee-saves into account).

fix leaq in interpreter

after register allocation, but before patching, leaq should handle Deref in the target, but currently does not.

How to build the Racket version

make command defaults to Python on my system which currently misses chapters 9 to 12.

How do I tell it to build the Racket version which is complete?

issues with rbp/rsp in chapter 6

Some questions came up last week in class about chapter 6 and figure 6.4. The figure shows "rsp" as storing the base address for the local m in the callee, but it says rbp for local 1. That seems like a discrepancy.

In the functions.rkt reference implementation, there are comments about the stack-arg form possibly being obsolete. The stack-arg references relative to rsp, whereas the code generated in the preamble of a function def references rbp.

Compilation error.

Hello.

When compiling the head of master as of 2021-05-11 on an XUbuntu 20.04 x86_64 LTS machine with a texlive-full installed, I get:

! Use of \\titlecap doesn't match its definition.
\update@series@target@value #1->\def \reserved@a {
                                                  #1}\ifx \target@meta@famil...
l.402 ...tract Syntax Trees and Racket Structures}

Thank you!

Typo in the description of assign-homes compiler pass

Regarding the typo "...we assignment of each..." in the description of the assign-homes compiler pass shown below.

\item[Pass \key{assign-homes}] To handle the difference between the
variables in $R_1$ versus the registers and stack locations in x86,
we assignment of each variable to a register or stack location.

I wasn't sure what phrasing might be best here, but I was thinking "...we map/translate the assignment of each variable..."

Book is self-contradictory about whether (and e1 e2) short-circuits

In one part of Chapter 3, it states:

The logical operations not and and behave as you might expect, but note that the and operation is short-circuiting. That is, the second expression e2 is not evaluated if e1 evaluates to #f.

But in another part of Chapter 3:

Recall that the and operator of C1 does not perform short circuiting, but evaluates both arguments unconditionally.

Which is right?

Confusion about Sec. 5.3.4 (Register Allocation in R3)

Hi,
I'm a bit confused about why the second requirement is there for the register allocator("if a vector-typed variable is live during a call to the collector, it must be spilled to ensure it is visible to the collector"). Doesn't requirement 1 already ensure that vector-typed variables are always spilled to the root stack?

confusing example for `rco-atom` in section 2.5

I'm following the book, and in section 2.5, Remove Complex Operands, the example for rco-atom is to transform (- 10) to tmp.1:

image

I was confused as (- 10) is already atomic (it's a (Prim - (Int 10))). Then I went to check the lecture notes, there the example is actually:

(+ (+ 42 10) (- 10))
=>
(let ([tmp.1 (+ 42 10)])
  (let ([tmp.2 (- 10)])
    (+ tmp.1 tmp.2)))

which makes sense as (- 10) is the operand here hence needs to be transformed to atomic.

Perhaps we can use the same example in the book text?

Struct-based syntax?

Why do we use S-expressions for syntax? It has a certain convenience, but the students pay for it in any number of small ways. How many times do they end up with "extra parentheses", because non-type-safe operations are applied to expressions in list form?

If anything, struct-based pattern matching is actually cleaner in racket than quasiquoted list patterns:

#lang racket
(struct foo (a b))
(match (foo 1 2)
  [(foo a b) a])

Struct based ASTs would be a major change to the book, so it shouldn't be undertaken lightly. As a stop-gap I have recommended that students make more defensive helper functions like "append two lists of instructions" that use a contract to (deeply) check the validity of their inputs.
Presently, list-based S-expressions combine with the slightly underspecified grammars (#29) to create deviations between the student compilers and the textbook (more importantly, the expectations of the testing-framework / interpreters).

Disadvantages / Alternatives

Aside from the cost of the change itself:

  • Reading input programs from SExpressions would require a basic "parser" pass that convers sexp->structs. This is probably a good practice anyway, like the old "parse-scheme" pass.

Erroneous patch instructions

In section 4.11 patch instructions it is suggested to use a temporary mov into rax when the second argument to a cmpq is an immediate. However, it is possible that rax is live at this point and this mov will overwrite the value in rax.

One possible solution is to do this temporary mov during the select instructions pass, so that the liveness analysis pass (which happens after select instructions, but before patch instructions) takes care of such situations.

A possible error in Chapter 2.7

The original text said

(movq (int 10) (var tmp.1))
(negq (var tmp.1))
(movq (var tmp.1) (var tmp.2))
(addq (int 52) (var tmp.2))
(movq (var tmp.2) (reg rax)))

The variable tmp.1 is assigned to stack location -8(%rbp), and tmp.2 is
assign to -16(%rbp), so the assign-homes pass translates the above to

(movq (int 10) (deref rbp -16))
(negq (deref rbp -16))
(movq (deref rbp -16) (deref rbp -8))
(addq (int 52) (deref rbp -8))
(movq (deref rbp -8) (reg rax)))

But shouldn't it be

(movq (int 10) (deref rbp -8))
(negq (deref rbp -8))
(movq (deref rbp -8) (deref rbp -16))
(addq (int 52) (deref rbp -16))
(movq (deref rbp -16) (reg rax)))

?

Is the grammar in exercise 1 correct?

Pr #26 added an int alternation to the exp non-terminal, I don’t think it is correct.

I think I have a solution that doesn’t require the int alternation in the exp non-terminal.

Click here to view my solution. (I am not sure that whether I should put my solution here. But how should I explain myself?)

The idea is that we flatten the expression tree into an addition chain. Then partition the terms in the chain into two groups:

  • The group that only contains ints.
  • The group that only contains (read)s and (- (read))s.

Each group could be empty. Then we compute the sum of the first group, and add the second group to the result.

The exp non-terminal corresponds to the second group in my solution, which doesn’t need int alternation since they are all in the first group.

Is there something I missed? Is the int alternation really necessary?

Eliminate dotted pairs from the grammars

This causes additional confusion with the students, and it really has no benefit.

IMHO dotted pairs should only be used if the cdr is non-list data (and really not even then because there's always a better data structure). One major problem is that we currently lose read-write read/write invariance for ASTs:

> '((start . (seq _ (return _))))
'((start seq _ (return _)))

Aside from being ugly, "(start seq _" requires the students understand in detail the source, memory, and printed representations of cons-lists, and distinctions between them.

Eliminating dotted pairs would be a relatively small change (in contrast w the larger overhaul proposed by #30) would be to just eliminate dotted pairs and use proper lists.

Cannot compile it due to an error with biber

I am on Debian testing 64-bit and got the following error while running make:

Running 'makeindex  -o "book.ind" "book.idx"'
------------
This is makeindex, version 2.15 [TeX Live 2022/dev] (kpathsea + Thai support).
Scanning input file book.idx...done (0 entries accepted, 0 rejected).
Nothing written in book.ind.
Transcript written in book.ilg.
Latexmk: applying rule 'biber book'...
Rule 'biber book': File changes, etc:
   Changed files, or newly in use since previous run(s):
      'book.bcf'
------------
Run number 1 of rule 'biber book'
------------
------------
Running 'biber  "book.bcf"'
------------
DateTime.c: loadable library and perl binaries are mismatched (got handshake key 0xce00080, needed 0xed00080)
Latexmk: Summary of warnings from last run of *latex:
  Latex failed to resolve 324 reference(s)
  Latex failed to resolve 149 citation(s)
Latexmk: Errors, so I did not complete making targets
Collected error summary (may duplicate other messages):
  biber book: Could not open biber log file for 'book'
Latexmk: Use the -f option to force complete processing,
 unless error was exceeding maximum runs, or warnings treated as errors.
make: *** [Makefile:7: all] Error 12

biber version is 2.16-1 and texlive version is 2021.20210921-1.

Which versions do I have to have installed in order to compile the book successfully?

P.S.: Three generated files are not included in .gitignore thus polluting the locally cloned repository:

$ git status
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	book.fdb_latexmk
	book.fls
	book.pdf

nothing added to commit but untracked files present (use "git add" to track)

Assembly in figure 5.15 needs refresh

It is out of date relative to what the compiler produces. And it has some elements that are confusing people.

E.g. where did movq %rbx, 0(%r15) come from?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.