jserv / shecc

A self-hosting and educational C optimizing compiler

License: BSD 2-Clause "Simplified" License

Makefile 1.37% C 92.71% Shell 5.92%

arm armv7 compiler self-hosting c linux risc-v rv32i rv32im riscv qemu cross-compiler elf compiler-optimization

shecc's Issues

Unable to bootstrap due to corrupted libc.inc

In the latest commit (cb34939), running make distclean config ARCH=arm and then make to bootstrap makes the process panic about a corrupted libc.inc generated by tools/inliner.c. The generated corrupted libc.inc can be seen here.

Here's a sneak peek of the corrupted libc.inc. Notice that the corruption is caused by unmatched double quotation marks in string literals:

/* Created by tools/inliner - DO NOT EDIT. */
void __c(char *src) {
    for (int i = 0; src[i]; i++)
        SOURCE[source_idx++] = src[i];
}
void libc_generate() {
  __c("/*
\n");
  __c(" * shecc - Self-Hosting and Educational C Compiler.
\n");
  __c(" *
\n");
  __c(" * shecc is freely redistributable under the BSD 2 clause license. See the
\n");
  __c(" * file \"LICENSE\" for information on usage and redistribution of this file.
\n");
  __c(" */
\n");
  __c("
\n");

I can reproduce this result only on my current laptop's WSL2, but not on my previous desktop's WSL2.
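
The broken literals above end right before the `\n` escape, which is what a stray carriage return (CRLF line endings, plausible on a Windows-hosted checkout) would look like if it were copied into the string unescaped. That cause is an assumption; the report does not confirm it. Under that assumption, a minimal sketch of an escaping routine that drops line terminators could look like this. `escape_line` is a hypothetical helper, not the actual code in tools/inliner.c:

```c
#include <stddef.h>

/* Hypothetical helper (not from tools/inliner.c): copy one source line into
 * dst as the body of a C string literal. Carriage returns and newlines are
 * dropped; double quotes and backslashes are escaped.
 * Returns the number of bytes written, excluding the terminating NUL. */
size_t escape_line(const char *src, char *dst, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return 0;
    /* n + 2 < cap leaves room for an escape pair plus the NUL */
    for (size_t i = 0; src[i] && n + 2 < cap; i++) {
        char c = src[i];
        if (c == '\r' || c == '\n')
            continue; /* line terminators never reach the literal */
        if (c == '"' || c == '\\')
            dst[n++] = '\\';
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```

The caller would then emit the escaped text followed by an explicit `\n` escape, so the generated literal stays on one line regardless of the input's line endings.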

Support Arm/RISC-V targets without integer division instructions

Some Arm targets such as Cortex-A9 might lack integer division instructions, and the current shecc would not generate the correct instruction sequence for them. Consequently, we have to provide a fallback by means of software emulation. Here is a reference implementation for integer division:

__div:
        push  {fp, lr}
        add   fp, sp, #4
        // check for divide by zero
        cmp   r1, #0
        beq   __div_end
        push  {r0, r1}
        // variables
        mov   r0, #0        // quotient
        pop   {r1, r2}       // dividend / remainder, divisor
        mov   r3, #1        // bit field
__div_shift:
        // shift divisor left until it exceeds dividend
        // the bit field will be shifted by one less
        cmp   r2, r1
        lslls r2, r2, #1
        lslls r3, r3, #1
        bls   __div_shift
__div_sub:
        // cmp sets the carry flag if r1 - r2 is positive, which is weird
        cmp   r1, r2
        // subtract divisor from the remainder if it was smaller
        // this also sets the carry flag since the result is positive
        subcs r1, r1, r2
        // add bit field to the quotient if the divisor was smaller
        addcs r0, r0, r3
        // shift bit field right, setting the carry flag if it underflows
        lsrs  r3, r3, #1
        // shift divisor right if bit field has not underflowed
        lsrcc r2, r2, #1
        // loop if bit field has not underflowed
        bcc   __div_sub
__div_end:
        sub   sp, fp, #4
        pop   {fp, pc}
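
For readers less familiar with Arm assembly, the same shift-and-subtract idea can be sketched in C. This is only an illustration for unsigned operands: `soft_udiv` is a hypothetical name, and the top-bit guard in the first loop is an addition that the assembly above does not have.

```c
#include <stdint.h>

/* Hypothetical helper, not part of shecc: unsigned shift-and-subtract
 * division. Returns the quotient and stores the remainder in *remainder.
 * Division by zero returns 0 and leaves the dividend as the remainder. */
uint32_t soft_udiv(uint32_t dividend, uint32_t divisor, uint32_t *remainder)
{
    uint32_t quotient = 0;
    uint32_t bit = 1; /* quotient bit that corresponds to the shifted divisor */

    if (divisor == 0) {
        *remainder = dividend;
        return 0;
    }
    /* Shift the divisor left until it reaches the dividend, stopping
     * early if the next shift would push a set bit out of the word. */
    while (divisor < dividend && !(divisor & 0x80000000u)) {
        divisor <<= 1;
        bit <<= 1;
    }
    /* Walk back down, subtracting wherever the shifted divisor fits. */
    while (bit) {
        if (dividend >= divisor) {
            dividend -= divisor;
            quotient |= bit;
        }
        divisor >>= 1;
        bit >>= 1;
    }
    *remainder = dividend;
    return quotient;
}
```

Each iteration of the second loop decides one quotient bit, so the loop runs at most 32 times for 32-bit operands.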

Reference:

Simple division algorithm for ARM assembler
Assembly Language - Division
Pull request #56

Can't define symbol as string

shecc currently cannot define a symbol as a string; as a result, the refactoring in #47 fails.

#define ARCH_PREDEFIND "__arm__"

The definition above is rejected; a later patch will fix it.

Simplify function global_init() with global variable initializations

This function global_init was required because the global variable initialization was not supported. However, #13 did overcome the issue, and it would be great to simplify the function global_init and the corresponding statements in file src/globals.c.

Thoughts on cfront's potential improvements

Currently, cfront uses a scannerless parser with the IR emitter bound into it, and it contains ~3000 LOC. Based on my experience contributing to industrial-grade programming languages (V Lang in this case), shecc's frontend parser is hard to debug and hard for others to contribute to, even though shecc is meant to be educational.

Here's a list of my thoughts on improving the frontend of shecc:

Discard the scannerless parser and rewrite it as a separate lexer and parser.
Introduce an abstract syntax tree (AST) for better IR emission and backend generation.
Separate the compilation phases into multiple files, which would help potential contributors learn shecc's architecture.

Accepting the suggestions above would entail the following major changes:

shecc would have more than 3 compilation phases.
The code generation logic must be rewritten on top of the introduced AST.
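
To make the first suggestion concrete, here is a minimal sketch of what a standalone lexer stage could look like. All type and function names (`token_t`, `next_token`, and so on) are hypothetical and do not come from shecc; the sketch only recognizes identifiers, decimal numbers, and single-character punctuation.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PUNCT } token_kind_t;

typedef struct {
    token_kind_t kind;
    const char *start; /* points into the source buffer, not NUL-terminated */
    size_t len;
} token_t;

/* Hypothetical standalone lexer: scan one token starting at *cursor and
 * advance *cursor past it. The parser only ever sees token_t values. */
token_t next_token(const char **cursor)
{
    const char *p = *cursor;
    token_t tok;

    while (isspace((unsigned char) *p))
        p++;

    tok.start = p;
    if (*p == '\0') {
        tok.kind = TOK_EOF;
    } else if (isalpha((unsigned char) *p) || *p == '_') {
        tok.kind = TOK_IDENT;
        while (isalnum((unsigned char) *p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char) *p)) {
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char) *p))
            p++;
    } else {
        tok.kind = TOK_PUNCT; /* single-character punctuation only */
        p++;
    }
    tok.len = (size_t) (p - tok.start);
    *cursor = p;
    return tok;
}
```

With this split, the parser can be tested against a token stream without touching character-level details, which is the debugging benefit the list above argues for.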

Unable to self compile stage 1

In commit 5f4f6d6, after running make distclean config ARCH=arm and then make to build the stage 1 compiler, the compiler reports an unexpected-token error at an implausible position:

Error Unexpected token at source location 20771
    ref_block_t *head, *tail;
                       ^ Error occurs here

Significantly increased bootstrap time in recent build

@vacantron's analysis indicates that the previous bootstrap took 10 seconds; with the introduction of #131, it increases to 40 seconds. However, if the free() function's body is emptied and its corresponding actions are omitted, the time reduces to 3.5 seconds.

Not enough register for reading expression

There's a test case:

a = 1 + (2 + (3 + (4 + (5 + (6 + (7 + (8 + (9 + (10 + (11 + (12 + (13 + (14 + (15 + (16 + (17 + (18 + (19 + 20))))))))))))))))))

shecc stores every constant in a register, so it runs out of registers here.
Modern compilers, as far as I can tell, just load an immediate, e.g.

li a4, 210

Even when I mix identifiers into the expression, a modern compiler still loads an immediate value. Do we need to rewrite the expression reader? Maybe we need to evaluate expressions before run time. Also, I think we need a symbol table to store these identifiers.
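
Evaluating the expression before run time is usually called constant folding. The following is a minimal sketch on a toy expression tree, assuming a frontend that builds such a tree; `node_t` and `fold` are hypothetical names and shecc has no such structure at the time of this issue.

```c
#include <stddef.h>

typedef enum { NODE_CONST, NODE_VAR, NODE_ADD } node_kind_t;

typedef struct node {
    node_kind_t kind;
    int value;              /* valid when kind == NODE_CONST */
    struct node *lhs, *rhs; /* valid when kind == NODE_ADD */
} node_t;

/* Fold constant subtrees in place, bottom-up. An addition whose operands
 * are both constants is rewritten into a single constant node. */
node_t *fold(node_t *n)
{
    if (n->kind != NODE_ADD)
        return n;
    n->lhs = fold(n->lhs);
    n->rhs = fold(n->rhs);
    if (n->lhs->kind == NODE_CONST && n->rhs->kind == NODE_CONST) {
        n->value = n->lhs->value + n->rhs->value;
        n->kind = NODE_CONST;
        n->lhs = n->rhs = NULL;
    }
    return n;
}
```

After folding, a fully constant expression needs one register and one immediate load, no matter how deeply it was nested. A subtree that contains a variable is left as an addition, but its constant operands are still folded.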

Validate the prerequisites when "make config" is executed

The build target "config" should not only specify the target for code generation (Arm by default) but also check the availability of prerequisites such as qemu-arm and qemu-riscv32.

Memory for shecc is not enough again

Due to the growing code size of shecc, the limits defined in defs.h are no longer sufficient.

Improve intermediate representation and also register allocation

Description

The current register allocator uses the strategy of the stack-based virtual machine and is connected to the generation of the intermediate representation. These might result in the inefficient utilization of the available registers and make the optimization harder.

To improve these, we could decouple the IR generation and the allocator, and introduce the new algorithm (e.g. linear-scan, graph coloring, etc.) to support the register-based virtual machine model.
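
As an illustration of the linear-scan option, here is a simplified sketch that works on precomputed live intervals. It assumes the IR has already been decoupled from allocation, so that each virtual register has a known start and end point; `interval_t`, `linear_scan`, and the tiny `NUM_REGS` are hypothetical choices for the example, not shecc code.

```c
#include <stdlib.h>

#define NUM_REGS 2 /* deliberately tiny so that spilling is exercised */

typedef struct {
    int start, end; /* live range, both ends inclusive */
    int reg;        /* assigned register index, or -1 if spilled */
} interval_t;

static int by_start(const void *a, const void *b)
{
    const interval_t *x = a, *y = b;
    return (x->start > y->start) - (x->start < y->start);
}

/* Simplified linear scan: sorts iv[] by start point, then walks it once. */
void linear_scan(interval_t *iv, int n)
{
    interval_t *owner[NUM_REGS] = {NULL};

    qsort(iv, (size_t) n, sizeof(iv[0]), by_start);
    for (int i = 0; i < n; i++) {
        interval_t *cur = &iv[i];
        int free_reg = -1;

        for (int r = 0; r < NUM_REGS; r++) {
            /* expire intervals that ended before cur starts */
            if (owner[r] && owner[r]->end < cur->start)
                owner[r] = NULL;
            if (!owner[r] && free_reg < 0)
                free_reg = r;
        }
        if (free_reg >= 0) {
            cur->reg = free_reg;
            owner[free_reg] = cur;
            continue;
        }
        /* no free register: spill whichever interval ends last */
        int victim = 0;
        for (int r = 1; r < NUM_REGS; r++)
            if (owner[r]->end > owner[victim]->end)
                victim = r;
        if (owner[victim]->end > cur->end) {
            owner[victim]->reg = -1;
            cur->reg = victim;
            owner[victim] = cur;
        } else {
            cur->reg = -1;
        }
    }
}
```

The spill heuristic (evict the interval that ends furthest in the future) is the common choice for linear scan; a graph-coloring allocator would replace this walk with interference-graph construction and coloring.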

High branch-miss rate when hosting shecc on the Raspberry Pi 3B

The built executable (shecc itself at 2b5d7b6) has a much higher branch-miss rate (19.91%) than the GCC-built one (5.74%) when running on the Raspberry Pi 3B. The branch mispredictions might slow down execution.

How to reproduce

Build 2b5d7b6

$ make config
$ make out/shecc
$ out/shecc -o out/shecc-stage1.elf ../shecc-2b5d7b6/src/main.c
$ chmod +x out/shecc-stage1.elf

Measure with perf

$ perf stat -r 5 -e branches,branch-misses out/shecc-stage1.elf tests/fib.c

Get the statistics

 Performance counter stats for 'out/shecc-stage1.elf tests/fib.c' (5 runs):

         2,308,457      branches:u                #   33.270 M/sec                    ( +-  0.00% )
           459,550      branch-misses:u           #   19.91% of all branches          ( +-  0.02% )

         0.078757 +- 0.000616 seconds time elapsed  ( +-  0.78% )

Why are there so many spaces in the first line of arm.mk and riscv.mk

In arm.mk:

ARM_EXEC = qemu-arm

And in riscv.mk:

RISCV_EXEC = qemu-riscv32

Is there any teaching materials available with shecc

Hi @jserv

I'd like to use shecc in my personal compiler MOOC.
I am using as textbook, and RISC-V as target backend.
Are there any teaching materials available with shecc now?
If not, is there a way to contribute?

Enhance the implementation of division emulation in the Arm backend

Currently, the division emulation in the Arm backend has two potential issues to be resolved:

Enhance code reusability:
The div and mod emulation sequences differ in only two instructions, so they can be refactored to share the same code.

    /* Obtain absolute values of dividend and divisor */
    emit(__srl_amt(__AL, 0, arith_rs, __r8, rn, 31));
    emit(__add_r(__AL, rn, rn, __r8));
    emit(__eor_r(__AL, rn, rn, __r8));
    emit(__srl_amt(__AL, 0, arith_rs, __r9, rm, 31));
    emit(__add_r(__AL, rm, rm, __r9));
    emit(__eor_r(__AL, rm, rm, __r9));
-   emit(__eor_r(__AL, __r10, __r8, __r9));
+   emit(__mov_r(__AL, __r10, __r8));
    /* ... */
-   emit(__mov_r(__AL, rd, __r9));
-   /* Handle the correct sign for quotient */
+   emit(__mov_r(__AL, rd, rn));
+   /* Handle the correct sign for remainder */
    emit(__cmp_i(__AL, __r10, 0));
    emit(__rsb_i(__NE, rd, 0, rd));

The current implementation has another potential issue: the dividend and divisor are lost after the emulation.

Looking at the implementation, rn and rm, which store the dividend and divisor, are shifted during the emulation, so their original values are lost at the end. However, an executable compiled by shecc may contain the following instruction sequence:
- with +m option (use hardware div/mod instruction)
```
.....
.....
div r2, r0, r1 
ldr r3, r0           ; Load a word by r0 (the dividend).
.....
```
- without +m option (use div/mod emulation)
```
....
asr     r8, r0, #31
add     r8, r8, r0
eor     r8, r8, r0
.....
.....
cmp     sl, #0
rsbne   r2, r2, #0
ldr     r3, r0        ; Load a word by r0 (the dividend).
......                ; But r0 no longer holds the dividend after the emulation.
......
```
After a division, subsequent instructions may use the register that stored the dividend (or divisor). With hardware division support this causes no problem. With emulation, however, those instructions may fail because rn and rm no longer hold meaningful values.

Therefore, the implementation should be improved to restore the dividend and divisor after the emulation.
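
Both points can be illustrated with a C sketch. It mirrors the asr/add/eor absolute-value trick from the emitted code and shows that quotient and remainder differ only in which sign mask is used for the final fix-up. The names `abs_via_mask` and `soft_divmod` are hypothetical, and the built-in `/` and `%` on unsigned values merely stand in for the emulation loop.

```c
#include <stdint.h>

/* Branch-free absolute value, mirroring the asr/add/eor sequence:
 * *mask is 0 for non-negative x and all ones for negative x. */
static uint32_t abs_via_mask(int32_t x, uint32_t *mask)
{
    *mask = (x < 0) ? 0xFFFFFFFFu : 0u;
    return ((uint32_t) x + *mask) ^ *mask;
}

/* Hypothetical combined routine: one unsigned division, two sign fix-ups.
 * The caller must ensure d != 0. n and d themselves are never modified,
 * because all work happens on the copies un and ud. */
void soft_divmod(int32_t n, int32_t d, int32_t *quot, int32_t *rem)
{
    uint32_t n_mask, d_mask;
    uint32_t un = abs_via_mask(n, &n_mask);
    uint32_t ud = abs_via_mask(d, &d_mask);
    uint32_t uq = un / ud; /* stands in for the emulation loop */
    uint32_t ur = un % ud;

    /* quotient is negative when the operand signs differ (eor of masks) */
    *quot = (n_mask ^ d_mask) ? (int32_t) (0u - uq) : (int32_t) uq;
    /* remainder takes the sign of the dividend (mask of n only) */
    *rem = n_mask ? (int32_t) (0u - ur) : (int32_t) ur;
}
```

In the emitted Arm code, the equivalent of working on copies would be to save rn and rm before the shifts and restore them afterwards, or to run the emulation on scratch registers.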

The peephole optimization breaks the macro expansion

Considering the following test case:

#define M(a, b) a + b

int main()
{
    return M(1, 2) * 3;
}

The stage-1 shecc (i.e. out/shecc-stage1.elf, testing with command: qemu-arm out/shecc-stage1.elf --no-libc <file>) reports:

Error Unexpected token at source location 20
#define M(a, b) a + b
                    ^ Error occurs here
Abnormal program termination

The current test driver runs on the stage-0 shecc and doesn't detect this error.
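
For reference, this is the behavior a conforming preprocessor gives for the test case: the macro is substituted textually, so the expression becomes `1 + 2 * 3`. The wrapper function name `macro_result` is only there for illustration.

```c
#define M(a, b) a + b

/* Textual substitution yields 1 + 2 * 3, so '*' binds before '+'. */
int macro_result(void)
{
    return M(1, 2) * 3;
}
```

A regression test that runs on the stage-1 compiler could check for the exit value 7 rather than 9, which would also guard against an implementation that wrongly parenthesizes the expansion.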

Eliminate compilation warnings

There are several compilation warnings during the generation of the cross-compiler, as partially shown below.

src/ssa.c: In function ‘bb_build_df’:
src/ssa.c:231:24: warning: unused parameter ‘fn’ [-Wunused-parameter]
  231 | void bb_build_df(fn_t *fn, basic_block_t *bb)
      |                  ~~~~~~^~
...
src/ssa.c:771:36: warning: format ‘%p’ expects argument of type ‘void *’, but argument 3 has type ‘basic_block_t *’ {aka ‘struct basic_block *’} [-Wformat=]
  771 |     fprintf(fd, "subgraph cluster_%p {\n", bb);
      |                                   ~^       ~~
      |                                    |       |
      |                                    void *  basic_block_t * {aka struct basic_block *}
...
src/ssa.c:1056:35: warning: unused parameter ‘fn’ [-Wunused-parameter]
 1056 | void bb_reset_live_kill_idx(fn_t *fn, basic_block_t *bb)
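
Both kinds of warning have conventional fixes, sketched below. The struct definitions are hypothetical stand-ins reduced to a single field, and the function names are invented for the example; only the `(void) fn;` and `(void *)` idioms are the point.

```c
#include <stdio.h>

/* Hypothetical stand-ins for shecc's types, reduced to what the sketch needs. */
typedef struct basic_block { int id; } basic_block_t;
typedef struct fn { int id; } fn_t;

/* -Wunused-parameter: mark the parameter as intentionally unused. */
int bb_visit(fn_t *fn, basic_block_t *bb)
{
    (void) fn;
    return bb->id;
}

/* -Wformat: %p expects void *, so cast the pointer explicitly. */
int bb_print_cluster(FILE *fd, basic_block_t *bb)
{
    return fprintf(fd, "subgraph cluster_%p {\n", (void *) bb);
}
```

Whether an unused parameter should be silenced or removed depends on whether the function must match a common callback signature, which the warning output alone does not tell us.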

Integrate with semu

Port semu to the shecc subset of C
Use semu instead of qemu in bootstrapping

Uninitialized variable: pred

Reported by Cppcheck:

rc/ssa.c:180:33: warning: Uninitialized variable: pred [uninitvar]
                if (bb->idom != pred) {
                                ^
src/ssa.c:163:31: note: Assuming condition is false
                for (i = 0; i < MAX_BB_PRED; i++) {
                              ^
src/ssa.c:180:33: note: Uninitialized variable: pred
                if (bb->idom != pred) {
                                ^

Doesn't support break in while loop

It is not valid now:

while (1) {
    break;
}

Handle non-zero integers in if statements

In C, non-zero integers are considered truthy, implying that they will be evaluated as true in if statements and similar constructs. However, shecc does not accurately account for this behavior.

Example:

--- a/src/cfront.c
+++ b/src/cfront.c
@@ -1968,7 +1968,7 @@ int eval_expression_imm(opcode_t op, int op1, int op2)
         /* TODO: provide arithmetic & operation instead of '&=' */
         /* TODO: do optimization for local expression */
         tmp &= (tmp - 1);
-        if ((op2 != 0) && (tmp == 0)) {
+        if (op2 && (tmp == 0)) {
             res = op1;
             res &= (op2 - 1);
         } else

The alteration mentioned above results in an incomplete bootstrap.

Support macros defined in <stdbool.h>

C99 introduces a built-in type called _Bool for handling Boolean values. Additionally, the standard header <stdbool.h> defines macros bool, true, and false to facilitate boolean operations.

As per C99's specifications, except for bit-fields, every object, including a bool, occupies at least one byte of memory, because objects consist of contiguous byte sequences whose number, order, and encoding are either explicitly specified or implementation-defined. C99 does not fix the size of _Bool beyond that: its rank is lower than that of every other standard integer type, but its exact size is implementation-defined. In practice, common ABIs use one byte, the same size as a char, which would also be a reasonable choice for shecc.

Incorporating the bool type into shecc would enhance the expressiveness of source code and improve memory usage when expressions solely need to represent binary relationships.

C99 6.2.6.1 General

Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined.

C99 6.3.1.1 Boolean, characters, and integers

The rank of _Bool shall be less than the rank of all other standard integer types.
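
The most visible semantic difference between bool and a plain integer is the conversion rule in C99 6.3.1.2: any scalar value that compares unequal to 0 converts to 1. A short sketch, using a hypothetical helper name:

```c
#include <stdbool.h>

/* Conversion to bool collapses any non-zero value to 1 (C99 6.3.1.2). */
int to_bool_value(int x)
{
    bool b = x;
    return b;
}
```

This is why implementing bool as a plain alias for char would not be sufficient: storing 256 into an 8-bit char would typically wrap to 0, while storing it into a bool must give 1.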

Fail to pass stage1

Commit 5979cb8 introduces preliminary support of C99 _Bool. However, it fails to pass stage1:

$ make 
  SHECC	out/shecc-stage1.elf
Aborted (core dumped)
make: *** [Makefile:80: out/shecc-stage1.elf] Error 134

Tested on Ubuntu 20.04.6 LTS.

Fail to self-host

The stage-1 and stage-2 ELF files are not identical. Check them with the following diff command:

$ diff <(arm-linux-gnueabihf-objdump -d out/shecc-stage1.elf) <(arm-linux-gnueabihf-objdump -d out/shecc-stage2.elf)

output:

35856,35861c35856,35861
<    33074:     e59d5000        ldr     r5, [sp]
<    33078:     e58d50bc        str     r5, [sp, #188]  ; 0xbc
<    3307c:     e59d6000        ldr     r6, [sp]
<    33080:     e58d60c0        str     r6, [sp, #192]  ; 0xc0
<    33084:     e59d7000        ldr     r7, [sp]
<    33088:     e58d70c4        str     r7, [sp, #196]  ; 0xc4
---
>    33074:     e59d6000        ldr     r6, [sp]
>    33078:     e58d60bc        str     r6, [sp, #188]  ; 0xbc
>    3307c:     e59d7000        ldr     r7, [sp]
>    33080:     e58d70c0        str     r7, [sp, #192]  ; 0xc0
>    33084:     e59d2000        ldr     r2, [sp]
>    33088:     e58d20c4        str     r2, [sp, #196]  ; 0xc4
36201,36206c36201,36206
<    335d8:     e59d5000        ldr     r5, [sp]
<    335dc:     e58d50bc        str     r5, [sp, #188]  ; 0xbc
<    335e0:     e59d6000        ldr     r6, [sp]
<    335e4:     e58d60c0        str     r6, [sp, #192]  ; 0xc0
<    335e8:     e59d7000        ldr     r7, [sp]
<    335ec:     e58d70c4        str     r7, [sp, #196]  ; 0xc4
---
>    335d8:     e59d6000        ldr     r6, [sp]
>    335dc:     e58d60bc        str     r6, [sp, #188]  ; 0xbc
>    335e0:     e59d7000        ldr     r7, [sp]
>    335e4:     e58d70c0        str     r7, [sp, #192]  ; 0xc0
>    335e8:     e59d2000        ldr     r2, [sp]
>    335ec:     e58d20c4        str     r2, [sp, #196]  ; 0xc4
36463,36468c36463,36468
<    339f0:     e59d5000        ldr     r5, [sp]
<    339f4:     e58d50bc        str     r5, [sp, #188]  ; 0xbc
<    339f8:     e59d6000        ldr     r6, [sp]
<    339fc:     e58d60c0        str     r6, [sp, #192]  ; 0xc0
<    33a00:     e59d7000        ldr     r7, [sp]
<    33a04:     e58d70c4        str     r7, [sp, #196]  ; 0xc4
---
>    339f0:     e59d6000        ldr     r6, [sp]
>    339f4:     e58d60bc        str     r6, [sp, #188]  ; 0xbc
>    339f8:     e59d7000        ldr     r7, [sp]
>    339fc:     e58d70c0        str     r7, [sp, #192]  ; 0xc0
>    33a00:     e59d2000        ldr     r2, [sp]
>    33a04:     e58d20c4        str     r2, [sp, #196]  ; 0xc4

need to optimize symbol lookup?

I used uftrace to analyze the flow and performance of shecc and found that strcmp() accounts for the largest share of shecc's run time:
$ uftrace record out/shecc src/main.c # check stage0 for now, since stage1 didn't support something like gcc -pg
$ uftrace report
Total time Self time Calls Function
========== ========== ========== ====================
166.474 ms 0.776 us 1 main
...
41.193 ms 41.193 ms 1008573 strcmp //<- for stage1, this would be even higher since strcmp() is naive

This is not surprising, since strcmp() is used heavily by find_func() and other similar functions that perform linear searches. Do you think we need some kind of dictionary to optimize this? (I can add it. :-) )

After all, a hash table or trie might not be too complicated, so the educational purpose would be preserved. Or is this intentionally left for students to add?
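
To give a sense of the size of such a change, here is a minimal sketch of a fixed-size hash table with linear probing. Everything in it (`symtab_t`, `symtab_put`, `symtab_get`, the table size, and the choice of the FNV-1a hash) is an assumption made for illustration and is not taken from shecc.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64 /* power of two, so the index can be masked */

typedef struct {
    const char *name; /* NULL marks an empty slot; the string is not copied */
    int value;
} entry_t;

typedef struct {
    entry_t slots[TABLE_SIZE]; /* must start zero-initialized */
} symtab_t;

/* FNV-1a, 32-bit variant */
static uint32_t hash_name(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char) *s++;
        h *= 16777619u;
    }
    return h;
}

/* Insert or update; returns 0 on success, -1 if the table is full. */
int symtab_put(symtab_t *t, const char *name, int value)
{
    uint32_t i = hash_name(name) & (TABLE_SIZE - 1);
    for (int probes = 0; probes < TABLE_SIZE; probes++) {
        entry_t *e = &t->slots[i];
        if (!e->name || strcmp(e->name, name) == 0) {
            e->name = name;
            e->value = value;
            return 0;
        }
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return -1;
}

/* Returns the entry for name, or NULL if it is absent. */
entry_t *symtab_get(symtab_t *t, const char *name)
{
    uint32_t i = hash_name(name) & (TABLE_SIZE - 1);
    for (int probes = 0; probes < TABLE_SIZE; probes++) {
        entry_t *e = &t->slots[i];
        if (!e->name)
            return NULL;
        if (strcmp(e->name, name) == 0)
            return e;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return NULL;
}
```

With a reasonably low load factor, a lookup calls strcmp() only a handful of times instead of once per defined symbol. The sketch has no deletion and no resizing, both of which a real symbol table with scopes would need to address.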

Unable to bootstrap when "init_val != 0" is reduced into "init_val"

Reproducible with the following change:

--- a/src/codegen.c
+++ b/src/codegen.c
@@ -77,7 +77,7 @@ void size_funcs(int data_start)
                        data_start + elf_data_idx);
         /* TODO: add .bss section */
         if (strcmp(blk->locals[i].type_name, "int") == 0 &&
-            blk->locals[i].init_val != 0)
+            blk->locals[i].init_val)
             elf_write_data_int(blk->locals[i].init_val);
         else
             elf_data_idx += size_var(&blk->locals[i]);

Stage 2 of the bootstrapping would fail.

Questions about Op Priority

In get_operator_prio(opcode_t op) at line 836 of cfront.c,
what is the reason for putting lower-priority operators at the front and higher-priority operators at the rear?

If the goal is to reduce clock cycles,
shouldn't the most frequently used operations be placed at the front?

Implement short-circuit evaluation of `&&` operator

According to the C99 standard (6.5.13):

The && operator shall yield 1 if both of its operands compare unequal to 0; otherwise, it yields 0. The result has type int.
Unlike the bitwise binary & operator, the && operator guarantees left-to-right evaluation; there is a sequence point after the evaluation of the first operand. If the first operand compares equal to 0, the second operand is not evaluated.

The second operand should not be evaluated if the first operand evaluates to 0.
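
A small test that makes the requirement observable counts how often the right-hand operand runs. The helper names below are hypothetical:

```c
static int calls;

/* Right-hand operand with a visible side effect. */
static int bump(void)
{
    calls++;
    return 1;
}

/* Returns how many times bump() ran while evaluating `lhs && bump()`. */
int count_rhs_evaluations(int lhs)
{
    calls = 0;
    if (lhs && bump()) {
        /* nothing to do: only the side effect matters */
    }
    return calls;
}
```

With correct short-circuit evaluation the count is 0 when lhs is 0 and 1 otherwise. A compiler that evaluates both operands unconditionally returns 1 in both cases.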

Support conversion specifier “%c” inside printf

It would be good to be able to print characters at certain points, especially when debugging. It would also improve the functionality of the printf function.

@jserv @eecheng87 May I push a new commit?

RISC-V code generation

Per the request of @lazyparser, shecc would support RISC-V code generation and still self-host.

Unable to parse the specific macros

Following the discussion in pull request #140, we found that shecc cannot deal with macros containing assignment and compound assignment operators.

Here are the examples:

#define MACRO1(variable, val) \
               variable = variable + val + 10
#define MACRO2(variable, val) \
               variable += val + 10

With GCC or Clang, the above macros are expanded and the operations performed, but shecc cannot parse them correctly.

Therefore, the lexer/parser must be improved to handle macros like these examples.

Refactor architecture specific configurations

Instead of #43, I would propose a more elegant way to specify the architecture and its configurations. The WIP changes are as follows:

diff --git a/Makefile b/Makefile
index 9d3e9ca..8a31660 100644
--- a/Makefile
+++ b/Makefile
@@ -20,21 +20,22 @@ OBJS := $(SRCS:%.c=$(OUT)/%.o)
 deps := $(OBJS:%.o=%.o.d)
 TESTS := $(wildcard tests/*.c)
 TESTBINS := $(TESTS:%.c=$(OUT)/%.elf)
-TARGET_EXEC := `cat $(OUT)/target`
 
 all: config bootstrap
 
-config:
-ifeq (riscv,$(ARCH))
-	@$(VECHO) "$(RISCV_EXEC)" > $(OUT)/target
-	@$(VECHO) "#define TARGET_RISCV 1" > $@
-	@ln -s $(PWD)/$(SRCDIR)/riscv-codegen.c $(SRCDIR)/codegen.c
+# set ARM by default
+ifeq ($(strip $(ARCH)),riscv)
+ARCH = riscv
 else
-	@$(VECHO) "$(ARM_EXEC)" > $(OUT)/target
-	@$(VECHO) "#define TARGET_ARM 1" > $@
-	@ln -s $(PWD)/$(SRCDIR)/arm-codegen.c $(SRCDIR)/codegen.c
+ARCH = arm
 endif
-	@$(VECHO) "Target machine code switch to %s\n" "$$(cat out/target | sed 's/.*qemu-\([^ ]*\).*/\1/')"
+
+TARGET_EXEC = $($(shell echo $(ARCH) | tr a-z A-Z)_EXEC)
+
+config:
+	$(Q)ln -s $(PWD)/$(SRCDIR)/$(ARCH)-codegen.c $(SRCDIR)/codegen.c
+	$(call $(ARCH)-specific-defs) > $@
+	$(VECHO) "Target machine code switch to %s\n" $(ARCH)
 
 $(OUT)/tests/%.elf: tests/%.c $(OUT)/$(STAGE0)
 	$(VECHO) "  SHECC\t$@\n"
diff --git a/mk/arm.mk b/mk/arm.mk
index 0195288..5dbddd3 100644
--- a/mk/arm.mk
+++ b/mk/arm.mk
@@ -6,3 +6,10 @@ ARM_EXEC = echo WARN: unable to run
 endif
 
 export ARM_EXEC
+
+arm-specific-defs = \
+    $(Q)$(PRINTF) \
+" \#define ARCH_PREDEFIND \"__arm__\" /* defined by GNU C and RealView */\n\
+  \#define ELF_MACHINE 0x28 /* up to ARMv7/Aarch32 */\n \
+  \#define ELF_FLAGS 0x5000200\n \
+"
diff --git a/mk/riscv.mk b/mk/riscv.mk
index 19fe194..7c0fb62 100644
--- a/mk/riscv.mk
+++ b/mk/riscv.mk
@@ -5,4 +5,11 @@ $(warning "no qemu-riscv32 found. Please check package installation")
 RISCV_EXEC = echo WARN: unable to run
 endif
 
-export RISCV_EXEC
\ No newline at end of file
+export RISCV_EXEC
+
+riscv-specific-defs = \
+    $(Q)$(PRINTF) \
+" \#define ARCH_PREDEFIND \"__riscv\" /* Older versions of the GCC toolchain defined __riscv__ */\n\
+  \#define ELF_MACHINE 0xf3\n\
+  \#define ELF_FLAGS 0\n\
+"
diff --git a/src/cfront.c b/src/cfront.c
index af5f8d6..6da5206 100644
--- a/src/cfront.c
+++ b/src/cfront.c
@@ -2145,15 +2145,8 @@ void parse_internal()
     add_block(NULL, NULL);    /* global block */
     elf_add_symbol("", 0, 0); /* undef symbol */
 
-/* architecture defines */
-/* FIXME: use #ifdef ... #else ... #endif */
-#ifdef TARGET_ARM
-    add_alias("__arm__", "1"); /* defined by GNU C and RealView */
-#endif
-#ifdef TARGET_RISCV
-    /* Older versions of the GCC toolchain defined __riscv__ */
-    add_alias("__riscv", "1");
-#endif
+    /* architecture defines */
+    add_alias(ARCH_PREDEFIND, "1");
 
     /* binary entry point: read params, call main, exit */
     ii = add_instr(OP_label);
diff --git a/src/elf.c b/src/elf.c
index 27bfa12..e266aed 100644
--- a/src/elf.c
+++ b/src/elf.c
@@ -82,13 +82,7 @@ void elf_generate_header()
     elf_write_header_int(0);          /* EI_PAD: unused */
     elf_write_header_byte(2);         /* ET_EXEC */
     elf_write_header_byte(0);
-/* FIXME: use #ifdef ... #else ... #endif */
-#ifdef TARGET_ARM
-    elf_write_header_byte(0x28); /* ARM (up to ARMv7/Aarch32) */
-#endif
-#ifdef TARGET_RISCV
-    elf_write_header_byte(0xf3); /* RISC-V */
-#endif
+    elf_write_header_byte(ELF_MACHINE);
     elf_write_header_byte(0);
     elf_write_header_int(1);                          /* ELF version */
     elf_write_header_int(ELF_START + elf_header_len); /* entry point */
@@ -96,14 +90,7 @@ void elf_generate_header()
     elf_write_header_int(elf_header_len + elf_code_idx + elf_data_idx + 39 +
                          elf_symtab_index +
                          elf_strtab_index); /* section header offset */
-/* flags */
-/* FIXME: use #ifdef ... #else ... #endif */
-#ifdef TARGET_ARM
-    elf_write_header_int(0x5000200); /* ARM */
-#endif
-#ifdef TARGET_RISCV
-    elf_write_header_int(0);
-#endif
+    elf_write_header_int(ELF_FLAGS);
     elf_write_header_byte(0x34); /* header size */
     elf_write_header_byte(0);
     elf_write_header_byte(0x20); /* program header size */

Warning: the patch does not work yet, but it should be fairly expressive.

qemu: uncaught target signal 4 (Illegal instruction) - core dumped

I tried following the instructions in README.md, but got an error like the following:

$ make
  SHECC	out/shecc-stage2.elf
qemu: uncaught target signal 4 (Illegal instruction) - core dumped
make: *** [Makefile:59: out/shecc-stage2.elf] Illegal instruction (core dumped)

Then I ran a simple test:

Executing the native shecc binary on the host passed.
Executing shecc-stage2.elf with the command qemu-arm shecc-stage2.elf hello.c failed.

Determine the factors contributing to unexpected slowdowns during self-hosting

This project has the capability for self-hosting, transitioning from native compilation in stage 1 to stage 2. Given that shecc is composed in ANSI C, it allows for the use of any compiler across various platforms for the initial compilation. This process is further facilitated through the use of RISC-V and/or Arm emulators for bootstrapping. Nevertheless, we are experiencing unforeseen delays during the self-hosting process, particularly in the execution times of stages 1 and 2. Our aim is to conduct a thorough investigation into the causes of these slowdowns.

We can use uftrace to identify the sources of slowdowns.

Support mmap on shecc

I am trying to implement mmap2 for shecc, but I found that there are not enough registers for the syscall, based on test_mmap.s. So, I modified OP_syscall in src/riscv-codegen.c as follows:

case OP_syscall:
            emit(__addi(__a7, __a0, 0));
            emit(__addi(__a0, __a1, 0));
            emit(__addi(__a1, __a2, 0));
            emit(__addi(__a2, __a3, 0));
+           emit(__addi(__a3, __a4, 0));
+           emit(__addi(__a4, __a5, 0));
+           emit(__addi(__a5, __a6, 0));
            emit(__ecall());
            if (dump_ir == 1)
                printf("    syscall");
            break;

On the other hand, to test the syscall, I modified the malloc helper function as follows:

block_meta_t *__malloc_request_space(int size)
{
    char *brk;
    block_meta_t *block;

+   void *tmp = __syscall(__syscall_mmap2, 0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

    brk = __syscall(__syscall_brk, 0);

After this modification, a segmentation fault occurs when using qemu to execute the code.

$ make VERBOSE=1
out/inliner lib/c.c out/libc.inc
cc -o out/src/main.o -O -g -ansi -pedantic -Wall -Wextra -c -MMD -MF out/src/main.o.d src/main.c
cc out/src/main.o -o out/shecc
out/shecc --dump-ir -o out/shecc-stage1.elf src/main.c > out/shecc-stage1.log
chmod a+x out/shecc-stage1.elf
/usr/bin/qemu-riscv32 out/shecc-stage1.elf -o out/shecc-stage2.elf src/main.c
make: *** [Makefile:76: out/shecc-stage2.elf] Segmentation fault (core dumped)

Then, I dumped the disassembly to check. The caller and callee are shown below:

Caller:

   123e4:	ff010113          	addi	sp,sp,-16
   123e8:	00a12023          	sw	a0,0(sp)
   123ec:	0de00513          	li	a0,222
   123f0:	00000593          	li	a1,0
   123f4:	00001637          	lui	a2,0x1
   123f8:	00060613          	mv	a2,a2
   123fc:	00300693          	li	a3,3
   12400:	02200713          	li	a4,34
   12404:	fff00793          	li	a5,-1
   12408:	00000813          	li	a6,0
   1240c:	c61fd0ef          	jal	ra,0x1006c

Callee:

   1006c:	ff010113          	addi	sp,sp,-16
   10070:	00812623          	sw	s0,12(sp)
   10074:	00112423          	sw	ra,8(sp)
   10078:	00010413          	mv	s0,sp
   1007c:	00050893          	mv	a7,a0
   10080:	00058513          	mv	a0,a1
   10084:	00060593          	mv	a1,a2
   10088:	00068613          	mv	a2,a3
   1008c:	00070693          	mv	a3,a4
   10090:	00078713          	mv	a4,a5
   10094:	00080793          	mv	a5,a6
   10098:	00000073          	ecall
   1009c:	01040113          	addi	sp,s0,16
   100a0:	ff812083          	lw	ra,-8(sp)
   100a4:	ffc12403          	lw	s0,-4(sp)
   100a8:	00008067          	ret

It looks like a correct implementation and is similar to test_mmap.s, but it fails. I have no idea how to solve it.

Fail to write global variable

Hi, I am trying to add a feature: global variable initialization.
I modified some code, and the dumped IR looks good. The following is a simple sample:

...
0x000122d4         x1 := 44
0x000122d8         x0 = &a
0x000122e0         *x0 = x1 (4)
0x000122e4     main:
0x000122fc         {
0x000122fc             x0 := &data[36]
0x00012304             x1 = &a
0x0001230c             x1 = *x1 (4)
0x00012310             x0 := printf() @ 978
0x00012314             x0 := 0
0x00012318             return (from main)
0x0001231c         }
0x0001231c         exit main

Corresponding C code:

int a = 44;
int main(int argc, char *argv[])
{
    printf("%d\n", a);
    return 0;
}

As I mentioned, the IR seems to assign the initial value to a correctly, but the value printed is 0. This has had me stuck for a few days. Is there anything I am overlooking about writing a value to a global variable? Thanks!

how to compile

Hi dear author,
It's truly an honor to write to you. I have been building your project and ran into the following error. Do you happen to know the reason? :)

thank you
best regards to you
William

Error on make: /bin/sh: line 1: 21530 Segmentation fault out/shecc --dump-ir -o out/shecc-stage1.elf

I get an error when running make.

make
  SHECC	out/shecc-stage1.elf
/bin/sh: line 1: 21530 Segmentation fault      out/shecc --dump-ir -o out/shecc-stage1.elf src/main.c > out/shecc-stage1.log
make: *** [Makefile:71: out/shecc-stage1.elf] Error 139

Didn't support "#ifdef ... #else ... #endif"

As the title says, shecc currently does not support "#ifdef ... #else ... #endif".

Parse syntax for include macro in parser.c

Concern about code from line 3288 in parser.c,

    if (lex_peek(T_include, token)) {
        if (!strcmp(token_str, "<stdio.h>")) {
            /* ignore, we include libc by default */
        }
        lex_expect(T_include);

If the comment is valid, the lex_expect line should be inside the if block. As it is, stdio.h include lines will be processed by a lex_expect call.

If the comment is valid, here is a change that puts the lex_expect call in an else block. The negation on strcmp is kept, because strcmp returns 0 when the strings match:

    if (lex_peek(T_include, token)) {
        if (!strcmp(token_str, "<stdio.h>")) {
            /* ignore, we include libc by default */
        } else {
            lex_expect(T_include);
        }

For a coding question: about parser

I would like to ask: does the shecc parser directly generate a linked list of instructions? I could not find any code related to an AST. If there is any, please point it out.

If possible, could you comment each structure and field in def.h?

Heartfelt thanks.

Implement basic optimizations

With the inclusion of SSA in the middle-end, it is now time to implement some common optimizations on the new SSA-based IR. These include constant folding, copy propagation, and dead code elimination.

Do we need to change how shecc evaluate expression?

shecc currently evaluates a local expression by generating the corresponding arithmetic instructions and computing the result at run time. I think we can change the scheme to evaluate such expressions at compile time.

Although it will increase compile time, there are still lots of benefits:

Knowing the operand values before run time enables more optimizations on expressions, e.g. quick modulo.
There is no operand limit because a stack is used. The related test case in driver.sh would also pass (expr 210 "1 + (2 + (3 + (4 + (5 + (6 + (7 + (8 + (9 + (10 + (11 + (12 + (13 + (14 + (15 + (16 + (17 + (18 + (19 + 20))))))))))))))))))").
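
The "quick modulo" mentioned above presumably refers to replacing `x % n` by a bit mask when n is a power of two. A minimal sketch for unsigned operands follows; the function names are hypothetical, and the restriction to unsigned values is deliberate, because for negative signed operands the mask and the `%` operator give different results.

```c
/* Non-zero if n is a power of two. */
static int is_pow2(unsigned n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Modulo for unsigned operands; uses a mask when the divisor is a power
 * of two and falls back to '%' otherwise. n must be non-zero. */
unsigned quick_mod(unsigned x, unsigned n)
{
    if (is_pow2(n))
        return x & (n - 1);
    return x % n;
}
```

When the divisor is known at compile time, the power-of-two check itself disappears and only the single AND instruction remains in the generated code.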

Refactor code generations

Both src/arm-codegen and src/riscv-codegen.c have some functions in common. The code generation can be refactored with the following changes:

Split the shared functions/variables from src/{arm,riscv}-codegen.c into a new file src/codegen.c.
At the end of src/codegen.c, there should be a statement `#include "src/arch-codegen.c"` which links to the Arm or RISC-V code generation implementation.
The generated shecc executable file should be capable of showing its configurations such as the supported architecture and ABI.

libc: Drop brk in favor of mmap inside malloc/free

At present, malloc is implemented via the brk system call, which is obsolete. Most malloc implementations use the mmap system call instead. We shall drop brk in favor of mmap when implementing malloc and free.

Check comprehensive list for system calls via https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md#cross_arch-numbers

#define __syscall_mmap 192
#define __syscall_munmap 91
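
The following sketch shows the general shape of an mmap-backed allocator on a hosted Linux system. It uses the libc wrappers `mmap` and `munmap` rather than shecc's raw `__syscall` interface, and the names `map_alloc`, `map_free`, and `map_header_t` are hypothetical. It creates one anonymous mapping per request, which is simple but wasteful for small sizes.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>

/* Header stored at the start of each mapping. The union members only
 * serve to keep the payload that follows reasonably aligned. */
typedef union {
    size_t size; /* total mapping length, header included */
    long double align_ld;
    void *align_ptr;
} map_header_t;

/* Hypothetical mmap-backed allocator: one anonymous mapping per request. */
void *map_alloc(size_t n)
{
    size_t total = n + sizeof(map_header_t);
    if (total < n)
        return NULL; /* size overflow */

    void *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    map_header_t *hdr = base;
    hdr->size = total;
    return hdr + 1; /* payload starts right after the header */
}

void map_free(void *ptr)
{
    if (!ptr)
        return;
    map_header_t *hdr = (map_header_t *) ptr - 1;
    munmap(hdr, hdr->size);
}
```

Because the kernel hands out whole pages, a practical allocator would request larger regions and carve blocks out of them, keeping a free list much like the existing brk-based code does.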

Declare variables where needed

In C99, it is possible to declare variables precisely where they are needed, a feature also present in C++. Variables can also be declared within the control structure of a for loop, a feature introduced in C99:

 for (int x = 0; x < 10; x++) {  /* This is permitted starting with C99 */
    ...
}

The C99 standard says: (6.8.5.3 The for statement)

for ( clause-1 ; expression-2 ; expression-3 ) statement
behaves as follows: The expression expression-2 is the controlling expression that is evaluated before each execution of the loop body. The expression expression-3 is evaluated as a void expression after each execution of the loop body. If clause-1 is a declaration, the scope of any variables it declares is the remainder of the declaration and the entire loop, including the other two expressions; it is reached in the order of execution before the first evaluation of the controlling expression. If clause-1 is an expression, it is evaluated as a void expression before the first evaluation of the controlling expression.

This enhancement simplifies the code by allowing variable declarations closer to where they are used, thereby improving readability and maintainability. shecc should support this feature.
