
squint's Introduction

Curated collection of tiny C compilers, all written in C:

  • MarcFeeleyTinyc -- ( 284 SLOC )
    • lex -> parse to ast -> compile ast to bytecode -> execute on VM
  • rswier/c4 -- ( 499 SLOC )
    • Can interpret itself.
  • EarlGray/c4 -- ( 550 SLOC )
    • GitHub won't let me fork this repo since it's derived from c4, but it is a tiny x86 JIT compiler
  • jserv/amacc -- ( 2366 SLOC )
    • Generates and executes ARM assembly by JIT compile or creating ELF executable. Can compile itself. This is the base compiler forked for Squint.
  • HPCguy/Squint -- (3250 SLOC compiler, 3750 SLOC optimizer)
    • JIT + ELF + shared object optimizer. Near gcc -O1 performance, and surprisingly close to -O3 for some problems. Barely even started optimizing, so this could get interesting.

A Python to Linux x86 compiler (that can call libc functions), used as an example, not to support all of Python:

  • HPCguy/pyast64
    • forked from BenHoyt/pyast64 and extended for C language support.

General:

  • 👀 I’m interested in High Performance Computing. I sat on several physics application teams in my 20-year career and I'm an expert at parallel efficiency. (See https://docs.google.com/viewer?a=v&pid=sites&srcid=bGJsLmdvdnxwYWRhbC13b3Jrc2hvcHxneDo2NWNiM2Y2MDg5ZTZiZTcy )
  • 🌱 I’m currently learning to write a compiler. I know what features are missing in HPC languages and compilers, but have no experience writing a compiler.
  • 💞️ I’m looking to collaborate on a new language for HPC efficiency. Almost every parallel language has been written by academics, not HPC app developers. This has been a mistake. I know how to write a language with parallel semantics but with a sequential look and feel. I was one of the two original co-developers of the RAJA framework used by the Department of Energy, but due to limitations in the C language, that framework and any other like it will never be efficient. Converting everything to pointers before applying optimizations is a mistake that can only be corrected at the language level by jettisoning the system programming semantics found in the C language definition.

squint's Issues

literal constant optimization bug

The function Squint.c :: rename_register1() does literal constant optimizations, and is currently unsafe. I suggest commenting out that optimization until this ticket is closed.

This bug might only occur for chained literal constant assignment:

a = b = c = 0.0;

In the case of chained literal assignment, a workaround is likely as easy as:

a = 0.0;
b = 0.0;
c = 0.0;

This bug gives me an excuse to refactor how I handle literal constants in general (which are tricky on ARM since they are embedded in inactive parts of the instruction stream). Sorry that I am not doing a more direct fix, but I do not yet have any heavy users of Squint.c.

Since I am currently in the midst of a serious medical issue, I may not have the energy to get to this for at least several more weeks.

Fix branching optimization bug

Some gotos are being optimized away.

Create a stress test for branching and goto as a means to fix this and other jump related bugs.
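
A minimal sketch of what one case in such a stress test could look like. The function name and the bit-per-label encoding are made up for illustration; the idea is only that the return value changes if any goto is dropped or any skipped statement runs.

```c
/* Hypothetical stress case: each label sets one bit, so a goto that is
 * optimized away shows up as a wrong return value. */
int goto_stress(void)
{
    int trace = 0;      /* one bit per label visited */
    int loops = 0;      /* number of passes through the backward loop */

    goto forward;       /* forward jump over the next statement */
    trace |= 64;        /* must never execute */

back:
    trace |= 2;
    if (++loops < 3)
        goto back;      /* backward jump forming a loop */
    goto done;

forward:
    trace |= 1;
    goto back;          /* backward jump to an earlier label */

done:
    trace |= 4;
    return trace * 10 + loops;   /* 7 * 10 + 3 when every jump is honored */
}
```

A driver would compare the result against 73 and fail hard on any other value.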

Improve testing harness

The testing harness catches bugs, but it needs to hard fail more often.
Also, each test pass should create unique executable names rather than the current name sharing.

Fix critical floating point bugs

I just created a shocktube problem to start applying floating point optimizations.

There are some critical floating point bugs I need to fix immediately.

Add support for true forward declarations

Even though this is a one pass compiler, it is possible to backpatch function call addresses
to create true forward declarations. The current "function prototype" ability in the compiler
is a poor man's forward declaration that backpatching can fix.
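
A minimal sketch of the backpatching idea, assuming a flat code array and a small per-function list of pending call sites. The struct layout and the names emit_call and define_func are hypothetical, not Squint's actual data structures.

```c
#define MAX_CODE  64
#define MAX_FIXUP 8

/* Hypothetical per-function record. */
struct func {
    int addr;                 /* -1 until the body has been compiled  */
    int fixup[MAX_FIXUP];     /* code slots waiting for the address   */
    int nfixup;
};

static int code[MAX_CODE];    /* stand-in for the emitted instruction stream */
static int ncode;

/* Emit a call target. If the function is not defined yet, leave a hole
 * and remember where it is. Returns the slot used, or -1 when full. */
int emit_call(struct func *f)
{
    if (ncode == MAX_CODE)
        return -1;
    if (f->addr < 0) {
        if (f->nfixup == MAX_FIXUP)
            return -1;
        f->fixup[f->nfixup++] = ncode;
        code[ncode] = -1;     /* placeholder, patched later */
    } else {
        code[ncode] = f->addr;
    }
    return ncode++;
}

/* Called once the function body is compiled: fill in every hole. */
void define_func(struct func *f, int addr)
{
    int i;
    f->addr = addr;
    for (i = 0; i < f->nfixup; ++i)
        code[f->fixup[i]] = addr;
    f->nfixup = 0;
}
```

Calls emitted before define_func hold -1 until the definition is seen; calls emitted afterward get the address directly, so a single pass is enough.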

Add support for typedef

Very simple to implement. A type is set up as an Id using the current type machinery, and the tk field of the Id is set to Typedef. After that, the use of the new typename just clones the Id type information from the old Id into a new Id, and possibly allocates space.
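
A minimal sketch of that cloning step. Only the tk field and the Typedef token come from the description above; the other fields, the struct name, and both function names are assumptions made for illustration.

```c
#include <string.h>

enum { Id = 1, Typedef };     /* hypothetical token kinds */

struct ident {
    int  tk;                  /* token kind: Id or Typedef              */
    int  type;                /* encoded type (assumed representation)  */
    int  size;                /* bytes needed for one instance          */
    char name[32];
};

static void set_name(struct ident *d, const char *name)
{
    strncpy(d->name, name, sizeof d->name - 1);
    d->name[sizeof d->name - 1] = '\0';
}

/* Record NAME as a typedef for an already-built type. */
void make_typedef(struct ident *t, const char *name, int type, int size)
{
    t->tk = Typedef;
    t->type = type;
    t->size = size;
    set_name(t, name);
}

/* Declare a variable using a typedef name: clone the type information
 * and return the number of bytes to allocate, or -1 if T is not a typedef. */
int declare_with_typedef(struct ident *var, const struct ident *t,
                         const char *name)
{
    if (t->tk != Typedef)
        return -1;
    var->tk = Id;
    var->type = t->type;
    var->size = t->size;
    set_name(var, name);
    return var->size;
}
```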

Add support for float datatype

This is best implemented in four branches:
(1) Enable float constants, float memory movement, float-register <-> memory transfers, and calling functions that return floats.
(2) Enable float ABI for parameter passing in function calls (harder than working with simple register return value).
(3) Enable math operations on floats.
(4) Enable mixed type expressions with automatic type conversion.

Simple program breaks compiler

int main() {
    int a = 0;
    if (1) {
        int b;
        printf("%d\n", a);
        int c;
        b = 1;
        a += b;
        printf("%d\n", a);
    }
}

Results in

0
0
1

Clearly it should not print 3 numbers!

Implement backend code generation in one shared library per architecture

Implement codegen() as shared libraries to support multiple architectures. codegen() is a self-contained function that converts an input IR array into a .text segment that can be executed.

There are a few spots in the compiler where a tiny block of assembly code is generated directly into an array, and those can be broken out as other functions in the shared library.

Fix Printf varargs ABI

Printf requires that the stack and stack arguments be 8-byte aligned at the point of the call. This is harder to achieve than it sounds: if you adjust the stack alignment before printf is called, there is no way to know after the call whether the stack was forcibly aligned, which matters when you have to pop the stack arguments.
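
One common way around this is to remember the stack pointer from before the adjustment, instead of trying to work out the padding afterward. The sketch below only simulates that arithmetic with integers; it is an illustration under that assumption, not what Squint does, and the names are hypothetical.

```c
#include <stdint.h>

/* Simulated call setup; sp values are plain integers here. */
struct call_frame {
    uintptr_t saved_sp;   /* sp before any alignment padding       */
    uintptr_t call_sp;    /* 8-byte aligned sp seen by the callee  */
};

/* Reserve room for the stack arguments, then round down to 8 bytes. */
struct call_frame align_for_call(uintptr_t sp, unsigned stack_arg_bytes)
{
    struct call_frame f;
    f.saved_sp = sp;
    f.call_sp = (sp - stack_arg_bytes) & ~(uintptr_t)7;
    return f;
}

/* After the call, restoring the saved value pops both the arguments
 * and any padding, without needing to know how much padding there was. */
uintptr_t restore_after_call(struct call_frame f)
{
    return f.saved_sp;
}
```

In generated code the saved value would have to live somewhere that survives the call, such as a callee-saved register or a known frame slot.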

Fix bug introduced in pull request #51

This code should return a 0 value to the shell. It returns '4'.

int g;

int effect()
{
    g = 1;
    return 1;
}

int main()
{
    int x;
    g = 0;
    x = 0;
    if (x && effect())
        return 1;
    if (g)
        return 2;
    x = 1;
    if (x && effect()) {
        if (g != 1)
            return 3;
    } else {
        return 4;
    }
    g = 0;
    x = 1;
    if (x || effect()) {
        if (g)
            return 5;
    } else {
        return 6;
    }
    x = 0;
    if (x || effect()) {
        if (g != 1)
            return 7;
    } else {
        return 8;
    }
    return 0;
}

Switch to stack based symbol table management

The old c4 symbol manager is buggy. Switch to a stack for symbol management.

Stack segments: keywords, globals, parameters, nested scope.

Create a scope marker whenever a block scope is created, and delete all
symbols down to the scope marker when block scope is exited.
Keep a tos marker for last symbol created.
Scan downward through the stack when doing id lookups.
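
A minimal sketch of the stack described above, assuming a fixed-size array and a NULL name as the scope marker. All names here are hypothetical, and tos in this sketch is the index one past the last symbol created.

```c
#include <string.h>

#define MAX_SYMS 256

struct sym {
    const char *name;     /* NULL marks the start of a block scope */
    int value;
};

static struct sym symtab[MAX_SYMS];
static int tos;           /* one past the last symbol created */

/* Push a symbol; returns its index, or -1 if the table is full. */
int sym_push(const char *name, int value)
{
    if (tos == MAX_SYMS)
        return -1;
    symtab[tos].name = name;
    symtab[tos].value = value;
    return tos++;
}

/* Entering a block pushes a marker entry. */
int scope_enter(void)
{
    return sym_push(NULL, 0);
}

/* Leaving a block drops every symbol down to and including the marker. */
void scope_exit(void)
{
    while (tos > 0 && symtab[--tos].name != NULL)
        ;
}

/* Scan downward so the innermost declaration wins. */
struct sym *sym_lookup(const char *name)
{
    int i;
    for (i = tos - 1; i >= 0; --i)
        if (symtab[i].name != NULL && strcmp(symtab[i].name, name) == 0)
            return &symtab[i];
    return NULL;
}
```

Shadowing falls out of the scan direction: an inner declaration sits above the outer one on the stack and is found first, then disappears on scope exit.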

In the tokenizer, when tokenizing ids, do this:

notkeyword = 0;
while ((*p >= 'a' && *p <= 'z') || (++notkeyword, *p >= 'A' && *p <= 'Z') || *p == '_') { ... }

if (notkeyword) /* handle id */ ;
else /* handle keyword */ ;
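
A minimal sketch of the same test as a standalone function, assuming every keyword consists of lowercase letters only. In the one-line form, the ++notkeyword side effect runs whenever the character is not lowercase, including on the character that ends the identifier, so this sketch sets the flag inside the loop body instead. The function name and the handling of digits are assumptions, not part of the original idea.

```c
/* Scan an identifier starting at *pp and advance *pp past it.
 * Returns nonzero if the identifier cannot be a keyword, i.e. it
 * contains anything other than lowercase letters. */
int scan_ident(const char **pp)
{
    const char *p = *pp;
    int notkeyword = 0;

    for (;;) {
        if (*p >= 'a' && *p <= 'z') {
            /* lowercase letter: still a keyword candidate */
        } else if ((*p >= 'A' && *p <= 'Z') || *p == '_' ||
                   (p != *pp && *p >= '0' && *p <= '9')) {
            notkeyword = 1;   /* digits allowed after the first char (assumption) */
        } else {
            break;            /* terminator: does not touch the flag */
        }
        ++p;
    }
    *pp = p;
    return notkeyword;
}
```

When the function returns 0 the token still has to be compared against the keyword list; when it returns nonzero that comparison can be skipped entirely.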

This will not only fix all the bugs with the c4 design, but will also increase performance.

Finish inline keyword implementation

I am forced to add the inline keyword to the public Squint compiler due to a botched 'git stash' and merge on my part.
This was work in progress for the HPC version of MC, and was not supposed to be here.

Now that the cat is out of the bag, I have to let it ride here on the public branch of the compiler.

Note that what I have committed so far only has partial functionality, since it was in mid development.
It only works for a limited subset of void functions at the moment.

Of note: the inline keyword in the MC compiler can be applied in two ways -- either to the function or the call site.
This allows for optimized specializations of functions. For example:

void dot(float *result, float *a, float *b, int len)
{
    float sum = 0.0;
    for (int i=0; i<len; ++i)
        sum += a[i]*b[i];
    *result = sum;
}

void ddot(float *out, float *x, int n)
{
    inline dot(out, x, x, n); // note specialization here -- MC optimizer cuts number of mem reads in half
}

Inlining can be nested up to 16 levels. Error messages follow the context through all inline levels.
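
For illustration, here is a hand-written plain C equivalent of what the specialized call could reduce to once the optimizer knows both array arguments are the same pointer. This is a sketch of the idea, not actual MC output, and the function name is made up.

```c
/* Hand-written equivalent of inlining dot(out, x, x, n) with a == b. */
void ddot_specialized(float *out, float *x, int n)
{
    float sum = 0.0f;
    int i;
    for (i = 0; i < n; ++i) {
        float xi = x[i];      /* one load where the generic loop does two */
        sum += xi * xi;
    }
    *out = sum;
}
```

The generic dot cannot make this substitution on its own, because a and b are only known to be equal at this particular call site.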

Another mc bug?

This should either get it right, or spit out an error!

int a = 1 << 0, b = 1 << 1, c = 1 << 2;

int main() {
    printf("%d %d %d\n", a, b, c);
    return 0;
}

I get:

0 1 2

Should be 1 2 4

Also, unrelated, standard C does not require main to have a return value even though it's declared as int.

apply_ptr_cleanup optimization is buggy

This is the most powerful optimization in Squint, but is
currently buggy due to some missing dependency checks.

Since this repository is a Work In Progress,
and since all repository tests are passing,
I am not going to disable apply_ptr_cleanup as I work to fix it.

To disable this optimization, go into squint.c and comment out
the only call to the apply_ptr_cleanup function near the end of the file.
