andrew-johnson-4 / lambda-mountain Goto Github PK
View Code? Open in Web Editor NEWA Compiler in 85K Instructions (3K SLOC)
License: MIT License
A Compiler in 85K Instructions (3K SLOC)
License: MIT License
--tokenize
--parse
--compile
I use nano...
It would be nice to have global namespace view vs local function definitions views in an editor. An integrated library search and project build manager would be nice too.
Chain and Apply are the most primitive functional operations. They have default behaviors but there is also specialization to override that. Explicit vs Implicit arguments make this interesting too.
Implicit <-> Explicit
^ ^
chain | | apply
v v
Implicit <-> Explicit
After all the sugar gets taken off, f x
just calls (chain implicits f x)
Architectures:
OS:
It would be nice to provide a frontend to search for existing libraries with functions that provide some functionality. This could be links to github repos that get downloaded or crates that get added as dependencies.
This would be most useful after types and ad-hoc polymorphism get implemented to avoid conflicts in the global namespace.
74 / 77 passing.
eval-soft
can just echo parsed input for now. eval-hard
can be eager reduction for simplicity.
The chain rule for functions applies the function, otherwise it makes a Cons cell.
It would be nice if users could define custom chain rules.
( (App (: f (->(x y))) x) ... apply function ... )
custom chain rules go here
( (App l r) ... yield cons ... )
(foreach (1 2 3) (
(local a)
(set a 123)
a #sometimes this is still Nil
))
(f 1 2 3)
is a lot easier to read than (f (1 2 3))
REPL
lambda_mountain repl
> p := λx. x x
> p := λx x. x
> p a a
a
>
match 'abc (
(('ab remainder) ...)
)
Bring this story to life:
rhs := [a-z][_a-zA-Z0-9]* //Variable
| ( rhs* ) //Function Application
| λ rhs* . rhs* //Lambda Function
| [^ ]+ //Literal Value
binding := [a-zA-Z0-9]+ [:] [=] rhs \n
program := binding*
example recursive descent parser:
nibble-lhs := λ"a" a. a
nibble-rhs := λa "a". a
nibble-nonterminal := λ(nibble-lhs a) b. a b
nibble-empty := λ. "a"
nibble-tokens := λ(ident n) (eq _) (rhs r). (Assign n r)
asm!( mov( $0 %r12 ) )
is different from just
mov( $0 %r12 )
The asm! rule has it's own namespace and calling conventions.
The example should really just work...
Codegen is an implicit dependency in the build process. Alternative targets can be supported as either --target [target]
if the target is bundled with the LM compiler, or alternatively with a --assembler [assembler.lm]
arbitrary custom assembler.
The assembler must define a function assemble(options: S, program: S)
which will then finish the compilation. The LM compiler in this case will still perform the functions of a parser, preprocessor, and build tool.
The current compiler wastes too much memory to bootstrap.
rewrite
expr := expr-mul '+' expr
expr := expr-mul '-' expr
as
expr := expr-mul |> '+' expr
|> '-' expr
to avoid exponential explosion in descent
The program should accept a special symbol ::infer
that can be used for type inference. This term will be applied with a Type Context and a Term, returning a new Term.
f := λ(: x X). x;
f := λ(: y Y). y;
f := (λ..args.
let (args a) = kw args (λ(A a). a)
a
)
Only two registers are really needed to represent a cons cell. It might be an interesting exercise to try to generate a 2,3,4 register compiler version to see how flexible the register assignment strategy can be.
"I want to represent an S-Expression with 2 registers. Generate some code for me."
✓ Bootstrap Compile a CLI
✓ Bootstrap Compile a CLI that dumps S-Expression fragments
✓ Bootstrap Compile a CLI that normalizes then dumps S-Expression fragments
✓ Bootstrap Compile a CLI that parses, normalizes, then dumps S-Expression fragments
Bootstrap Compile a CLI that parses, normalizes, then assembles S-Expression fragments
Define a generic interface to Cons/Atom allocators. This is necessary to bring the runtime down to zero implicits.
there are roughly 15 symbols in the data section that are required by the runtime. This # should be 0.
Ignore newlines that appear inside parens
include file.lm;
(side effect 1)
(side effect 2)
return
is better than
(tail(
(side effect 1)
(side effect 2)
return
))
tests.boostrap/
thetest.lm #test to run
thetest.lm.config #compile time flags
thetest.lm.args #arguments to runtime
thetest.lm.out #expected stdout
thetest.lm.err #expected stderr
testest.lm.errno #expected exit code
tests.release/
*
There are some test cases on the dev branch for this.
This may have caused a memory issue, not sure.
Add a --gradual
flag for the compiler to enable implicit casts into/out-of S-Expression representation.
let
or other functions that manipulate volatile locals like registers or stack can't be called with normal operational calling conventions. The code actually needs to be expanded inline like a macro, but that makes things like labels and local jumps potentially problematic. There are a host of problems here.
I think the correct solution is type-sensitive specialization, but that isn't available in the bootstrap compiler.
This could help pinpoint corruption and help push this language from low-level to mid-level of abstraction.
(local a)
(set a 1)
(foreach (2 3 4) (
(local a)
(set a 2)
(print-s a)
))
Typed LM Codegen is fairly different than the bootstrap release. I have pared down the bootstrap compiler to just tokenization and parsing. Now we should introduce typechecking before reconstructing the codegen. Also some of the optimizations from the bootstrap book can reduce memory use by 50%.
Currently the default return is ()
.
Generating assembler as an intermediate is annoying and unnecessary. Assembly should only be used for debugging.
Default Atom Representation should be (char* offset_start offset_end)
instead of just char*
.
eliminate argv construction if it is never referenced.
LM can have an internal IR representation of assembler code. It would also be good to be able to read and write from different syntaxes. This starts to address the MxN problem of multiple frontends, multiple target platforms.
Internal Representation
There are 100s of App
s in the compiler but only one is necessary.
constraint (BTree a) (Branch a a) (Leaf a)
f := λ(: x Int). + x x
f (f 1)
It would be nice to have a reference implementation.
compile the LM bootstrap to assembly and attach as another bootstrap option.
"abc"[0..2]
is better than
"abc".substring(0,2)
it also comes with free file position tracking.
let
is a macro that manipulates the stack frame, how should this be modelled? Assumptions about the registers and memory model are implicit parameters to all functions.
A simpler example would be to model the type signature of push %ax
.
Turn typed assembly into PunC expressions. LM can provide the parser.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.