pilif0 / basilisk
LLVM frontend for my pet programming language
License: MIT License
A feature I enjoyed in my work with Kotlin was defining functions with an expression body (see Kotlin reference). These make the code cleaner and easier to read, and shouldn't be too hard to implement.
In essence, a function definition:
f(x, y) = x + y;
would be equivalent to:
f(x, y) {
    return x + y;
}
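One way to picture the equivalence is as a desugaring step in the parser: the expression body is wrapped in a single return statement, so later stages only ever see the block form. The following is a minimal sketch in Python under that assumption; the `Function` and `Return` node names are hypothetical and not taken from basilisk's actual AST.

```python
from dataclasses import dataclass

# Hypothetical AST nodes for illustration only; basilisk's real AST may differ.
@dataclass
class Return:
    expr: str

@dataclass
class Function:
    name: str
    params: list
    body: list  # list of statements

def desugar_expression_body(name, params, expr):
    """Turn `name(params) = expr;` into a function whose body is `return expr;`."""
    return Function(name, list(params), [Return(expr)])

# f(x, y) = x + y;
short_form = desugar_expression_body("f", ["x", "y"], "x + y")
# f(x, y) { return x + y; }
long_form = Function("f", ["x", "y"], [Return("x + y")])
```

With this approach the code generator would need no changes, since both spellings produce the same tree.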
More information (e.g. line number and character) should be included in tokens. This information should then be used in the parser to improve error reporting.
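As a rough illustration of what this could look like, here is a minimal Python sketch of a lexer that records the line and column of every token. The token kinds, the regular expression, and the `describe` message format are all assumptions made for the example, not basilisk's real lexer.

```python
import re
from dataclasses import dataclass

@dataclass
class Token:
    kind: str
    text: str
    line: int    # 1-based line of the token's first character
    column: int  # 1-based column of the token's first character

# Hypothetical token classes; the real language has more.
TOKEN_RE = re.compile(
    r"(?P<number>\d+(?:\.\d+)?)|(?P<ident>[A-Za-z]\w*)"
    r"|(?P<newline>\n)|(?P<space>[ \t\r]+)|(?P<symbol>.)"
)

def tokenize(source):
    tokens = []
    line, line_start = 1, 0  # current line and the offset where it begins
    for m in TOKEN_RE.finditer(source):
        kind = m.lastgroup
        if kind == "newline":
            line += 1
            line_start = m.end()
        elif kind != "space":
            tokens.append(Token(kind, m.group(), line, m.start() - line_start + 1))
    return tokens

def describe(token):
    """Format a parser error message that points at the token's position."""
    return f"{token.line}:{token.column}: unexpected {token.kind} '{token.text}'"

tokens = tokenize("f(x) {\n  return x + ;\n}")
```

The parser can then attach the position to any error it raises, for instance by passing the offending token to `describe`.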
As discussed in the LLVM IR generation pull request (#4), there is a question of whether order of definitions in a program should matter. At the time of that pull request, making definition order not matter would produce ambiguities and would require handling of special cases. Therefore the decision was taken to make the order matter.
This issue is created to continuously examine when the change to definition order not mattering could be made, and what it would entail.
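For reference, the usual way to make order irrelevant is to collect all names in a first pass and resolve uses in a second. The Python sketch below contrasts the two strategies on a toy program representation (pairs of a function name and the names it calls), which is purely hypothetical. It does not address the ambiguities mentioned in the pull request, and treating a function as known inside its own body is an assumption.

```python
# Hypothetical program: (function name, names of the functions it calls).
program = [
    ("main", ["helper"]),  # main calls helper before helper is defined
    ("helper", []),
]

def check_order_dependent(definitions):
    """Single pass: a call is valid only if the callee is already defined."""
    known, errors = set(), []
    for name, calls in definitions:
        known.add(name)  # assumption: a function may call itself
        errors += [f"{name} calls undefined {c}" for c in calls if c not in known]
    return errors

def check_order_independent(definitions):
    """Two passes: collect every name first, then check all calls."""
    known = {name for name, _ in definitions}
    return [
        f"{name} calls undefined {c}"
        for name, calls in definitions
        for c in calls
        if c not in known
    ]
```

On the toy program, the single-pass check rejects the forward call while the two-pass check accepts it.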
The language needs to have more data types than just doubles. I propose at least the following types based on the types in LLVM:
Integers (byte, short, int, long)
Floating point (float, double)
It might also be good to implement further types while this is being done, such as vectors, and to prepare for the later implementation of pointers.
With these new types, it seems appropriate to expand the set of valid literals:
7, 0b111, 0x7
3.14
3.6e2, 2e-4
true and false
Furthermore, underscores should be allowed and discarded in literals, allowing more readable formatting, for example 0xffff_f0f0_abcd_1234 instead of 0xfffff0f0abcd1234.
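To make the proposed literal forms concrete, here is a minimal Python sketch that recognises them and discards underscores. The exact grammar is an assumption: in particular, requiring a digit directly after the `0b` or `0x` prefix and at the start of each digit group is my choice for the example, not something decided in this issue.

```python
import re

# Assumed grammar: underscores may appear anywhere after the first digit.
BIN_RE = re.compile(r"0b[01][01_]*")
HEX_RE = re.compile(r"0x[0-9a-fA-F][0-9a-fA-F_]*")
DEC_RE = re.compile(r"[0-9][0-9_]*")
FLOAT_RE = re.compile(r"[0-9][0-9_]*(?:\.[0-9][0-9_]*)?(?:[eE][+-]?[0-9][0-9_]*)?")

def parse_literal(text):
    """Return the value of a literal, or raise ValueError if it is not one."""
    if text in ("true", "false"):
        return text == "true"
    digits = text.replace("_", "")  # underscores are allowed and discarded
    if BIN_RE.fullmatch(text):
        return int(digits[2:], 2)
    if HEX_RE.fullmatch(text):
        return int(digits[2:], 16)
    if DEC_RE.fullmatch(text):
        return int(digits)
    if FLOAT_RE.fullmatch(text):
        return float(digits)
    raise ValueError(f"not a literal: {text!r}")
```

For example, `parse_literal("0xffff_f0f0_abcd_1234")` and `parse_literal("0xfffff0f0abcd1234")` yield the same integer, while `parse_literal("0x_")` is rejected.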
Unsigned versions of the types should also be considered. LLVM doesn't distinguish between signed and unsigned integer types; the distinction is made when selecting which instruction to use.
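The point about instruction selection can be illustrated without LLVM: the same bit pattern gives different results depending on whether it is divided as a signed or an unsigned value, which is the difference between LLVM's `sdiv` and `udiv` instructions. The Python sketch below only simulates this on 8-bit values; the helper names are made up for the example.

```python
def as_unsigned(bits, width=8):
    """Interpret the low `width` bits as an unsigned integer."""
    return bits & ((1 << width) - 1)

def as_signed(bits, width=8):
    """Interpret the low `width` bits as a two's complement integer."""
    value = as_unsigned(bits, width)
    return value - (1 << width) if value >= 1 << (width - 1) else value

pattern = 0b1111_1000  # one 8-bit pattern, no signedness attached

# Unsigned division (what `udiv` computes): 248 / 2
unsigned_div = as_unsigned(pattern) // 2
# Signed division, truncating toward zero (what `sdiv` computes): -8 / 2
signed_div = int(as_signed(pattern) / 2)
```

So the frontend would have to track signedness in its own type system and use it to pick the instruction.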
If we regard a block of statements as a statement in itself, they can naturally be nested. This would allow better management of scope as well as prepare for implementation of conditional statements and loops.
A Block containing a set of statements would reflect this.
Currently error tokens are picked up by the parser as unexpected tokens (as error tokens are never expected). It would be better if a unified way of intercepting error tokens were added. They could then be reported better, with possible recommendations based on the context. One of the main requirements for the solution is that it interferes as little as possible with the actual parsing, in order to keep the parser as easy to expand as possible.
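One possible shape for such an interception point is a thin filter between the lexer and the parser, so that the parser itself is not touched. The Python sketch below assumes tokens are `(kind, text)` pairs and that error tokens have the kind `"error"`; both are hypothetical, and dropping the error token from the stream is only one of several choices.

```python
def intercept_errors(tokens, report):
    """Yield non-error tokens unchanged; hand error tokens to `report` instead."""
    for kind, text in tokens:
        if kind == "error":
            report(f"invalid token {text!r}")
        else:
            yield (kind, text)

errors = []
# Hypothetical lexer output for `x = @ 1;`
stream = [("ident", "x"), ("symbol", "="), ("error", "@"), ("number", "1"), ("symbol", ";")]
parser_input = list(intercept_errors(stream, errors.append))
```

Because the filter sees the neighbouring tokens as they pass, it could later be extended to produce context-based recommendations.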
Due to everything being a double, there needs to be a wrapper around the main function that converts the double it returns into the integer that the system expects. Once more data types are added this wrapper can be removed.
Currently all statements are required to end in a semicolon. While thinking about designs for other features, I started wondering whether this requirement is really necessary or could be dropped.
The semicolon currently serves to separate statements. As one of my main principles for basilisk is that whitespace should not matter beyond separating tokens, I can't replace it with a newline and force each statement onto a separate line. That would give meaning to whitespace and make it less suitable for formatting code without affecting its behaviour.
This issue is focused on simply dropping the semicolon and seeing what ambiguities are produced and if they can be reconciled. If it seems that all possible ambiguities can be easily solved, I would proceed with removing the requirement while keeping the option to include a semicolon there in case it is preferable for readability.
Multiple definitions of the same global variable currently produce multiple initializers, with the variable taking on the value of the last initializer for the full execution. This behaviour is unintuitive and should be removed. A good time to fix this would be when adding more data types and differentiating between variable definitions and assignments.
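A minimal sketch of the stricter behaviour, in Python: a table of globals that rejects a second definition of the same name instead of silently keeping the last initializer. The exception type and the `(name, value)` representation are assumptions for illustration.

```python
class DuplicateDefinition(Exception):
    """Raised when a global variable is defined more than once."""

def define_globals(definitions):
    """Build a table of globals from (name, initial value) pairs."""
    table = {}
    for name, value in definitions:
        if name in table:
            raise DuplicateDefinition(f"global '{name}' is already defined")
        table[name] = value
    return table
```

Once definitions and assignments are distinct, a later `name = value` could still change the variable, but a second definition would be an error.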
Currently all identifiers have to start with a letter. I think it would be good to expand this to allow identifiers starting with an underscore, which is often used in other languages.
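Expressed as regular expressions, the change is small. In this Python sketch the rule for the characters after the first one (letters, digits, underscores) is an assumption, since only the first character is discussed here.

```python
import re

# Current rule: must start with a letter.
CURRENT_IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# Proposed rule: may also start with an underscore.
PROPOSED_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_identifier(text, pattern=PROPOSED_IDENT):
    """Check whether the whole text is an identifier under the given rule."""
    return pattern.fullmatch(text) is not None
```

Under the proposed rule `_tmp` is accepted, while identifiers starting with a digit remain invalid.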