gluon-lang / gluon Goto Github PK
View Code? Open in Web Editor NEWA static, type inferred and embeddable language written in Rust.
Home Page: https://gluon-lang.org
License: MIT License
A static, type inferred and embeddable language written in Rust.
Home Page: https://gluon-lang.org
License: MIT License
This seems to be the general movement of programming languages of late. It's also probably clearer, and easier for newcomers to understand.
A new functional language with a GC is not a earth shattering. If you can do static analysis to figure out the lifetimes without having to specify them at all, this would be a breakthrough. Also see: https://github.com/carp-lang/Carp
Currently the following features are needed to build the library.
Vec
for memory allocation&[T]
Currently all patterns always compiled as Split which means that large objects such as the prelude push a lot of values to the stack when they are pattern matched. Often though only a few of them are actually used. For these cases the compiler should produce code which only takes the necessary arguments from the object.
It might be nice to provide named functions for interfaces first, then have the operators as shorcuts. So for example, (>>=)
would become flat_map
(I personally prefer this over bind
, heh), and make_Monad
would define the operator based on that.
At the moment module polymorphism is kind of clunky. Whilst our priority is more focused on polishing what we already have, it might be good to note down our thoughts here.
Some references:
The import.rs module would should be extended to look for an environment variable (maybe GLUON_PATH
?) and use that for import lookup as well.
Would be nice to have some kind of place for free-form, asynchronous discussions that don't really fit into github issues. Perhaps Gitter? http://gitter.im/ - we could put a badge to it on the README.
This is causing timeout woes with travis and makes things hard for users on stable. As discussed on https://github.com/Marwes/gluon/issues/33#issuecomment-233157725 and https://github.com/Marwes/gluon/pull/66#issuecomment-234803447 we could possibly consider switching to a parser generator like rustpeg of LALRPOP.
As discussed in https://github.com/Marwes/gluon/pull/35#issuecomment-233140511:
I started some work to replace the linenoise dependency with rustyline. There shouldn't be any problems porting over any changes since it is mostly a rust implementation of linenoise but it also has a better completion api which should make it possible to add tab-completion in the REPL!
Example:
>> :k Option
* -> *
>> :k Monad
(* -> *) -> *
The use of the StringError variant is rather lazy and does not provide good error messages. If all uses of it are replaced by adding additional variants to the TypeError variants those errors can have the data they need to explain the error properly.
Currently there is no good way of creating a new record which has most of the fields of an existing record with a few of them modified. Adding functional record update ala Haskell or Rust would make this easier in some cases.
Functiona record update may be a bit limiting in some/many cases so it may be worth investigating a more general way of updating data structures (something like lenses).
For example:
> { (+) = \x y -> "squiggly squiggle" }
<top>:Line: 1, Column: 1: Expected the following types to be equal
Expected: { +: 1476 -> 1477 -> String }
Found: { ord: std.prelude.Ord 1479, +: 1479 -> 1479 -> 1479, -: 1479 -> 1479 -> 1479, *: 1479 -> 1479 -> 1479, /: 1479 -> 1479 -> 1479, negate: 1479 -> 1479 }
1 errors were found during unification:
Types do not match:
Expected: { +: 1476 -> 1477 -> String }
Found: { ord: std.prelude.Ord 1479, +: 1479 -> 1479 -> 1479, -: 1479 -> 1479 -> 1479, *: 1479 -> 1479 -> 1479, /: 1479 -> 1479 -> 1479, negate: 1479 -> 1479 }
{ (+) = \x y -> "squiggly squiggle" }
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From writing the REPL I get the feeling that a strict language does not work that well with monadic action once a few actions are chained. Adding better syntax for monadic actions could maybe make it convenient enough but as the language isn't pure anyway it is likely better to remove the IO monad for now.
It is probably a far-future issue to actually formalize the type system, however, more pressing is that there is not a good way of determining whether a program should typecheck or not. This is mainly because the fact that type aliases can be used as type lambdas which makes the current typechecking undecidable.
I want to investigate whether it is possible to restrict type aliases in a way that avoids the need of higher order unification (which is not decidable) while not going as far as Haskell does when restricting aliases (as that would require the ability to define new distinct types, something I think would be interesting to avoid TODO explain reasoning).
--allowed to use partially
type Error = Either String
--Not allowed to use partially
type IntEither e = Either e Int
Large types such as the prelude should not be printed in their entirety. See https://github.com/Marwes/gluon/issues/58#issuecomment-234927599.
Type `{
Bool = | False | True,
Ordering = | LT | EQ | GT,
Option a = | None | Some a,
Result e t = | Err e | Ok t,
List a = | Nil | Cons a (std.prelude.List a),
Monoid m = { append: m -> m -> m, empty: m },
... 'x' fields omitted
}` does not have the field `Option Int`
Currently the Function
variant has holds a Vec<Type>
for function arguments but it is always assumed that the vector only has one element as functions are curried and are always treated as 1-argument functions.
The variant should either just hold one argument type, mirroring how the type checker sees it, or the code should take advantage of the Vec
to store multiple arguments (possibly for efficency's sake).
This seems to be what all the cool kids are doing! It is also much easier for people to understand vs Haskell's (.)
and ($)
:
/// Backward function application, where `f <| x == f x`
let (<|) f x : a -> (a -> b) -> b = f x
/// Forward function application, where `x |> f == f x`
let (|>) x f : (a -> b) -> a -> b = f x
/// Function composition, where `f >> g == (\x -> f (g x))`
let (>>) f g : (a -> b) -> (b -> c) -> a -> c = \x -> f (g x)
/// Function composition, where `f << g == (\x -> g (f x))`
let (<<) f g : (b -> c) -> (a -> b) -> a -> c = \x -> g (f x)
Seeing as we are thinking in higher kinds, I wonder if we could define these in terms of Applicative
and Category
?
Currently the location of a type error is reported as being in the same place as the expression it occured in. This location can be rather misleading in many common cases.
Type errors in function calls should be reported at the argument rather than the call.
Current:
map 1 Nil
^~~~~~~~
Better version:
map 1 Nil
^
Type errors in patterns should be reported at the patterns (or expressions) location rather than the enclosing expression
Expression:
let { x, y } = 1 in ...
Current:
let { x, y } = 1 in ...
^~~~~~~~~~~~~~~~~
Better version:
let { x, y } = 1 in ...
^~~~~~~~
Would be super groovy if you could search for functions in-scope with :s
. Here is what you can do in the Idris repl:
Idris> :search Monad f => f a -> f b -> f a
< Prelude.Applicative.(*>) : Applicative f => f a -> f b -> f b
< Prelude.Applicative.(<*) : Applicative f => f a -> f b -> f a
> Prelude.List.intercalate : List a -> List (List a) -> List a
Given a separator list and some more lists, produce a new list
by placing the separator between each of the lists.
> Prelude.List.(++) : List a -> List a -> List a
Append two lists
> Language.Reflection.Elab.Prim__Try : Elab a ->
Elab a -> Elab a
> Prelude.List.(\\) : Eq a => List a -> List a -> List a
The \\ function is list difference (non-associative). In the
result of xs \\ ys, the first occurrence of each element of ys
in turn (if any) has been removed from xs, e.g.,
> Prelude.List.merge : Ord a => List a -> List a -> List a
Merge two sorted lists using the default ordering for the type
of their elements.
> Prelude.List.union : Eq a => List a -> List a -> List a
Compute the union of two lists according to their Eq
implementation.
> Prelude.while : IO' l Bool -> IO' l () -> IO' l ()
Loop while some test is true
List of some things that are needed to make it easier to debug programs.
After Rust 1.9 is released is_char_boundary
will be stabilized and this is easy to implement.
Either an error or a warning should be triggered when not all alternatives are matched on.
As no form of overloading currently exist these basic functions require explicit importing like let { (+), (-), (*) } = prelude.num_Int in 2 + 2
. This is obviously not so nice so some sort of resolution for this problem will be needed. Following is a list of possible ways to resolve this.
Currently only "gluon types" (records and enumerations) can be matched on in case expressions. Allowing userdata types to be matched on would let them integrate better.
Currently arrays are just stored as Array<Value>
which wastes a lot of space as the tag information in the enum is duplicated. A better way would be to have multiple array types (Array, Array, Array and maybe even Array if a byte type were added).
Though the idea is straightforward the implementation may have some subtleties. For instance consider the following code
let twice x = [x, x]
twice 1
twice { x = 1 }
When constructing an array in the twice
function the tag of x
needs to be inspected so the correct array can be constructed. A similar problem comes when indexing an array where the code has to check the type of the array to construct the correct Value
variant. Theses issues aren't to complicated but there may be other problems hiding as well.
case ... of
expressions are a hold over from the language's roots in Haskell but currently the syntax has more in common with F# and Ocaml which uses match ... with
. Rust also uses match
but that could be both an argument for (to make it more similiar) or against (the syntax still differs so having case
instead could be a good distinction.
Macros currently just look like functions import "test.glu"
which can easily lead to confusing behaviour as Macros override any other bindings (essentially making macros reserved key words). Rust ends all macros with a ! which makes macros distinct and does essentially namespaces macros. Copying this behaviour may be a good idea and shouldn't require to much effort to implement
Anytime Type::int
, Type::float
, Kind::star
, is called a new reference count type or kind is allocate which is rather wasteful. As these types are quite common they could be cached in the typechecker (and else where) so that the same pointer could be reused. The kindchecker already does some of this but in the typechecker this is not done.
https://github.com/Marwes/embed_lang/blob/master/check/src/kindcheck.rs
PrimitiveEnv might be possible to extend for this (though it was originally created to retrieve a canonical boolean).
() should not need allocation
To allow the interpreter to be used safely in multiple threads the interpreter needs to implement Sync and in turn this means that all values it owns also needs to be Sync. This can be problematic though as this requires mutable data to either just use atomic operations or using Mutex
/RwLock
which has more overhead than Cell
and RefCell
.
A possible way of allowing this is to create two ways of spawning gluon threads. A sync thread would work exactly as it does now. A non-sync thread on the other hand is allowed to contain non-sync data but it has some additional restrictions to make it safe.
// Thread does not implement `Sync` or `Send by default`
impl Thread {
fn new_sync_thread() -> RootedSyncThread;
fn new_thread() -> RootedThread;
fn new_child_thread(&self) -> RootedThread;
// A `&Thread` can be converted into a thread implementing `Sync` if it was spawned as a sync thread
fn to_sync_thread(&self) -> Result<RootedSyncThread>;
// It is always possible to convert a `Thread` into a `RootedThread` same as now
fn to_rooted_thread(&self) -> RootedThread;
}
impl RootedSyncThread {
fn new_child_sync_thread(&self) -> RootedSyncThread;
}
Restrictions:
I don't feel confident about this being all restrictions necessary so I will leave this as a tracking issue in which I can consolidate more data before actually starting on an implementation.
I reckon it would be nice to make Monad
inherit from Applicative
, and to remove return
in favor of pure
. In order to do that however, we need to implement the missing interfaces for the State
and Writer
types.
At the moment the Gluon's syntax overloads the (->)
operator for three things:
It would be nice to use (=>)
for the latter two cases to disambiguate this, and to leave us open to unifying the type and value syntaxes in the future.
For example:
type Functor f = {
map : (a -> b) -> f a -> f b
}
let functor_Option : Functor Option = {
- map = \f x ->
- match x with
- | Some y -> Some (f y)
- | None -> None
+ map = \f x =>
+ match x with
+ | Some y => Some (f y)
+ | None => None
}
Currently there are quite a lot of different representation in the Type enum this leads to complications in unification (and elsewhere) as types that should be equivalent can have multiple representations. To some extent these different representations are necessary either for convenience or for efficiency (Function(Int, Float)
is a lot nicer to work with than App(App(->, Int), Float)
).
Currently the App
variant only exists to avoid allocation a one element vector for single argument constructors which are very common but with smallvec (or equivalent) the App
variant could be completely superseded by Data
.
Other simplifications may be to replaceFunction(a, b)
with Data(-> [a, b])
and Array(a)
with Data([], [a]
.
Marking this as beginner as this change (though maybe a bit tedious) should just involve replacing all uses of Type::App(a, b)
with Type::Data(a, [b])
.
embed_lang
is just a working title, need to think of a better name.
As discussed on /r/rust, I've always found that typed declarations in MLs get long fast. Leads to people writing code that relies too much on global type inference, and is therefore is hard to figure out later. :(
let my_long_function : LongType -> LongerType -> LongestType -> Return
my_long_function x y z = ...
Currently worked on in the https://github.com/Marwes/embed_lang/tree/associated_types branch.
The idea is that record types can have both value and type fields so the prelude module might have the type defined as (making records similiar to first class modules in OCaml):
{ Int, Float, Option, Result, id, map, /* etc */ }
Types can then be brought into scope in the same way as variables
let { Int, num_Int = { (+), (-), (*) } } = prelude
and fac n : Int -> Int = ...
There is certainly going to be some issues with this approach which hasn't been considered yet but I think the basic idea is sound.
As mentioned in #56 each call to make_Monad
or similiar functions make a completely new set of functions for each invocation. These could be removed with optimizations but that is a long way away of being a solution. An alternative solution may be to add a bake
macro which functions as follows.
This:
let prelude = import "std/prelude.glu"
...
let monad = bake (prelude.make_Monad prelude.monad_IO)
Becomes this:
// This expression is evaluated first and stored in a global location
let prelude = import "std/prelude.glu"
prelude.make_Monad prelude.monad_IO
Then this expression can access the cached value.
let prelude = import "std/prelude.glu"
...
let monad = make_Monad_IO_global)
The bake macro would function a bit like import
in that it evaluates its argument once and caches the result in some globally accessible place. It becomes far trickier though as it would need to track from where all identifiers come from to be able to reconstruct a valid expression.
I may have been a bit to optimistic about this solution as taking it to the limit is more or less as much work as implementing a proper optimization to do this. It might be that this handles 90% of the cases though
This would allow the vm crate to be included and used separately though limited to only loading pre-compiled bytecode. For starters it doesn't need any checking for bytecode validity but eventually it would be useful to have safe loading as well.
As more features gets added to the REPL it would be nice to improve the testing for it. The REPL can however be tricky to test as one needs to be able to send commands to the terminal programmatically. In python I found the pexpect library which seems to be able to do this. It is a bit of a heavy weight solution to take on a python dependency just for some tests though and it would really nice if a library like pexpect could be ported to RUST. I don't have a good estimate to how much work such a port would be but it might be an interesting project to do (which would likely be useful outside of gluon as well)!
Currently the precedence for common operators such as + are predefined in the parser which makes it possible to at least write arithmetic without any surprises. This won't scale for user defined operators however so language construct needs to be added which binds precedence to an operator.
The current line of thinking is to attach fixity to bindings and let them propagate where they are used as long as it is not ambigious.
let { (+) } =
let lassoc 3 (+) = ....
in { (+) }
in /* assoc is still 3 for + here */
Nested block comments (e.g. /* /* foo */ */
) are my favorite feature from Rust. It would be nice if Gluon supports them too.
There is still a lot of parts of gluon which lack documentation and while I will do my best to recognize and improve it where I can, it would be very helpful to get suggestions for which part I should prioritize. Opening a new issue for any needed documentation improvements is often still a good idea but consider this issue a place to add suggestions which do not seem to warrant a separate issue.
Nested patterns are not implemented but due to how verbose code can be otherwise this would probably be necessary.
Running a garbage collect requires that all roots are traversed to make it safe to use. This is not enforced currently and I only rely on being very careful when working directly with GcPtr<T>
.
Servo has some ways of combating this so it may be worth copying some of that.
Currently only a location is stored for each AST node indicating the first character of the expression. If instead a span where stored for each expression the error messages could be significantly be improved.
The current way of storing a location for each node is probably not needed as well as for many expressions the location/span could be calculated from the data it contains.
As let bindings also serve as an import mechanic it would be nice if record patterns allowed for unpacking all fields avoiding the need to explicitly list every field.
//Experimental syntax
let { * } = prelude.num_Int in 1 + 2 * 3 - 4
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.