Git Product home page Git Product logo

example-functional-compiler's Introduction

flub-example-compiler

This project is an example compiler frontend for a straightforward, statically-typed, functional language called Flub (functional Blub). It should illustrate an architecture for "real-world" language demands (ranges, error reporting, actual versus expected, etc.) while still being readable over a weekend.

Flub is syntactically a mashup between ML, Scala, and Haskell. It supports parametric polymorphism, higher-kinded types, type-in-type, recursive-let, and not much else.

Running the compiler

First, run the build:

spago build

Output formatted core:

node index.js example.flub

Normalize the "module" (the last declared expression):

node index.js example.flub -n

Walkthrough

Lexing

The lexer parses the non-recursive syntax (tokens). Lexical parsing is generally very straightforward, but is where the majority of parsing time is spent. Separating the language parser from the lexer, while optimizing the lexer, is a great way to improve overall parsing performance.

This lexer is not optimized, but illustrates how one can implement a lazy token stream with accurate source annotations for tokens. Lazy token streams have a nice property of only parsing on demand, while also sharing work if the language parser must backtrack.

Parsing

The syntax tree is the concrete language syntax. Every token is represented in the tree, and every token is annotated with positions, comments, and whitespace. This means our original source is fully represented in this tree, and can be printed back out exactly. This is a nice property since it means we can implement transformations on input syntax while potentially retaining the original formatting.

Elaboration

The elaborator takes the concrete syntax, and transforms it into an internal core language (known as System Fฯ‰). In Core, all polymorphism is represented as explicit type-abstractions and type-applications. Additionally, every term and argument is assigned an explicit type. Unification variables are used to track unknown types, which we later solve as part of type-checking.

While elaborating into Core, we also emit type-checking constraints, which assert equalities between types. At specific points, we will then invoke the solver, which processes these constraints and yields a substitution (solutions) for our unknowns. Applying the substitution will yield Core without any unknowns.

Solving

The solver takes pending equality constraints from the elaborator and attempts to solve them one-by-one through a process called unification. Unification walks structurally over the two operands, and when a unification variable meets some other type, records it as a solution. Types that are not equivalent result in an error.

Normalization

Normalization reduces a term until it can't be reduced anymore. Generally, System F is strongly normalizing (always reduces in finite time), but because we allow recursive bindings, it's possible for normalization to loop forever.

In this implementation, it's done through a process called normalization-by-evaluation. Terms are evaluated against their closure environment, and then syntax is reifed from that environment.

Printing

It's always nice to see our output in a human-readable manner. The most confusing part about pretty-printing is handling fixity and precedence such that parentheses are inserted in the appropriate places.

There are many ways to tackle this problem, but one of the more straightforward ways is through an intermediate data structure that annotates syntax with its fixity.

example-functional-compiler's People

Contributors

natefaubion avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.