Git Product home page Git Product logo

sharkdp / numbat Goto Github PK

View Code? Open in Web Editor NEW
845.0 5.0 33.0 10.81 MB

A statically typed programming language for scientific computations with first class support for physical dimensions and units

Home Page: https://numbat.dev

License: Apache License 2.0

Rust 92.19% HTML 2.31% JavaScript 1.77% Vim Script 0.23% Handlebars 2.22% Shell 0.30% CSS 0.83% PHP 0.13%
calculator physics programming-language statically-typed terminal-based units web-app

numbat's People

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

numbat's Issues

SI prefixes

This needs to be checked rather sooner than later. Does our "user-defined units" concepts fare well with SI prefixes that can come directly before the identifier? Or do we only support those prefixes on specified units (we probably should)? Or only on a small set of builtin units (that would be a bit sad)?

kilometer => prefix(kilo) · identifier(meter)
km => prefix(kilo) · identifier(meter)

Run all examples in unit tests

  • Run everything in examples and make sure it passes
  • Run everything in examples/errors and make sure it fails (with the right error)

Dimension annotations

Further expand the idea of dimension-annotations via : (or related syntax), allowing users to write:

let d: Length = 2 meter
let t: Time = 2 meter # error

# simple functions
kineticEnergy(m: Mass, v: Speed) -> Energy = 0.5 m v^2
kineticEnergy(5 gram, 100 km/h)
kineticEnergy(5 kg, 3 meter/second^2) # error

# generics?
let f<T>(x: T) -> T^2 = 2*x^2

Compile everything, then run

"overload" interpret_statements to compile everything first, the execute. Right now, we compile each statement separately and then immediately run the VM.

Interesting challenges

List of interesting challenges when implementing a parser, type checker and bytecode interpreter for scientific calculations with physical dimensions and units:

  • Parser
    • Precedence of unary minus compared with exponentiation depends on which side of the ^ we are on: -2^3 == -(2^3), but 2^-3 == 2^(-3).
    • Parsing SI prefixes. Problems with non-normal units (US customary, Imperial), only allow metric prefixes for SI units, long-and-short prefixes, …
    • Ambiguous edge cases:
      • ft: foot or femto tonne?
      • cd: candela or centi days?
    • 2e3 is a number but 2e is 2 × e.
    • Implicit multiplication
  • Type checker
    • The problem with exponentiation (see README)
    • Minor: asymmetry between add/sub and mul/div: the first two can produce type check errors, the latter can't.
    • Type inference as a system of equations
    • We need generics, otherwise we can not type something as simple as sqrt(x) = x^(1/2)
    • Polymorphic zero? meter + 0 should be fine. But general case can probably not be solved, e.g. meter + (second - second)
    • Rational exponents (eg definition of planck length)
  • Unit representation
    • Angles (are angles dimensionless, does deg decay to a scalar?, sin(scalar)?, sin(angle)?)
    • Units like cm/m or inch/meter which have the dimension of a scalar.
    • Units with additive offsets like °C or °D
    • Units with rational exponents
    • Formatting of units, e.g. J/(m²·K)
  • Evaluation
    • Prefixes need to be handled separately, otherwise you run into numerical problems when e.g. 1 / 2ns is being converted to GHz
    • Distinguish between decimal, rational, integer?
    • Simplification of units kg m²/s² back to J. How are unit system boundaries treated?
    • Exact conversions between units?
  • Good error messages
    • Incompatible dimensions/units
  • Features
    • User-defined units

Functions: FFI

Allow calling of foreign functions (Rust code).

extern fn sqrt<T>(x: T) -> T^(1/2)
extern fn sqr<T>(x: T) -> T^2
extern fn sin(x: Angle) -> Scalar

Modules / Imports

Allow for imports such that we can structure the Prelude into multiple documents (e.g. separate maths from physics to allow users to easily construct custom preludes).

Do we want to go for a simple "include this file" mechanism or implement something cleaner like an actual module system with namespaces, etc.?

Advanced conversion operations

users might also like to explicitly convert to:

  • a base SI representation, like kg m³ / s², e.g. expr -> base
  • a shortest-possible representation, e.g. J·m e.g. expr -> shortest
  • to the canonical representation (see also #3), e.g. expr -> canonical

Do we also need/want to allow for conversions to dimensions? is something like expr -> Energy useful? Users could always add type annotations, i.e. use sth like expr: Energy to make sure that something is an energy.

see #3 for implicit conversions

Constants and variables

It would be cool to define constants like pi or e in Numbat code itself (like we currently do in the prelude). That would require us to introduce the concept of constants / immutable variables.

WASM

As early as possible, make sure that we can build everything with WASM. We shouldn't rely on dependcies/technologies that don't work with WASM.

Syntax for defining unit properties

We need a way to declare whether or not units allow prefixes. And we need a way to add aliases.

Ideas:

unit meter[si_prefixes,short="m"]: Length

#[si_prefixes,short="m"]
unit meter: Length

or we always use new declarations for aliases:

unit minute = …
unit minutes = minute
unit min = minute

Functions: first iteration

  • Function declarations
  • Local variables?
  • Arity errors

Things to think about:

  • Mathematica-style function calls, super convenient for repls: 4.0 // sqrt.
  • should foo(x) always be parsed as a function call? or could it be an implicit multiplication?

Unit conversions

  • Allow for non-standard units like inch
  • Implement conversion operator

Functions: generics / parametric polymorphism

Add support for generic functions, to allow things like

fn sqr(x) = x^2

Syntax

We can go the Rust route and do something like

fn sqr<T>(x: T) -> T^2 = x^2

Or we could add a special syntax for type variables, e.g. $T:

fn sqr(x: $T) -> $T^2 = x^2

Or go the Haskell way and use lowercase expressions as variables:

fn sqr(x: d) -> d^2 = x^2

that could work well with inferred types (#29):

f(x: d0, y: d1, z: d2) -> dR = …

Functions: recursion

In order for recursion to make any sense, we will probably need some boolean types, comparison operations, and maybe a ternary operator or an if <cond> then <expr1> else <expr2> expression.

Functions: type inference

It would be very cool if we could infer function parameter types and return types. For example, given hypothetical code like

fn g(t: Time) -> Speed = …

fn f(d, t) = (d + 1 meter) / g(t)

we could easily infer from d + 1 meter that d: Length, and from g(t) that t: Time. Furthermore, we could compute the return type to be Length / Speed =Time, leading to an inferred type of

fn f(d: Length, t: Time) -> Time

This also relates to generics (#25), as some (most?) functions would probably have inferred generic types. For example, in the absence of any annotations,

kineticEnergy(m, v) = 1/2 * m * v^2

would have an inferred type of

kineticEnergy<T0, T1>(m: T0, v: T1) -> T0 * T1^2 = 1/2 * m * v^2

Possible constraints

  • Addition, Subtraction, Conversion: these binary operators act as constraints on their operands. For example, an addition like
    expr1 + expr2
    
    imposes a simple [expr1] == [expr2] constraint. Similarly, expr1 -> expr2 also forces expr1 to have the same dimension as expr2.
  • Function application: A function call like
    f(expr1, expr1, …, exprN)
    
    imposes N constraints, one for each argument: [exprK] == [parameter k of f]. A simple example would be something like
    fn normalize_len(l: Length) -> Length = …
    fn f(d, alpha) = normalize_length(d) * sin(alpha)
    
    where we could use the function calls to easily get the types of d and alpha.
  • Exponentiation: Since the exponent in an expression like
    expr1 ^ expr2
    
    always needs to be a scalar, we can infer [expr2] == 1. In general, we can not assume that expr1 is also a scalar (meter^2 is perfectly fine, for example). So there is also no constraint for expr1.

Formalization

Given a (partially annotated) function definition like one of these (Cx is a concrete type):

f(x0, x1, x2, …) = …
f(x0, x1: C1, x2, …) = …
f(x0, x1, x2, …) -> C_ret = …
f(x0, x1: C1, x2, …) -> C_ret = …
…

We start by filling in blanks with free type variables T0, T1, … and T_ret:

f(x0: T0, x1: T1, x2: T2, …) -> T_ret = …
f(x0: T0, x1: C1, x2: T2, …) -> T_ret = …
f(x0: T0, x1: T1, x2: T2, …) -> C_ret = …
f(x0: T0, x1: C1, x2: T2, …) -> C_ret = …
…

Next, we walk the AST of the right hand side expression and record all constraints that we discover (see above). For example, given a declaration like f(x0, x1, x2) = sqrt((x0 · x2² + second) · second / x1²) -> meter, we would identify:

  • T0 · T2^2 == Time (from x0 · x2² + second)
  • ((T0 · T2^2) · Time / T1²)^(1/2) == T0^(1/2) · T1 · T2 · Time^(1/2) == Length (from the -> conversion operator)

These constraints would then have to be converted to a system of linear equations on the exponents.

  1 * exponents0 + 0 * exponents1 + 2 * exponents2 == (1, 0)
1/2 * exponents0 + 1 * exponents1 + 1 * exponents2 == (1, -1/2)

where exponents_i and the right hand sides would be vectors with one rational number entry for each of the D physical dimensions. In total, this is a set of N_constraints · D equations, where the left-hand side will be the same regardless of the physical dimension.

Examples

Unique solution

Example 1

g(t: Length) -> Length = …
f(x, y, z) = g(sqrt(x)/y) * (y + 2 * second) * 2^z
  • From 2^z, we infer [z]: 1.
  • From y + 2 * second, we infer [y]: Time
  • From g(sqrt(x)/y), we infer sqrt(x)/y: Length, i.e. sqrt(x): Length · Time or x: Length² · Time².
  • Finally, there return type can be computed as Length · Time
f(x: Length² · Time², y: Time, z: 1) -> Length · Time = g(sqrt(x)/y) * (y + 2 * second) * 2^z

Example 2

f(x) = x + x^2

should be inferred as

f(x: Scalar) -> Scalar

Example 3

(from https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-391.pdf)

f(x,y,z) = x*x + y*y*y*y*y + z*z*z*z*z*z

should ideally be inferred as

f<T>(x: T^15, y: T^6, z: T^5) -> T^30

Underdetermined

For a function definition like

f(x, y) = x + y^3

we could infer one of those (equivalent) types:

f(x: T, y: T^(1/3)) -> T = x + y^3
f(x: T^3, y: T) -> T^3 = x + y^3

Overdetermined

A function like

f(x) = (x + meter)/(y + second)

should lead to a type check error.

Canonical unit for dimension

With our new dimension concept, we can define a canonical unit for a certain dimension. For example: meter: Length, joule: Energy. Or if someone uses Imperial, ft: Length. Then, if the user doesn't ask for an explicit conversion, we can figure out the target dimension and use the canonical unit.

This will allow us to solve sharkdp/purescript-quantities#27

Currencies

Maybe we could have currencies as a separate module that needs to be imported. Something like

import currencies

30 € -> THB

Internally, that module would fetch currency data and define all those currencies.

Introduce optimization stage

After type checking, we could introduce an optimization stage in the compiler that could - in theory - perform a lot of optimizations, as most (so far: all) of the code is pure and could be evaluated at compile time.

There are also some simple optimizations that would generate cleaner byte code, like:

  • Negate(Constant(n)) => Constant(-n)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.