nimly

Lexer Generator and Parser Generator as a Macro Library in Nim.

With nimly, you can build a lexer and a parser by writing definitions in a format similar to lex/yacc. nimly generates the lexer and parser with macros at compile time, so you can use it as a library inside your program rather than as an external tool.

niml

niml is a macro to generate a lexer.

macro niml

The niml macro builds a lexer. Almost all of the lexer construction is done at compile time. An example follows.

## This makes a LexData object named myLexer.
## This lexer returns value with type ``Token`` when a token is found.
niml myLexer[Token]:
  r"if":
    ## this part is converted into a proc body.
    ## the argument is (token: LToken).
    return TokenIf()
  r"else":
    return TokenElse()
  r"true":
    return TokenTrue()
  r"false":
    return TokenFalse()
  ## you can use ``..`` instead of ``-`` in ``[]``.
  r"[a..zA..Z\-_][a..zA..Z0..9\-_]*":
    return TokenIdentifier(token)
  ## you can define ``setUp`` and ``tearDown`` function.
  ## ``setUp`` is called from ``open``, ``newWithString`` and
  ## ``initWithString``.
  ## ``tearDown`` is called from ``close``.
  ## an example is ``test/lexer_global_var.nim``.
  setUp:
    doSomething()
  tearDown:
    doSomething()

The metacharacters are as follows:

  • \: escape character
  • .: matches any character
  • [: start of a character class
  • |: alternation (or)
  • (: start of a subpattern
  • ): end of a subpattern
  • ?: 0 or 1 times quantifier
  • *: 0 or more times quantifier
  • +: 1 or more times quantifier
  • {: {n,m} is the at least n and at most m times quantifier

Inside [], the metacharacters are as follows:

  • \: escape character
  • ^: negate character (only in first position)
  • ]: end of this class
  • -: specify character range (.. can be used instead of this)

Each of the following is recognized as a character set.

  • \d: [0..9]
  • \D: [^0..9]
  • \s: [ \t\n\r\f\v]
  • \S: [^ \t\n\r\f\v]
  • \w: [a..zA..Z0..9_]
  • \W: [^a..zA..Z0..9_]
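
As a reading aid, the shorthand classes above can be written out as ordinary Nim character sets. This is only a sketch using the language's built-in set[char]; it does not use nimly or show how nimly represents these classes internally, and the constant names are hypothetical.

```nim
# Illustration only: plain Nim character sets that mirror the shorthand
# classes listed above. The constant names are made up for this sketch.
const
  digitChars = {'0'..'9'}                            # \d
  spaceChars = {' ', '\t', '\n', '\r', '\f', '\v'}   # \s
  wordChars  = {'a'..'z', 'A'..'Z', '0'..'9', '_'}   # \w

# The upper-case forms (\D, \S, \W) match exactly the characters
# that are NOT in the corresponding set.
doAssert '7' in digitChars
doAssert 'x' notin digitChars
doAssert '\t' in spaceChars
doAssert '_' in wordChars
doAssert '-' notin wordChars
```

Note that - is not part of \w, which is why the identifier pattern in the niml example lists \- explicitly.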

nimy

nimy is a macro to generate a LALR(1) parser.

macro nimy

The nimy macro builds a parser. Almost all of the parser construction is done at compile time. An example follows.

## This makes a LexData object named myParser.
## The first clause is the top level of the BNF.
## This parser receives tokens of type ``Token``, and each token must have a
## field ``kind`` whose type is the enum ``[TokenTypeName]Kind``.
## This is naturally satisfied when you use ``patty`` to define the token.
nimy myParser[Token]:
  ## the starting non-terminal
  ## the return type of the parser is ``Expr``
  top[Expr]:
    ## a pattern.
    expr:
      ## proc body that runs when the pattern consisting of a single ``expr`` is parsed.
      ## $1 refers to the first position of the pattern (expr)
      return $1

  ## non-terminal named ``expr``
  ## with returning type ``Expr``
  expr[Expr]:
    ## first pattern of expr.
    ## ``LPAR`` and ``RPAR`` are TokenKind values.
    LPAR expr RPAR:
      return $2

    ## second pattern of expr.
    ## ``PLUS`` is a TokenKind value.
    expr PLUS expr:
      return $2

You can use the following EBNF constructs:

  • XXX[]: Option (0 or 1 XXX). The type is seq[xxx] where xxx is type of XXX.
  • XXX{}: Repeat (0 or more XXX). The type is seq[xxx] where xxx is type of XXX.

An example of these is in the next section.

Example

tests/test_readme_example.nim is a simple example.

import unittest
import patty
import strutils

import nimly

## variant is defined in patty
variant MyToken:
  PLUS
  MULTI
  NUM(val: int)
  DOT
  LPAREN
  RPAREN
  IGNORE

niml testLex[MyToken]:
  r"\(":
    return LPAREN()
  r"\)":
    return RPAREN()
  r"\+":
    return PLUS()
  r"\*":
    return MULTI()
  r"\d":
    return NUM(parseInt(token.token))
  r"\.":
    return DOT()
  r"\s":
    return IGNORE()

nimy testPar[MyToken]:
  top[string]:
    plus:
      return $1

  plus[string]:
    mult PLUS plus:
      return $1 & " + " & $3

    mult:
      return $1

  mult[string]:
    num MULTI mult:
      return "[" & $1 & " * " & $3 & "]"

    num:
      return $1

  num[string]:
    LPAREN plus RPAREN:
      return "(" & $2 & ")"

    ## float (integer part is 0-9) or integer
    NUM DOT[] NUM{}:
      result = ""
      # type of `($1).val` is `int`
      result &= $(($1).val)
      if ($2).len > 0:
        result &= "."
      # type of `$3` is `seq[MyToken]` and each element is a NUM
      for tkn in $3:
        # type of `tkn.val` is `int`
        result &= $(tkn.val)

test "test Lexer":
  var testLexer = testLex.newWithString("1 + 42 * 101010")
  testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

  var
    ret: seq[MyTokenKind] = @[]

  for token in testLexer.lexIter:
    ret.add(token.kind)

  check ret == @[MyTokenKind.NUM, MyTokenKind.PLUS, MyTokenKind.NUM,
                 MyTokenKind.NUM, MyTokenKind.MULTI,
                 MyTokenKind.NUM, MyTokenKind.NUM, MyTokenKind.NUM,
                 MyTokenKind.NUM, MyTokenKind.NUM, MyTokenKind.NUM]

test "test Parser 1":
  var testLexer = testLex.newWithString("1 + 42 * 101010")
  testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

  var parser = testPar.newParser()
  check parser.parse(testLexer) == "1 + [42 * 101010]"

  testLexer.initWithString("1 + 42 * 1010")

  parser.init()
  check parser.parse(testLexer) == "1 + [42 * 1010]"

test "test Parser 2":
  var testLexer = testLex.newWithString("1 + 42 * 1.01010")
  testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

  var parser = testPar.newParser()
  check parser.parse(testLexer) == "1 + [42 * 1.01010]"

  testLexer.initWithString("1. + 4.2 * 101010")

  parser.init()
  check parser.parse(testLexer) == "1. + [4.2 * 101010]"

test "test Parser 3":
  var testLexer = testLex.newWithString("(1 + 42) * 1.01010")
  testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

  var parser = testPar.newParser()
  check parser.parse(testLexer) == "[(1 + 42) * 1.01010]"

Install

  1. nimble install nimly

Now you can use nimly by writing import nimly.

vmdef.MaxLoopIterations Problem

While compiling a lexer or parser, you may encounter the error "interpretation requires too many iterations". You can avoid it by raising the limit with the compiler option maxLoopIterationsVM:N, which is available since Nim v1.0.6.

See #11 for details.
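
One way to apply the option to every build is a config.nims file in the project directory. The following is a minimal sketch; the limit shown is an arbitrary illustrative value, not a figure recommended by nimly, so adjust it to whatever your grammar needs.

```nim
# config.nims (NimScript), picked up automatically by the Nim compiler.
# The value is an assumed example; raise it if the error persists.
switch("maxLoopIterationsVM", "100000000")
```

The same setting can be passed once on the command line as --maxLoopIterationsVM:100000000.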

Contribute

  1. Fork this
  2. Create new branch
  3. Commit your change
  4. Push it to the branch
  5. Create new pull request

Changelog

See changelog.rst.

Developing

You can define nimldebug and nimydebug as conditional symbols to print debug info.

example: nim c -d:nimldebug -d:nimydebug -r tests/test_readme_example.nim
