Git Product home page Git Product logo

minilex's Introduction

MiniLex

Minilex is a very simple lexical analyzer generator. Minilex itself is written in Python, but it generates a completely portable analyzer in C++.

Compared to other lexical analyzer generators, Minilex is not very advanced. However, that is partially by design. The goal of this is to have a simple generator with a lower learning curve than that of Flex or Lex, and unlike Flex, it is meant to be easier to integrate with build systems.

I created this because I was tired of manually copying and pasting my lexical analyzer for each new project :) The one I originally created was working rather well, but it was tedious to do it manually, so I thought, why not just create a Python script for this?

Usage

Short answer: minilex [output directory] [base code]

The input is the only required parameter; this parameter is defines the lexical structure of keywords, sybmols, and comments. The base code is where the provided lexical base is located. The output directory is wherever you want the code.

Note that you can generate a config file (see config.py) that contains the default output and base code paths.

Creating a Lex file

Creating a lex file is very easy. Take this for example:

"func" = Func
"is" = Is
"end" = End
"var" = Var
"return" = Return

';' = SemiColon
'=' = Assign
':' = Colon
':=' = Assign2

@comment_line = #
@comment_line = //
@comment_block = /* */

While the ordering does not matter, the example is organized to define keywords, symbols, and comments. Keywords are defined with quotation marks. Symbols are defined with single quotes. However, please note that symbols can be composed of multiple characters. For both symbols and keywords, the left value is always the textual element, and the right value corresponds to values in the TokenType enumeration.

To create comments, use "@comment_line" to define single line comments, and "@comment_block" to define multi-line comments. For multi-line comments, a space separates the beginning and ending tags. Note that the beginning and ending tags can be different lengths. However, patterns cannot be reused.

Literals

The parser by default will scan integer, hex, string, and character literals. Integers are defined as normal numbers, hex values are defined as 0x[0-9, A-F], string literals are defined as any character between two double quotation marks, and character literals are defined as a single character between two single quotation marks. Any text that is not a keyword or pre-defined literal is returned as an identifier.

NOTE: I may change this behavior later to provide some control over the literals.

Generated Code

Three files are generated: "lex.cpp", "lex.hpp", and "lex_debug.cpp". The header is the best way to understand how to use the lexical analyzer. It contains the definitions of the Token class, the available token types, and the scanner class. The "lex.cpp" file implements these definitions, and the "lex_debug.cpp" class implements a debug function for visualizing the lexical analyzer.

You should be able to incorporate these three files directly into your project without any issues. You will need to add a dependency to your build system to run minilex, but other than that, no special care is needed.

minilex's People

Contributors

pflynn157 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.