
javaccc's Introduction

JavaCCC

A set of tools for Java Compiler Compiler™ (JavaCC™) made in Python.

sample

Requirements

These requirements are necessary for the project so please make sure to have them installed on your machine.

Getting started

Run python src -h to get help.

$ python src -h
usage: src [-h] [-i INPUT_FILE] [-o OUTPUT_DIR] [--optimize] [-s SIZE]

A set of tools for Java Compiler Compiler™ (JavaCC™) made in python.

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT_FILE, --input-file INPUT_FILE
                        an input csv file
  -o OUTPUT_DIR, --output-dir OUTPUT_DIR
                        an output directory
  --optimize            optimize automatas (WARNING the optimization process will take A LONG TIME)
  -s SIZE, --size SIZE  image size (default 10)

This project requires a CSV file with the token name as the first field and its regular expression as the second field. You can easily create this file in Excel or Google Sheets, like this:

sample-file
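
If you would rather create the file from a script than from a spreadsheet, the sketch below writes a two-column CSV with Python's standard `csv` module and reads it back. The token names and regular expressions are hypothetical examples. Whether the tool expects a header row or a particular regex dialect is not stated here, so treat the exact layout as an assumption.

```python
import csv
import io

# Hypothetical token definitions: (token name, regular expression).
TOKENS = [
    ("IF", '"if"'),
    ("DIGIT", '["0" - "9"]'),
    ("VOWEL", '["a", "e", "i", "o", "u"]'),
]


def write_tokens(rows, stream):
    """Write one token per row: name first, regular expression second."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerows(rows)


def read_tokens(stream):
    """Read (name, regex) pairs back, skipping blank rows."""
    return [(row[0], row[1]) for row in csv.reader(stream) if row]


# Round trip through an in-memory buffer to check that quoting survives.
buffer = io.StringIO()
write_tokens(TOKENS, buffer)
buffer.seek(0)
round_trip = read_tokens(buffer)
```

To produce a real file, pass an opened file instead of the buffer, for example `open("Words.csv", "w", newline="")`. The `csv` module takes care of escaping the commas and double quotes inside the regular expressions.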

After this you can use the command python src -i INPUT_FILE -o OUTPUT_DIRECTORY:

# run command
python src -i Words.csv -o output

The images should be generated in the output directory.

You can optimize the automata, but it may take a LONG TIME.

# run command with optimization
python src -i Words.csv -o output --optimize

You can also change the image size of the automata (the default is 10).

# run command with different size
python src -i Words.csv -o output -s 40

javaccc's People

Contributors

papermonoid

javaccc's Issues

Regex character class compilation

What should a regex character class be compiled to?

A regex character class is a set of valid characters for an expression. It is essentially a chain of or expressions. For example:

["a", "e", "i", "o", "u"] = "a" | "e" | "i" | "o" | "u"

These two expressions are equivalent, so one can replace the other. Character classes do make several things more convenient, though, such as ranges:

["a" - "z"] = "a" | "b" | "c" | "d" | ... | "z"
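
The two equivalences above can be sketched in Python as a plain expansion: each range becomes every character between its endpoints (by code point), and the resulting list is joined into an or expression. The function names and the tuple representation of a range are assumptions made for this sketch, not part of the project.

```python
def expand_range(first, last):
    """Return every character from first to last, inclusive, by code point."""
    if ord(first) > ord(last):
        raise ValueError(f"empty range: {first!r} - {last!r}")
    return [chr(code) for code in range(ord(first), ord(last) + 1)]


def expand_class(items):
    """Flatten a class given as single characters and (first, last) tuples."""
    chars = []
    for item in items:
        candidates = expand_range(*item) if isinstance(item, tuple) else [item]
        for ch in candidates:
            if ch not in chars:  # keep the first occurrence, drop duplicates
                chars.append(ch)
    return chars


def to_alternation(chars):
    """Render a list of characters as a quoted or expression."""
    return " | ".join(f'"{ch}"' for ch in chars)


vowels = to_alternation(expand_class(["a", "e", "i", "o", "u"]))
lowercase = expand_class([("a", "z")])
```

Here `vowels` is the string `"a" | "e" | "i" | "o" | "u"` and `lowercase` holds the 26 letters from `a` to `z`.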

Regex character classes also allow taking the complement of a set. For example, the following expression matches everything BUT digits:

~["0" - "9"] = ???
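
One possible answer, as a sketch: if the alphabet is finite, the complement is every character of the alphabet that is not in the set, so it can again be written as an or expression. The 128-character ASCII alphabet below is an assumption chosen to keep the example small; the alphabet a real lexer uses may be much larger.

```python
def complement(chars, alphabet):
    """Return the characters of alphabet that are not in chars, in order."""
    excluded = set(chars)
    return [ch for ch in alphabet if ch not in excluded]


# Assumed alphabet for this sketch: the 128 ASCII code points.
ASCII = [chr(code) for code in range(128)]

digits = [chr(code) for code in range(ord("0"), ord("9") + 1)]
non_digits = complement(digits, ASCII)
```

With this alphabet, `non_digits` has 118 entries. The size of that list hints at why writing a complement out as an explicit or expression gets impractical as the alphabet grows.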

In our compilers class we were taught how to build an automaton for the or expression, but nothing about character classes.

or-automata
These are my notes on the or automaton.

Most character class features could be implemented as a chain of or expressions, but the correct translation for the other expressions, such as ranges and set complements, is unclear.
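
As one possible direction, here is a sketch of a character class as a two-state automaton with one transition per character. It assumes that the parallel single-character branches of an or expression can be collapsed into direct transitions between one start state and one accepting state, which is not necessarily the construction from the class notes. The `Automaton` class and its methods are hypothetical and do not come from this project.

```python
from dataclasses import dataclass, field


@dataclass
class Automaton:
    start: int = 0
    accept: int = 1
    # Maps (state, character) to the set of states reachable on that character.
    transitions: dict = field(default_factory=dict)

    def add(self, source, char, target):
        self.transitions.setdefault((source, char), set()).add(target)

    def matches(self, text):
        """Simulate the automaton and report whether it accepts text."""
        current = {self.start}
        for ch in text:
            following = set()
            for state in current:
                following |= self.transitions.get((state, ch), set())
            current = following
        return self.accept in current


def char_class_automaton(chars):
    """Build start --ch--> accept for every character in the class."""
    automaton = Automaton()
    for ch in chars:
        automaton.add(automaton.start, ch, automaton.accept)
    return automaton


vowel = char_class_automaton("aeiou")
```

Ranges and complements fit the same shape once they are expanded to a list of characters: only the set of labels between the two states changes, not the number of states.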

What should be done?

I've read this article about email regex, and it has a diagram showing how character classes COULD BE implemented, so I'm leaving it here as a suggestion:
email-regex
