Git Product home page Git Product logo

fauton's Introduction


Fauton logo

Fauton

An ecosystem of packages to work with automaton and parsers (dfa/nfa/e-nfa/regex/cfg/pda)

Features

  1. Full typescript support
  2. Easy to use api
  3. High test coverage
  4. Supports both node and browser environment (except a few packages)
  5. Well documented with examples

Packages

  • @fauton/cfg Github NPM: A package to work with cfg
  • @fauton/fa Github NPM: A package to work with finite automata
  • @fauton/regex Github : A package to work with regex validation and parsing
  • @fauton/testing Github NPM : A package to test your automaton (regex/dfa/nfa/e-nfa/cfg)
  • @fauton/language Github : A package to generate language from a given set of tokens

Algorithm Sources

Wikipedia sources for all the algorithms used in the package

  1. Thompson-McNaughton-Yamada algorithm for converting regex to e-nfa
  2. Hopcroft algorithm for dfa-minimization
  3. Rabin–Scott powerset construction algorithm to convert nfa to dfa
  4. Shunting-Yard algorithm to convert regex string from infix to postfix
  5. Chomsky Normal Form Algorithm to make parsing a string easier
  6. Cocke–Younger–Kasami Parsing algorithm using a CFG
  7. Earley Parser algorithm for parsing strings that belong to a given context-free language
  8. LL parser a top-down parser for a restricted context-free language

Credits

Big thanks to all these wonderful repos.

  1. Orban Regular expression engine that uses the Thompson-McNaughton-Yamada algorithm implemented in Python.
  2. CFGChecker A program that cross references a context free grammar with a given language.
  3. CFG Epsilon Removal A detailed article on how to remove epsilon from CFG
  4. python-formal-langs-practicum-automata-cfg Automata, Context-free-grammar classes (implementation of CYK algorithm, converting grammar to Chomsky normal form, Thompson algo for building automaton from regex, etc.)
  5. earley-parser-js Tiny JavaScript implementation of context-free languages parser - Earley parser (including generation of the parsing-forest).
  6. probabilistic-earley-parser-javascript An efficient implementation of a probabilistic Context Free Grammar parser in Javascript
  7. https://github.com/caleb531/automata A Python library for simulating finite automata, pushdown automata, and Turing machines

Contributors

  1. Safwan Shaheer github Author, Maintainer

Feel free to submit a pull request or open a new issue, contributions are more than welcome.

fauton's People

Contributors

dependabot[bot] avatar devorein avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

fauton's Issues

Remove test logic when creating automaton

Test logic should not be a part of automaton, as it's only required when testing. Furthermore, if we want to test our automaton based on a different logic we need to create a completely new automaton. It should be removed when creating an automaton and provided only when testing an automaton

Validate inputs for config, when testing automata

Right now there is no validation for Automata testing configs.

const finiteAutomataTest = new FiniteAutomataTest();
finiteAutomataTest.test([
	{
		automaton: null,
		options: {
			type: 'generate',
			range: {
				maxLength: 10,
			},
			outputFiles: {
				input: false,
				rejected: false,
				incorrect: false,
			},
		},
	},
]);
  1. Not passing the logs directory should throw an error.
  2. Not passing a valid automaton should throw an error
  3. Not passing type for options would throw an error
  4. Not passing file, generate or custom for options.type would throw an error.
  5. For file, generate or custom options.type the relevant properties must be present
  6. When options.type is generate and random we should check whether total number of unique random numbers can be generated from the given minLength and maxLength

Caching and batching generated input strings for testing fa's with same parameters

Example with generate option

const { FiniteAutomataTest} = require('fauton');

const finiteAutomataTest = new FiniteAutomataTest(path.join(__dirname, 'logs'));

finiteAutomataTest.test([
	{
		automaton: startsWithBC,
		options: {
			type: 'generate',
			range: {
				maxLength: 10,
			},
		},
	},
	{
		automaton: doesntStartWithBC,
		options: {
			type: 'generate',
			range: {
				maxLength: 12,
			},
		},
	},
	{
		automaton: startsWith01,
		options: {
			type: 'generate',
			range: {
				maxLength: 10,
			},
		},
	},
]);

For testing the first automaton, we generate all the combinations of the alphabet, from a length of 1 to 10, in the 2nd test we use another automaton having the same alphabets to generate the same combinations albeit with 2 increased lengths. This is wasted computation as we could've easily used the inputs of the first automaton and only generate a few more for the 2nd one. Caching the previous result would greatly help in this regard. But if the alphabets are different it should generate them again rather than using cached results from previous computations

Example with file option

const { FiniteAutomataTest} = require('fauton');

const finiteAutomataTest = new FiniteAutomataTest(path.join(__dirname, 'logs'));

finiteAutomataTest.test([
	{
		automaton: startsWithBC,
		options: {
			type: 'generate',
			filePath: __dirname + '/input1.txt',
		},
	},
	{
		automaton: doesntStartWithBC,
		options: {
			type: 'file',
			filePath: __dirname + '/input1.txt',
		},
	},
	{
		automaton: startsWith01,
		options: {
			type: 'file',
			filePath: __dirname + '/input2.txt',
		},
	},
]);

the same issue as above only difference is that we are creating file streams for each of the input files even if they are the same file. Batching would greatly help improve performance in this regard.

Note that for random we should not batch or cache anything as it's completely randomly generated.

Mark final states in html generated by d3

const { NonDeterministicFiniteAutomaton, Render } = require('fauton');
const path = require('path');

const nfa = new NonDeterministicFiniteAutomaton((_, automatonTestResult) => automatonTestResult, {
  label: 'NFA 1',
  final_states: ['D'],
  start_state: 'A',
  alphabets: ['0', '1'],
  states: ['A', 'B', 'C', 'D'],
  transitions: {
    A: ['A'],
    B: ['D', 'C'],
    D: [null, 'D'],
  },
  epsilon_transitions: {
    A: ['C'],
    C: ['B'],
    D: ['C'],
  },
});
const { graph } = nfa.generateGraphFromString('1101');
Render.graphToHtml(graph, path.join(__dirname, 'index.html'));

For a d3 html graph generated from a nfa, no final states are marked. The above code would generate the following graph
image
But no final states are marked at the leaf nodes.

Check that for dfa all transitions must have all the alphabets

const { DeterministicFiniteAutomaton } = require('fauton');

const startsWithBC = new DeterministicFiniteAutomaton(
	(inputString) => inputString.startsWith('bc'),
	{
		alphabets: ['a', 'b'],
		transitions: {
			A: ['A'],
			B: [null, 'B'],
		},
	}
);

The DFA accepts 2 alphabets a and b but in the transition map for state A, it only indicates where to go when we encounter the symbol a but not for b. This should not be the case for dfa. Even in the 2nd state B, it's gonna transition to null state for symbol a, which is also a violation for DFA. Every state must have an array of state strings that have an equal length as that of the alphabets to ensure we cover all cases. An error should be thrown in this regard

Testing regex rather than only automata

Right now we are only able to test automata rather than literal regex. I would suggest having a regex class called RegularExpression just like DeterministicFiniteAutomata so that we can interchange between them and test them out easily.

Shortened state string for nfa transition record array item values

const nfa1 = new NonDeterministicFiniteAutomaton(()=> true, {
	transitions: {
		alphabets: ['a', 'b', 'c']
		A: ['A-Z', ['A-D', 'F', '0-5']],
	}
});

When we are in state A :-

  1. we encounter symbol 'a' move to all states from A to Z
  2. we encounter symbol 'b' move to all states from A to D, F and 0 to 5

Check that DFA can have only 1 state for a symbol

Right now when creating a dfa the transition map allows a state to have an array of states for a symbol. For example

const { DeterministicFiniteAutomaton } = require('fauton');
const path = require('path');

const startsWithBC = new DeterministicFiniteAutomaton(
	(inputString) => inputString.startsWith('bc'),
	{
		alphabets: ['a', 'b'],
		transitions: {
			// Passing an array of states for symbol 'a', but it can only be an individual state for a dfa
			// array of states is only valid for a nfa
			A: [['B', 'A'], 'A'],
			B: ['A', 'B'],
		},
	}
);

Ideally, the user should get an error message when this issue occurs as this is an invalid dfa. At this moment of the package 0.0.3, this issue doesn't throw an error and acts as if the automaton is a nfa

Test two automaton for equivalence rather than using only a function

There should be an option to test two separate finite automatons rather than only using a function. For example

// Not passing constructor arguments for sake of brevity
const dfa1 = new DeterministicFiniteAutomaton();
const dfa2 = new DeterministicFiniteAutomaton();

const finiteAutomataTest = new FiniteAutomataTest();
finiteAutomataTest.test([
	{
		automatons: [dfa1, dfa2]
		options: {
			type: "generate",
			// ...
		}
	}
])

This would test the two automatons together, rather than using their logic test functions.
This would also somewhat solve the issue for batching and caching input strings as we can now test multiple automatons using similar inputs.

Two merged dfa must have the same alphabets

const { DeterministicFiniteAutomaton } = require('fauton');

const DivisibleBy3DFA = new DeterministicFiniteAutomaton(
	(randomBinaryString) => {
		return parseInt(randomBinaryString, 2) % 3 === 0;
	},
	{
		label: 'DFA 1',
		alphabets: ['1', '0'],
		description: 'Divisible by 3',
		final_states: ['A'],
		start_state: 'A',
		states: ['A', 'B', 'C'],
		transitions: {
			A: ['A', 'B'],
			B: ['C', 'A'],
			C: ['B', 'C'],
		},
	}
);

const startsWithBC = new DeterministicFiniteAutomaton(
	(inputString) => inputString.startsWith('bc'),
	{
		alphabets: ['a', 'b', 'c'],
		description: 'Starts with bc',
		final_states: ['Q3'],
		label: 'starts_with_bc',
		start_state: 'Q0',
		states: ['Q0', 'Q1', 'Q2', 'Q3'],
		transitions: {
			Q0: ['Q2', 'Q1', 'Q2'],
			Q1: ['Q2', 'Q2', 'Q3'],
			Q2: 'loop',
			Q3: 'loop',
		},
	}
);

const DivisibleBy3OrStartsWithBC= DivisibleBy2DFA.OR(startsWithBC );

This should fail with an appropriate message, but the error message generated is extremely vague and difficult to decipher.

Check that for dfa all states must be present in transition map

const { DeterministicFiniteAutomaton } = require('fauton');

const startsWithBC = new DeterministicFiniteAutomaton(
	(inputString) => inputString.startsWith('bc'),
	{
		alphabets: ['a', 'b'],
		states: ['A', 'B', 'C'],
		transitions: {
			A: ['B', 'A'],
			B: ['A', 'B'],
		},
	}
);

Transition for state 'C' is not present, but no error is thrown. Ideally, the user should provide all the states in the transition map, but if not an error should be generated as that's the condition of a dfa.

Alphabet range expansion in automata via a-z.A-Z.0-9

It would be nice to add an alphabet range expansion feature when defining the dfa

Some examples:-

  1. A-D: ['A', 'B', 'C', 'D']
  2. a-c,E-G: ['a', 'b', 'c', 'E', 'F', 'G']
  3. 0-2: [0, 1, 2]
  4. a-C: Throw an error as the highest we can go from a is z

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.