padsproj / pads-haskell

Haskell binding for PADS

License: Other

pads-haskell's Introduction

pads-haskell

Massive amounts of useful data are stored and processed in ad hoc formats for which common tools like parsers, printers, query engines and format converters are not readily available. Pads/Haskell is a domain-specific language that facilitates the generation of data processing tools for ad hoc formats. Pads/Haskell includes features such as dependent, polymorphic and recursive datatypes, which allow programmers to describe the syntax and semantics of ad hoc data in a concise, easy-to-read notation.

The pads-haskell repository contains the code for the Haskell binding for PADS. For more information about the project, see the PADS website (www.padsproj.org).

Building

pads-haskell currently requires GHC 8.6.5 and stack resolver lts-13.25.

Setup

To generate an appropriate Stack configuration file and install an appropriate GHC tool chain:

$ stack solver    # Updates stack.yaml if necessary
$ stack setup     # Installs ghc in a sandbox for you

Build

To build pads-haskell:

$ stack build

Testing

To run the automated testing infrastructure:

$ stack test :examples --ghc-options="-ddump-splices"
# Followed by this if you want to see the dumped splice files:
$ find . -name '*.dump-splices'

To run individual tests do:

$ stack repl
λ> :l Examples.First
...
λ> test
Cases: 89  Tried: 89  Errors: 0  Failures: 0
Counts {cases = 89, tried = 89, errors = 0, failures = 0}
(0.11 secs, 0 bytes)

Contributing and Development

To build and view the Haddock documentation:

$ stack haddock
$ firefox `find .stack-work -name index.html | grep "html/pads-haskell"`

Pull requests are strongly encouraged, though we're more likely to merge them in a timely fashion if they either add small features to existing modules or are new PADS descriptions to add to the examples directory.

pads-haskell's People

Contributors

mpahrens cronburg rcook ethanpailes chablisaren samcowger lanchiang

pads-haskell's Issues

Generators: order of generation for context-sensitivity

Presently, generators annotated on a PADS description get executed (in do-notation-style?) in the order in which they are defined in the description, that is, in parsing order. This makes context-sensitive generators (generators that rely on the value of some other field of the data type) cumbersome, because a field can only refer to another field when the reference occurs after the referent in the file.

What needs to be done: in the initial step of generating the dataType_genM, we need to do a topological sort of the fields of the data type, where fieldA must precede fieldB whenever the generator for fieldB refers to the variable name fieldA. Then the do-notation generator, which brings names into scope in the order in which they are generated, will function as expected.

An error message must be produced if a topological sort does not exist (cyclic dependency).
This feature will interact with generator support for the state monad: changing the order in which things are generated will change the behavior of generators that rely on StateT. Therefore some canonical ordering should be chosen (perhaps prefer keeping the default ordering, and only swap when one generator computation depends on the other).
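
The ordering described above can be sketched as a stable topological sort. This is only an illustration under stated assumptions, not code from pads-haskell: the Field type and the orderFields function are hypothetical names, and each field is assumed to list only sibling field names as dependencies. Fields that do not depend on each other keep their declaration order, which matches the "prefer the default ordering" suggestion.

```haskell
-- | Hypothetical model of a field: its name paired with the names of the
-- sibling fields that its generator refers to.
type Field = (String, [String])

-- | Order fields so that each one comes after every field its generator
-- refers to. Among the fields that are ready, the earliest declared one is
-- emitted first, so independent fields keep their declaration order.
-- Returns Left with the names of the unresolved fields when no valid order
-- exists (a cyclic dependency, or a reference to an unknown name).
orderFields :: [Field] -> Either [String] [String]
orderFields = go []
  where
    go done [] = Right (reverse done)
    go done pending =
      case break ready pending of
        -- No pending field has all of its dependencies generated yet.
        (_, []) -> Left (map fst pending)
        -- Emit the first ready field and retry with the remaining ones.
        (before, (name, _) : after) -> go (name : done) (before ++ after)
      where
        ready (_, deps) = all (`elem` done) deps
```

For example, orderFields [("len", []), ("payload", ["seed"]), ("seed", [])] yields Right ["len", "seed", "payload"]: only payload is moved, and only as far as needed. A cycle such as [("a", ["b"]), ("b", ["a"])] yields Left ["a", "b"], which is where the error message would be produced.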

Thorough haddock documentation & getting pads on hackage

Since I mucked around a lot in the internals of this implementation of PADS in implementing a prototype of direct memory (pointer) parsing a while back, I'm planning on documenting a bunch of the internals of pads-haskell as I understood it. This is in particular a good stepping stone to getting this package on Hackage.

This is the command I've been using to build documentation presently:

stack haddock --force-dirty --haddock-internal --haddock-deps

I may not get around to this until towards the end of this month.

Consider using State monad for metadata in PadsParser

Right now all the pads parser combinators that get called by code emitted from the code generator have types that look like PadsParser(rep,md), requiring unnecessary unwrapping and rewrapping of (rep,md) tuples in many places. Using a state monad for metadata will:

Make parser combinator type signatures cleaner
Separate the semantics of underlying "runtime" details (metadata) from the semantics of Pads language features (constrain, transform, partition, ...)
Make it easier for new developers unfamiliar with the Pads runtime to add new language features
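
The contrast between the two styles can be sketched with a toy example. Nothing here is the pads-haskell API: Md, digitT, pairT, digitS and pairS are hypothetical names, the metadata is reduced to an error count, and the "parsers" inspect single characters instead of consuming input. The sketch assumes the transformers package that ships with GHC.

```haskell
import Control.Monad.Trans.State.Strict (State, modify', runState)

-- | Hypothetical stand-in for PADS metadata: just a count of errors.
newtype Md = Md { numErrors :: Int } deriving (Show, Eq)

-- Tuple style: every combinator returns (rep, md), and callers have to
-- unwrap and merge the metadata by hand.
digitT :: Char -> (Int, Md)
digitT c
  | c >= '0' && c <= '9' = (fromEnum c - fromEnum '0', Md 0)
  | otherwise            = (0, Md 1)

pairT :: Char -> Char -> ((Int, Int), Md)
pairT a b =
  let (x, Md e1) = digitT a
      (y, Md e2) = digitT b
  in ((x, y), Md (e1 + e2))

-- State style: metadata is threaded implicitly, so the combinator that
-- builds a pair only mentions the representation.
digitS :: Char -> State Md Int
digitS c
  | c >= '0' && c <= '9' = pure (fromEnum c - fromEnum '0')
  | otherwise            = 0 <$ modify' (\(Md e) -> Md (e + 1))

pairS :: Char -> Char -> State Md (Int, Int)
pairS a b = (,) <$> digitS a <*> digitS b

-- | Running the state version recovers the same (rep, md) pair.
example :: ((Int, Int), Md)
example = runState (pairS '4' 'x') (Md 0)
```

Both pairT '4' 'x' and example evaluate to ((4,0), Md {numErrors = 1}); the difference is that pairS never touches the metadata, which is the separation of concerns the list above argues for.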

Data types and parsing functions created by the pads quasiquoter should have haddock documentation

This is blocked by the following GHC ticket: https://ghc.haskell.org/trac/ghc/ticket/5467

The lens package also has this issue: ekmett/lens#614

Cleanup extra-source-files in package.yaml

There are a number of large data files currently distributed with the hackage version of PADS. Interested parties should discuss what kind of examples we want to provide to users of the PADS (Haskell version) language.

Perhaps we could find one or more undergraduates interested in writing some more example data descriptions and writing up a tutorial documenting their experience with the language.

Compile-time performance problem in AI.hs

Since adding generation of the internal AST representation of a pads description to the output of the quasiquoter, the AI example takes significantly longer to build (minutes). Offending commit: c535c18

But when I try building with profiling information enabled, the build takes on the order of seconds...

Bug in seekSep / parseManySepTerm

While documenting PadsParser.hs, I believe I found a bug: when parsing a list of some type with separators and a terminator, Pads reports successful termination of the parser upon seeing end-of-file, without having parsed the terminator. I'll write a simple test case in First.hs which exhibits the desired behavior.
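
The intended behavior can be illustrated with a small standalone function. This is a sketch over plain strings, not the seekSep or parseManySepTerm implementation; parseSepTerm is a hypothetical name, and single-character separators and terminators are an assumption made to keep it short.

```haskell
-- | Split the input into items on a separator, stopping at a terminator.
-- Returns the items and the remaining input, or Left if the input ends
-- before the terminator is seen, which is the case the issue says is
-- currently reported as success.
parseSepTerm :: Char -> Char -> String -> Either String ([String], String)
parseSepTerm sep term = go [] ""
  where
    -- End of input without a terminator is an error, not a success.
    go _ _ [] = Left "end of input before terminator"
    go items cur (c : rest)
      | c == term = Right (reverse (reverse cur : items), rest)
      | c == sep  = go (reverse cur : items) "" rest
      | otherwise = go items (c : cur) rest
```

With this definition, parseSepTerm ',' ';' "a,b;rest" gives Right (["a","b"],"rest"), while parseSepTerm ',' ';' "a,b" gives Left "end of input before terminator". A regression test for the real combinators would assert the analogous failure.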

Make a style guide

Tabs and inconsistent-space indentation were causing GHC warnings that interfered with debugging.
Decide on a style and make a CONTRIBUTING.md that includes it as a style guide.

Edit: formalized language in this issue since I didn't know it would get moved over to the official project along with the pull request.

Also: Add cabal install instructions from PLD page to Readme.md