Git Product home page Git Product logo

Comments (14)

mitchellwrosen avatar mitchellwrosen commented on September 23, 2024

Ah, I think I've found a bug here, one moment while I type up a minimal example.

from earley.

mitchellwrosen avatar mitchellwrosen commented on September 23, 2024
{-# LANGUAGE RecursiveDo #-}

import Control.Applicative
import System.Environment
import Text.Earley
import Text.Earley.Mixfix

data Expr
    = Var String
    | Plus Expr Expr
    deriving (Eq, Ord, Show)

expr :: Associativity -> Grammar r String (Prod r String String Expr)
expr a = mdo
    atomicExpr <- rule $ Var <$> satisfy (/= "+")

    expr <- mixfixExpression
               [[([Just (symbol "+"), Nothing, Nothing], a)]]
               atomicExpr
               (\x y -> case (x,y) of
                  ([Just "+", Nothing, Nothing], [e1,e2]) -> Plus e1 e2)

    return expr

main :: IO ()
main = do
    (x:xs) <- getArgs
    print $ fullParses $ parser (expr (toAssoc x)) $ concatMap words xs
  where
    toAssoc "left"  = LeftAssoc
    toAssoc "right" = RightAssoc
    toAssoc _       = NonAssoc
-- This looks right
$ ./main left + + 5 6 7
([Plus (Plus (Var "5") (Var "6")) (Var "7")],Report {position = 5, expected = [], unconsumed = []})

-- This should not succeed
$ ./main left + 5 + 6 7
([Plus (Var "5") (Plus (Var "6") (Var "7"))],Report {position = 5, expected = [], unconsumed = []})

-- This should not succeed
$ ./main right + + 5 6 7
([Plus (Plus (Var "5") (Var "6")) (Var "7")],Report {position = 5, expected = [], unconsumed = []})

-- This should have succeeded
$ ./main right + 5 + 6 7
([],Report {position = 2, expected = [], unconsumed = ["+","6","7"]})

-- This looks right
$ ./main non + + 5 6 7
([Plus (Plus (Var "5") (Var "6")) (Var "7")],Report {position = 5, expected = [], unconsumed = []})

-- This looks right
$ ./main non + 5 + 6 7
([Plus (Var "5") (Plus (Var "6") (Var "7"))],Report {position = 5, expected = [], unconsumed = []})

from earley.

mitchellwrosen avatar mitchellwrosen commented on September 23, 2024

Heh, after thinking about this for a little while longer, I'm not sure that associativity makes any sense at all for anything but an operator with a hole on the far left and the far right; i.e. "if_then_else_" should have no associativity. Does that sound correct? If so - it should probably be prevented with the type system.

EDIT: And to add to this, I don't believe it makes sense for an operator to have two consecutive holes or two consecutive identifiers, although this isn't clear in the Holey type

from earley.

ollef avatar ollef commented on September 23, 2024

Yeah, the mixfix parser is limited to Holey identifiers with alternating holes and may otherwise give unexpected results. I did have a version where Holey prevented anything else and where associativity could only be given to identifiers with holes on both sides, but I deemed it too annoying to work with.

We should at least make this clear in the documentation of the mixfixExpression function. It should be possible to make it work with multiple consecutive holes though. Feel free to have a go at this. Otherwise I'll have a go at it soon (but this week I'm terribly busy).

from earley.

mitchellwrosen avatar mitchellwrosen commented on September 23, 2024

Sure, I can have a go at it. I was thinking of something like:

data Operator r e t ident expr where
    -- ++_, if_then_else_, etc.
    Prefix  :: Vector (Prod r e t ident) n 
               -> (Vector expr n -> expr) 
               -> Operator r e t ident expr
    -- _++, _[_], etc.
    Postfix :: Vector (Prod r e t ident) n 
               -> (Vector expr n -> expr) 
               -> Operator r e t ident expr
    -- _+_, _if_else_, etc
    Infix   :: Vector (Prod r e t ident) (S n) 
               -> (Vector expr (S (S n)) -> expr) 
               -> Associativity 
               -> Operator r e t ident expr
    -- [[_]], (_), etc
    Closed  :: Vector (Prod r e t ident) (S n) 
               -> (Vector expr n -> expr) 
               -> Operator r e t ident expr

-- replaces mixfixExpression
buildExpressionGrammar
    :: [[Operator r e t ident expr]]
    -> Prod r e t ident
    -> Grammar r e (Prod r e t expr)

Was this the same as your first version, which was too annoying to work with? For me, it doesn't seem that bad. Example:

import Data.Sized.Vector (Vector(..))
import qualified Data.Sized.Vector as V

data Expr
    = Var String -- "5"
    | Plus Expr Expr -- "5+5"
    | Negate Expr -- "-5"
    | If Expr Expr Expr -- "_if_else_"

buildExpressionParser
    [ [ Infix (V.singleton (symbol "+") (\(e1 :- e2 :- Nil) -> Plus e1 e2) LeftAssoc ]
    , [ Prefix (V.singleton (symbol "-") (\(e :- Nil) -> Negate e) ]
    , [ Infix (symbol "if" :- symbol "else" :- Nil) (\(e1 :- e2 :- e3 :- Nil) -> If e1 e2 e2) RightAssoc ]
    ]
    varExpression

As usual apologies if any of this doesn't typecheck as I just wrote it from scratch :P

from earley.

ollef avatar ollef commented on September 23, 2024

The way I did it was just using normal ADTs, but similar in spirit. If we can get it working for consecutive holes much of that machinery would however be unnecessary. I would prefer that. :)

from earley.

mitchellwrosen avatar mitchellwrosen commented on September 23, 2024

Ok, so there are sort of two issues I'm seeing. One, consecutive holes seem to not work properly with the existing mixfixExpression. And two, the function is not very type safe (the user has no assurance that the Holey ident -> [expr] -> expr corresponds to any of the operators defined).

Just to clarify, are you saying that you want to rectify the first problem, but would rather not pursue some more radical rethinking of the types, in favor of simplicity and ease of use?

from earley.

ollef avatar ollef commented on September 23, 2024

Yeah, that's right. The first problem is more important IMO.

I think what's desirable for the second one depends on the usecase. If you statically know the identifiers like in your example, then the extra assurance and flexibility in choosing semantic actions separately for each identifier is nice. If you're generating a parser from a user-given table, then the current solution is probably easier to use since you often want the same semantic action for everything anyway, and dependent types would only get in the way. I'm not sure what the best course of action is.

from earley.

mitchellwrosen avatar mitchellwrosen commented on September 23, 2024

Ok, sounds good. So, here's another issue which might be related to implementing consecutive holes:

What does it mean for an operator without holes on both sides to have an associativity? For example, even just simplifying a consecutive-hole operator like if _ _ to just ~ _, wouldn't you always want to parse ~ ~ x as ~ (~ x), and therefore prefix/postfix operators always have "right"/"left" associativity, respectively?

from earley.

ollef avatar ollef commented on September 23, 2024

Yeah, that's probably the right idea. You could permit e.g. ~_ to be left-associative but that would mean that ~ ~ x doesn't parse. I'm not sure if that's useful behaviour.

from earley.

mitchellwrosen avatar mitchellwrosen commented on September 23, 2024

Ok, and then does that idea extend to mixfix operators that only have a hole on one side? i.e. if_then_else_ is essentially a prefix operator that should be right associative, and if it was given left associativity, if a then b else if c then d else e shouldn't parse? (I haven't double checked what the current behavior of mixfixExpression is)

from earley.

ollef avatar ollef commented on September 23, 2024

I think that is the current behaviour, and it also extends to e.g. if_then_else_. What's tricky to think of is associativity for operators that are non-alternating: What does it mean for if__ to be left- or right-associative? Here you could give two associativities meaning.

from earley.

mitchellwrosen avatar mitchellwrosen commented on September 23, 2024

That's right, and I think it might make some sense to treat inner holes the same as if they were surrounded by tokens: they're just top-level expressions. So, if _ _ should always be right-associative, like other prefix expressions, and the first _ is any expression whatsoever.

from earley.

ollef avatar ollef commented on September 23, 2024

I believe this is fixed now. Please reopen the issue if it doesn't work. :)

from earley.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.