Comments (14)
Ah, I think I've found a bug here, one moment while I type up a minimal example.
from earley.
{-# LANGUAGE RecursiveDo #-}
import Control.Applicative
import System.Environment
import Text.Earley
import Text.Earley.Mixfix
data Expr
= Var String
| Plus Expr Expr
deriving (Eq, Ord, Show)
expr :: Associativity -> Grammar r String (Prod r String String Expr)
expr a = mdo
atomicExpr <- rule $ Var <$> satisfy (/= "+")
expr <- mixfixExpression
[[([Just (symbol "+"), Nothing, Nothing], a)]]
atomicExpr
(\x y -> case (x,y) of
([Just "+", Nothing, Nothing], [e1,e2]) -> Plus e1 e2)
return expr
main :: IO ()
main = do
(x:xs) <- getArgs
print $ fullParses $ parser (expr (toAssoc x)) $ concatMap words xs
where
toAssoc "left" = LeftAssoc
toAssoc "right" = RightAssoc
toAssoc _ = NonAssoc
-- This looks right
$ ./main left + + 5 6 7
([Plus (Plus (Var "5") (Var "6")) (Var "7")],Report {position = 5, expected = [], unconsumed = []})
-- This should not succeed
$ ./main left + 5 + 6 7
([Plus (Var "5") (Plus (Var "6") (Var "7"))],Report {position = 5, expected = [], unconsumed = []})
-- This should not succeed
$ ./main right + + 5 6 7
([Plus (Plus (Var "5") (Var "6")) (Var "7")],Report {position = 5, expected = [], unconsumed = []})
-- This should have succeeded
$ ./main right + 5 + 6 7
([],Report {position = 2, expected = [], unconsumed = ["+","6","7"]})
-- This looks right
$ ./main non + + 5 6 7
([Plus (Plus (Var "5") (Var "6")) (Var "7")],Report {position = 5, expected = [], unconsumed = []})
-- This looks right
$ ./main non + 5 + 6 7
([Plus (Var "5") (Plus (Var "6") (Var "7"))],Report {position = 5, expected = [], unconsumed = []})
from earley.
Heh, after thinking about this for a little while longer, I'm not sure that associativity makes any sense at all for anything but an operator with a hole on the far left and the far right; i.e. "if_then_else_" should have no associativity. Does that sound correct? If so - it should probably be prevented with the type system.
EDIT: And to add to this, I don't believe it makes sense for an operator to have two consecutive holes or two consecutive identifiers, although this isn't clear in the Holey
type
from earley.
Yeah, the mixfix parser is limited to Holey
identifiers with alternating holes and may otherwise give unexpected results. I did have a version where Holey
prevented anything else and where associativity could only be given to identifiers with holes on both sides, but I deemed it too annoying to work with.
We should at least make this clear in the documentation of the mixfixExpression
function. It should be possible to make it work with multiple consecutive holes though. Feel free to have a go at this. Otherwise I'll have a go at it soon (but this week I'm terribly busy).
from earley.
Sure, I can have a go at it. I was thinking of something like:
data Operator r e t ident expr where
-- ++_, if_then_else_, etc.
Prefix :: Vector (Prod r e t ident) n
-> (Vector expr n -> expr)
-> Operator r e t ident expr
-- _++, _[_], etc.
Postfix :: Vector (Prod r e t ident) n
-> (Vector expr n -> expr)
-> Operator r e t ident expr
-- _+_, _if_else_, etc
Infix :: Vector (Prod r e t ident) (S n)
-> (Vector expr (S (S n)) -> expr)
-> Associativity
-> Operator r e t ident expr
-- [[_]], (_), etc
Closed :: Vector (Prod r e t ident) (S n)
-> (Vector expr n -> expr)
-> Operator r e t ident expr
-- replaces mixfixExpression
buildExpressionGrammar
:: [[Operator r e t ident expr]]
-> Prod r e t ident
-> Grammar r e (Prod r e t expr)
Was this the same as your first version, which was too annoying to work with? For me, it doesn't seem that bad. Example:
import Data.Sized.Vector (Vector(..))
import qualified Data.Sized.Vector as V
data Expr
= Var String -- "5"
| Plus Expr Expr -- "5+5"
| Negate Expr -- "-5"
| If Expr Expr Expr -- "_if_else_"
buildExpressionParser
[ [ Infix (V.singleton (symbol "+") (\(e1 :- e2 :- Nil) -> Plus e1 e2) LeftAssoc ]
, [ Prefix (V.singleton (symbol "-") (\(e :- Nil) -> Negate e) ]
, [ Infix (symbol "if" :- symbol "else" :- Nil) (\(e1 :- e2 :- e3 :- Nil) -> If e1 e2 e2) RightAssoc ]
]
varExpression
As usual apologies if any of this doesn't typecheck as I just wrote it from scratch :P
from earley.
The way I did it was just using normal ADTs, but similar in spirit. If we can get it working for consecutive holes much of that machinery would however be unnecessary. I would prefer that. :)
from earley.
Ok, so there are sort of two issues I'm seeing. One, consecutive holes seem to not work properly with the existing mixfixExpression
. And two, the function is not very type safe (the user has no assurance that the Holey ident -> [expr] -> expr
corresponds to any of the operators defined).
Just to clarify, are you saying that you want to rectify the first problem, but would rather not pursue some more radical rethinking of the types, in favor of simplicity and ease of use?
from earley.
Yeah, that's right. The first problem is more important IMO.
I think what's desirable for the second one depends on the usecase. If you statically know the identifiers like in your example, then the extra assurance and flexibility in choosing semantic actions separately for each identifier is nice. If you're generating a parser from a user-given table, then the current solution is probably easier to use since you often want the same semantic action for everything anyway, and dependent types would only get in the way. I'm not sure what the best course of action is.
from earley.
Ok, sounds good. So, here's another issue which might be related to implementing consecutive holes:
What does it mean for an operator without holes on both sides to have an associativity? For example, even just simplifying a consecutive-hole operator like if _ _
to just ~ _
, wouldn't you always want to parse ~ ~ x
as ~ (~ x)
, and therefore prefix/postfix operators always have "right"/"left" associativity, respectively?
from earley.
Yeah, that's probably the right idea. You could permit e.g. ~_
to be left-associative but that would mean that ~ ~ x
doesn't parse. I'm not sure if that's useful behaviour.
from earley.
Ok, and then does that idea extend to mixfix operators that only have a hole on one side? i.e. if_then_else_
is essentially a prefix operator that should be right associative, and if it was given left associativity, if a then b else if c then d else e
shouldn't parse? (I haven't double checked what the current behavior of mixfixExpression
is)
from earley.
I think that is the current behaviour, and it also extends to e.g. if_then_else_
. What's tricky to think of is associativity for operators that are non-alternating: What does it mean for if__
to be left- or right-associative? Here you could give two associativities meaning.
from earley.
That's right, and I think it might make some sense to treat inner holes the same as if they were surrounded by tokens: they're just top-level expressions. So, if _ _
should always be right-associative, like other prefix expressions, and the first _
is any expression whatsoever.
from earley.
I believe this is fixed now. Please reopen the issue if it doesn't work. :)
from earley.
Related Issues (20)
- list = listLike? HOT 3
- Q: How to do error recovery? HOT 1
- Wrong(?) result on some grammars with infinite results HOT 10
- Is Earley’s implementation related to Marpa? HOT 3
- Add a `eof` terminal HOT 5
- Behavior of Semigroup/Monoid instances for Prod? HOT 5
- Capture result as well as matched tokens HOT 2
- Generate ABNF/EBNF from Earley grammar HOT 1
- Suggestion: Make it so that `<$` rules return at most one result. HOT 3
- Parsing into a list delimited by a token HOT 5
- INDENT and DEDENT HOT 5
- feature: constraint validation of production parse result HOT 7
- Please add tags "test ambiguity" etc HOT 2
- Crashes if grammar is infinitely ambiguous on given input HOT 11
- Is it okey to use "many" and heavily use Applicative parsing techniques? HOT 2
- Diverges with sepEndBy from parser-combinator HOT 1
- Report doesn't include enough information HOT 2
- Is there any possible optimization for getting non-exhaustive outputs for ambiguous grammars HOT 1
- Disambiguate/Filter productions HOT 1
- Missing case for Constraint for fmap and <*> HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from earley.