haskell-github-trust / replace-megaparsec
Stream editing with Haskell Megaparsec parsers
License: BSD 2-Clause "Simplified" License
README a template stack-based shell script
A parser combinator which is like many, except non-greedy.
manyNongreedy
:: ParserT e s m a -- ^ parse this as few times as possible
-> ParserT e s m b -- ^ parse the rest of the input
-> ParserT e s m ([a],b)
manyNongreedy p rest = undefined
The obvious way to implement this is to try to parse rest, and if it fails then backtrack and parse one p, then try to parse rest again, et cetera. Not super performant, but maybe very useful.
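The try-rest-first strategy described above can be sketched with the ReadP parser from base rather than Megaparsec's ParserT, so this is an illustration of the idea and not the library's code. It assumes ReadP's left-biased choice (<++) as the backtracking step; the names go, stop, step and example are hypothetical.

```haskell
import Text.ParserCombinators.ReadP (ReadP, (<++), get, readP_to_S, string)

-- | Parse @p@ as few times as possible, then @rest@.
-- Sketch only: uses ReadP from base, not Megaparsec's ParserT.
manyNongreedy :: ReadP a -> ReadP b -> ReadP ([a], b)
manyNongreedy p rest = go
  where
    -- Left-biased choice: prefer stopping; only take one more @p@
    -- when @rest@ fails at the current position.
    go = stop <++ step
    stop = do
      r <- rest
      pure ([], r)
    step = do
      x <- p
      (xs, r) <- go
      pure (x : xs, r)

-- | Consume any characters up to the first "-->".
example :: [((String, String), String)]
example = readP_to_S (manyNongreedy get (string "-->")) "ab-->cd"
```

Because rest is retried at every position, the cost grows with how far rest looks ahead, which matches the "not super performant" caveat.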
Some discussion:
https://www.reddit.com/r/haskell/comments/3b4ztr/a_neat_way_of_doing_nongreedy_parsing_with_parsec/
Oh this is exactly the same as manyTill_.
Make sure we test that the parser is correctly consuming its input.
to BSD-2
Make sure we test that the parser is correctly calculating its position.
That would be awesome.
Output constraints not expressed in the type system:
The output list will not be empty. If the input is the empty string then the output will be [Left ""].
The output list will not contain two consecutive Lefts.
The output list may contain two consecutive Rights.
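Since these constraints are not expressed in the type system, one option is to state them as a predicate that a test suite could check against parser output. The following is a minimal sketch using only base; noConsecutiveLefts and satisfiesOutputConstraints are hypothetical names, not part of the library.

```haskell
import Data.Either (isLeft)

-- | True when no two adjacent elements are both 'Left'.
noConsecutiveLefts :: [Either a b] -> Bool
noConsecutiveLefts xs = not (or (zipWith bothLeft xs (drop 1 xs)))
  where
    bothLeft x y = isLeft x && isLeft y

-- | Hypothetical check for the constraints listed above:
-- the list is non-empty and has no two consecutive 'Left's.
-- Consecutive 'Right's are allowed, so they are not checked.
satisfiesOutputConstraints :: [Either a b] -> Bool
satisfiesOutputConstraints xs = not (null xs) && noConsecutiveLefts xs
```

A property test could then assert satisfiesOutputConstraints on the result of the parser for arbitrary inputs.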
Another comparison: PHP preg_replace_callback
https://www.php.net/manual/en/function.preg-replace-callback.php
Two important test cases: find x
in
x x x x x x x x x x x x
x
Make sure we test that the parser will continue in the event of a parse that fails on an operation like read.
In module Parsereplace.Attoparsec
attoparsec has match
http://hackage.haskell.org/package/attoparsec-0.13.2.2/docs/Data-Attoparsec-ByteString.html#v:match
attoparsec has a Monoid instance for Chunk
http://hackage.haskell.org/package/attoparsec-0.13.2.2/docs/Data-Attoparsec-Types.html#t:Chunk
But Attoparsec does not work for String. So Megaparsec is most general, and supported first.
This may be a ghc bug:
test-textStub: internal error: Oops! Entered absent arg Arg: $dOrd
Type: Ord e
In module `Replace.Megaparsec'
(GHC version 9.4.3 for x86_64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
compare to sed
Move the streamEditT docs to streamEdit.
parser-combinators >= 1.2.0 because we need manyTill_
Consider:
The internal user state is reset on backtracking but StateT is not.
Warnings:
'ghc-options: -O2' is rarely needed. Check that it is giving a real benefit and not just imposing longer compile times on your users.
The Cabal file for replace-megaparsec states simply that megaparsec is a dependency, but that seems incorrect: in fact, versions of megaparsec will have to at least be >= 7.0.0 to work.
This is because the code for replace-megaparsec uses the anySingle function, introduced in megaparsec 7.0.0.
Thus, the Cabal file should give the dependency as:
, megaparsec >= 7.0.0
(The bounds might in fact be tighter than that; I haven't checked whether megaparsec 7.0.0 will in fact work.)
Like manyTill_, but specialization allows for efficient capture of the preceding string.
Acts like a takeWhile which is predicated beyond just the next token. Be careful not to look too far ahead; if the end parser looks to the end of the input then anyTill will be O(N²).
https://hackage.haskell.org/package/megaparsec-8.0.0/docs/Text-Megaparsec.html#v:takeWhileP
https://hackage.haskell.org/package/base-4.12.0.0/docs/Data-List.html#v:takeWhile
end may be a zero-consumption parser (combine with lookAhead), in which case after anyTill succeeds, the parser input state will be at the beginning of the place where end matched.
Rip-off the doc examples in http://hackage.haskell.org/package/regex-tdfa/docs/Text-Regex-TDFA.html
Make sure that Text.Megaparsec.match and Text.Megaparsec.Stream are properly linked in the haddock that is generated on Hackage.
This goes into a loop because many succeeds but consumes no input.
streamEdit (many lowerChar) (fmap toUpper) "as12df"
doesMatchExist = case (sepCap sep input) of
  [Left _] -> False
  _ -> True
Does laziness short-circuit doesMatchExist correctly?
firstMatch = find isRight $ sepCap sep input
Does laziness short-circuit firstMatch correctly?
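One way to probe the list-consuming half of this question is to feed both functions a list whose tail is undefined after the first Right. The sketch below uses only base and replaces the sepCap call with a hand-built list (lazyInput is hypothetical), so it shows only that the case expression and find stop early. Whether sepCap itself produces its output lazily is a separate question that this does not answer.

```haskell
import Data.Either (isRight)
import Data.List (find)

-- | Same shape as the definition above, taking the sepCap output as an argument.
doesMatchExist :: [Either a b] -> Bool
doesMatchExist xs = case xs of
  [Left _] -> False
  _        -> True

firstMatch :: [Either a b] -> Maybe (Either a b)
firstMatch = find isRight

-- | Hypothetical stand-in for a sepCap result: everything after the
-- first match is undefined, so forcing it would crash.
lazyInput :: [Either String Int]
lazyInput = Left "ab" : Right 1 : undefined

-- Both evaluate without touching the undefined tail.
checkDoesMatchExist :: Bool
checkDoesMatchExist = doesMatchExist lazyInput

checkFirstMatch :: Maybe (Either String Int)
checkFirstMatch = firstMatch lazyInput
```

The pattern [Left _] only needs the first element's constructor and the constructor of the tail, and find isRight stops at the first Right, so neither forces the rest of the list.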
https://hackage.haskell.org/package/bytestring-tree-builder
According to the benchmarks this builder implementation beats all the alternatives. It is especially well-suited for generating strict bytestrings, beating the standard builder by at least the factor of 4.
Restricting search to regular grammar is a performance optimization. Recursion in functions was also once considered a controversial performance compromise. Times change.
https://vanemden.wordpress.com/2014/06/18/how-recursion-got-into-programming-a-comedy-of-errors-3/
Implement
sepEndBy' :: MonadPlus m => m a -> m sep -> m ([a], Maybe sep)
or
skipManyTill' :: MonadPlus m => m a -> m end -> m ([a], Maybe end)
Hi, I have some bizarre behaviour that I pushed to this repository.
https://github.com/locallycompact/replace-megaparsec-bug
I am on NixOS so my commands are with --nix
If I build with stack --nix build
and then run stack --nix exec -- replace-megaparsec-bug-exe
I get a callstack with Prelude.undefined.
[I] lc@aiur ~/replace-megaparsec-bug (master)>
stack --nix build && stack --nix exec -- replace-megaparsec-bug-exe
replace-megaparsec-bug-exe: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
undefined, called at src/Replace/Megaparsec.hs:211:21 in replace-megaparsec-1.4.2.0-8z2kgUAlSuYEyo1NOrKi1C:Replace.Megaparsec
However if I run the executable ./Script.hs as a stack script with the exact same content, it executes and returns.
[I] lc@aiur ~/replace-megaparsec-bug (master)> ./Script.hs
[]
write test suite for Text (and Bytestring?)
Implement
sepCapWithin :: m mask -> m a -> m [Either (Tokens s) a]