Comments (5)
Please clarify your question. The regexp syntax pire accepts is pretty standard, although it is a little restrictive in range specifications and in Perl-style constructs such as (?:.*).
You might want to play with examples/pigrep to find out which regexps match which strings.
from pire.
Is it correct to say that there are several standards? They are much the same but differ in the constructions they support: for example, POSIX BRE/ERE and perlre.
Currently, we allow our users to use Perl syntax for regexps in our application, but we have advised them not to use some "heavy expressions". These regexps are mainly used to search for patterns in a stream, and we try to avoid backtracking and similar stuff.
As I guessed from your comment, you have removed the ability to backtrack, am I right? I think it would be great to have info in the docs on which regexp constructions from perlre (or something else?) can be used, or just which of them cannot.
So, if we move to PIRE, will I just need to disallow backtracking and some ranges? Could you point me to where I can read about the range limitations? Ranges are not widely used in our current regexps, but I would like to understand which of them need to be corrected if we replace the RE2 lib with your lib.
I will definitely try the example you suggested, but I need to understand the whole scope of the limitations.
Range limitations are mostly syntactic (e.g. pire disallows unescaped dashes, even in ranges like [a-z0-9-] where it's pretty clear that the last dash literally means a dash).
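To make the dash rule concrete, here is a minimal Python sketch of a migration helper that finds unescaped dashes sitting first or last inside a bracket class and escapes them. The function names are hypothetical, and the sketch assumes that an escaped dash (`\-`) is what pire expects in place of the bare one; it has not been checked against pire's parser and it ignores corner cases such as a literal `]` placed first in a class.

```python
def find_edge_dashes(pattern):
    """Return indexes of unescaped '-' that are first or last in a [...] class."""
    hits = []
    i, n = 0, len(pattern)
    while i < n:
        if pattern[i] == "\\":
            i += 2                      # skip an escaped character outside a class
            continue
        if pattern[i] != "[":
            i += 1
            continue
        j = i + 1
        if j < n and pattern[j] == "^":
            j += 1                      # the negation marker is not a class member
        items = []                      # (index, is_unescaped_dash) per class member
        while j < n and pattern[j] != "]":
            if pattern[j] == "\\":
                items.append((j, False))  # escaped member, never a bare dash
                j += 2
            else:
                items.append((j, pattern[j] == "-"))
                j += 1
        if items:
            edges = [items[0]] if len(items) == 1 else [items[0], items[-1]]
            hits.extend(idx for idx, is_dash in edges if is_dash)
        i = j + 1
    return hits


def escape_edge_dashes(pattern):
    """Insert a backslash before every dash reported by find_edge_dashes."""
    out = list(pattern)
    for idx in reversed(find_edge_dashes(pattern)):  # right to left keeps indexes valid
        out.insert(idx, "\\")
    return "".join(out)
```

For instance, `escape_edge_dashes("[a-z0-9-]")` rewrites the class with the last dash escaped, while dashes that form a real range such as `a-z` are left alone.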
As for backtracking: the backtracking-control constructs are essentially hacks that let perlre (whose regexp-match routine has worst-case exponential complexity) strip off certain branches of the match tree, thus giving the match routine a performance boost. Pire uses a linear-time matching algorithm and does not need any of them, so you may just write an arbitrarily complex regexp and be sure that, if it manages to compile into a Scanner, it won't impose any performance penalty at match time, no matter how heavy and complex the regexp is.
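To illustrate why automaton-based matching costs one step per input character, here is a minimal Python sketch of a hand-built DFA that searches a stream for `ab+c`. It is not pire's implementation and uses none of its API; the state names and functions are made up for the example. The point is that the only thing carried between chunks is a single state, so nothing ever has to be re-read.

```python
# Hand-built DFA that reports whether "ab+c" occurs anywhere in the input.
START, SEEN_A, SEEN_AB, MATCHED = range(4)

# Missing entries fall back to START (the current character breaks the match).
TRANSITIONS = {
    START:   {"a": SEEN_A},
    SEEN_A:  {"a": SEEN_A, "b": SEEN_AB},
    SEEN_AB: {"a": SEEN_A, "b": SEEN_AB, "c": MATCHED},
}


def step(state, ch):
    """Advance the automaton by exactly one character."""
    if state == MATCHED:
        return MATCHED                  # once found, stay found
    return TRANSITIONS[state].get(ch, START)


def feed(state, chunk):
    """Run a chunk through the automaton and return the resulting state."""
    for ch in chunk:
        state = step(state, ch)
    return state


def stream_contains(chunks):
    """True if the concatenation of all chunks contains a match for ab+c."""
    state = START
    for chunk in chunks:
        state = feed(state, chunk)      # state survives chunk boundaries
    return state == MATCHED
```

Calling `stream_contains(["xxa", "bb", "bcx"])` finds the match even though it spans three chunks. The cost that remains is building the table in the first place, which corresponds to the "if it manages to compile" caveat above.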
I'm not sure this is what you were asking about, but one of the major features pire lacks compared to Perl is the ability to match a previously captured fragment of the input text (something like /([A-Z]+)_\1/). In fact, the set of strings matched by this "regexp" is not a regular language and is thus beyond pire's scope.
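As an illustration of what a backreference does, and of one way to live without it, here is a small Python sketch. It uses Python's standard `re` module as a stand-in for a backtracking engine; the function names are hypothetical and nothing here is pire code. The second function checks only the regular "shape" with a backreference-free pattern and does the equality test in ordinary code.

```python
import re

# Backreference version: \1 must repeat exactly what group 1 captured.
BACKREF = re.compile(r"([A-Z]+)_\1")

# Backreference-free shape: a regular language, so an automaton can match it.
SHAPE = re.compile(r"[A-Z]+_[A-Z]+")


def repeats_prefix(text):
    """True if text looks like WORD_WORD with both words identical."""
    return BACKREF.fullmatch(text) is not None


def repeats_prefix_two_step(text):
    """Same check: regular shape first, then compare the halves in code."""
    if SHAPE.fullmatch(text) is None:
        return False
    left, _, right = text.partition("_")   # exactly one '_' thanks to SHAPE
    return left == right
```

Both functions accept `ABC_ABC` and reject `ABC_ABD`, whereas `SHAPE` on its own accepts both strings.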
Back to the question, allowed syntactic elements are:
- pipe for alternatives;
- Kleene star, plus, question mark or braces for various kinds of repetition;
- parentheses for groupings;
- character ranges in brackets (something like [A-Za-z0-9@#$%]).
Nevertheless, these elements should allow you to express an arbitrarily complex regexp.
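When auditing an existing set of Perl-style patterns, a rough pre-check along these lines may help. This is a hypothetical Python helper based only on the two constructs mentioned in this thread (backreferences and `(?...)`-style groups); it is not an exhaustive list of what pire rejects, and it does not treat character classes specially.

```python
def flag_for_review(pattern):
    """Return (index, reason) pairs for constructs worth checking by hand."""
    findings = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            nxt = pattern[i + 1:i + 2]           # empty string at end of pattern
            if nxt.isdigit() and nxt != "0":
                findings.append((i, "backreference"))
            i += 2                                # skip the escaped character
            continue
        if ch == "(" and pattern[i + 1:i + 2] == "?":
            findings.append((i, "perl-style (?...) group"))
        i += 1
    return findings
```

A pattern built only from alternation, repetition, grouping and bracket ranges, such as `(foo|bar)+[a-z]{2,3}`, comes back with no findings.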
Thanks for your answer, now I've got the point.
One more question. How stable is it? Do you use it in your production environment?
Sure. It has been used in the Yandex production environment for several years.