Comments (5)

dprokoptsev commented on July 21, 2024

Please clarify your question. Pire's regexp syntax is pretty standard, although a little restrictive in range specifications and in Perl-style (?:.*)-like constructs.
You might want to play with examples/pigrep to find out which regexps match which strings.

from pire.

buhtr commented on July 21, 2024

Is it correct to say that there are several standards, which are much the same but differ in the constructs they support, for example POSIX BRE/ERE and perlre?

Currently, we allow our users to use Perl regexp syntax in our application, but we have advised them not to use some "heavy expressions". These regexps are mainly used to search for patterns in a stream, and we try to avoid backtracking and similar constructs.

As I gather from your comment, you have removed support for backtracking, am I right? I think it would be great if the docs stated which regexp constructs from perlre (or something else?) can be used, or at least which of them cannot.

So, if we move to Pire, will I just need to disallow backtracking and some ranges? Could you point me to where I can read about the range limitations? Ranges are not widely used in our current regexps, but I would like to understand which of them need to be corrected if we replace the RE2 library with yours.

I will definitely try the example you suggested, but I need to understand the full scope of the limitations.

dprokoptsev commented on July 21, 2024

Range limitations are mostly syntactic (e.g. pire disallows unescaped dashes, even in ranges like [a-z0-9-] where it's pretty clear that the last dash means a literal dash).
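To make the dash rule concrete, here is a minimal sketch of a pre-processing step for existing patterns. It is not part of pire: `escape_literal_dashes` is a hypothetical helper, and it assumes that an escaped dash (`\-`) inside brackets is an acceptable replacement. It only handles a dash at the very start or very end of a bracket class; anything more exotic (such as `[a-c-e]`) is left untouched and would need a manual look.

```python
def escape_literal_dashes(pattern: str) -> str:
    """Escape a dash at the start or end of each bracket class.

    Hypothetical migration helper, not a pire API. Assumes Perl-style
    bracket classes and that an escaped dash is accepted inside brackets.
    """
    out = []
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            # Escaped character outside a class: copy both characters as-is.
            out.append(pattern[i:i + 2])
            i += 2
            continue
        if ch != "[":
            out.append(ch)
            i += 1
            continue
        # Start of a bracket class: skip an optional negation sign.
        j = i + 1
        if j < n and pattern[j] == "^":
            j += 1
        body_start = j
        # A ']' directly after '[' or '[^' is a literal in Perl-style syntax.
        if j < n and pattern[j] == "]":
            j += 1
        # Find the closing bracket, stepping over escaped characters.
        while j < n and pattern[j] != "]":
            j += 2 if pattern[j] == "\\" else 1
        if j >= n:
            # Unterminated class: copy the rest unchanged and stop.
            out.append(pattern[i:])
            break
        body = pattern[body_start:j]
        if body.startswith("-"):
            body = "\\-" + body[1:]
        if body.endswith("-"):
            head = body[:-1]
            trailing_backslashes = len(head) - len(head.rstrip("\\"))
            # An even count means the final dash is not already escaped.
            if trailing_backslashes % 2 == 0:
                body = head + "\\-"
        out.append(pattern[i:body_start] + body + "]")
        i = j + 1
    return "".join(out)


print(escape_literal_dashes("[a-z0-9-]+"))
```

Whether the rewritten pattern is accepted is best checked by feeding it to examples/pigrep.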

As for backtracking: the backtracking-control constructs are essentially hacks that let perlre (whose match routine has exponential worst-case complexity) prune certain branches of the match tree, which gives the match routine a performance boost. Pire uses a linear-time matching algorithm and does not need any of them, so you can write an arbitrarily complex regexp and be sure that, if it manages to compile into a Scanner, it won't impose any performance penalty at match time, no matter how heavy and complex the regexp was.
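The linear-time claim is easiest to see on a toy automaton. The sketch below is not pire's Scanner and does not use its API; it is a hand-built DFA for the regexp a(b|c)*d, with the names `TRANSITIONS`, `feed` and `dfa_matches` made up for illustration. Each input character costs exactly one table lookup, so the running time depends on the length of the text and not on how complicated the original regexp was.

```python
# Hand-built DFA for the regexp a(b|c)*d. States are small integers;
# a missing entry means "no transition", i.e. the match has failed.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 1,
    (1, "d"): 2,
}
START = 0
ACCEPTING = {2}


def feed(state, chunk):
    """Advance the automaton over one chunk; None means a dead state."""
    for ch in chunk:
        if state is None:
            break
        state = TRANSITIONS.get((state, ch))
    return state


def dfa_matches(text: str) -> bool:
    """Return True if the whole text matches a(b|c)*d."""
    return feed(START, text) in ACCEPTING


print(dfa_matches("abcbd"))

# The state is just an integer, so it can be carried across chunks
# when the input arrives as a stream.
state = START
for chunk in ["ab", "cb", "d"]:
    state = feed(state, chunk)
print(state in ACCEPTING)
```

There is nothing to backtrack over here: the automaton never revisits a character, which is why no pruning constructs are needed.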

I'm not sure this is what you were asking about, but one major feature pire lacks compared to Perl is the ability to match a previously captured fragment of the input text (something like /([A-Z]+)_\1/). In fact, the set of strings matched by this "regexp" is not a regular language and is thus beyond pire's scope.
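For anyone auditing an existing set of patterns, a rough check for backreferences could look like the sketch below. `find_backreferences` is a hypothetical helper, not something pire provides. It assumes numbered backreferences of the form \1 to \9 only, so named backreferences are not detected, and it also flags \digit inside a bracket class, where Perl would read it differently. The last lines use Python's own `re` module to show what the example pattern from above does in a backtracking engine.

```python
import re


def find_backreferences(pattern: str) -> list:
    """Return the offsets of numbered backreferences (\\1 to \\9).

    Hypothetical audit helper. Escaped backslashes are skipped, so a
    literal backslash followed by a digit is not reported.
    """
    hits = []
    i = 0
    while i < len(pattern) - 1:
        if pattern[i] == "\\":
            if pattern[i + 1] in "123456789":
                hits.append(i)
            i += 2  # skip the escaped character, whatever it is
        else:
            i += 1
    return hits


pattern = r"([A-Z]+)_\1"
print(find_backreferences(pattern))

# In a backtracking engine the second half must repeat the first half,
# which is exactly what a finite automaton cannot remember.
print(re.fullmatch(pattern, "ABC_ABC") is not None)
print(re.fullmatch(pattern, "ABC_ABD") is not None)
```

Patterns flagged this way have to be rewritten or checked by other means, since no amount of escaping turns them into a regular expression in the strict sense.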

Back to the question, allowed syntactic elements are:

  • pipe for alternatives;
  • Kleene star, plus, question mark or braces for various kinds of repetition;
  • parentheses for groupings;
  • character ranges in brackets (something like [A-Za-z0-9@#$%]).

Nevertheless, these elements should allow you to express an arbitrarily complex regexp.

buhtr commented on July 21, 2024

Thanks for your answer, now I've got the point.
One more question. How stable is it? Do you use it in your production environment?

dprokoptsev commented on July 21, 2024

Sure. It has been used in Yandex's production environment for several years.
