kerzum / pire
This project forked from yandex/pire
Perl Incompatible Regular Expressions library
Home Page: http://github.com/dprokoptsev/pire/wiki
License: GNU Lesser General Public License v3.0
This is PIRE, the Perl Incompatible Regular Expressions library.

This library is aimed at checking a huge amount of text against relatively many regular expressions. Roughly speaking, it can only check whether a given text matches a certain regexp, but it does so very quickly (more than 400 MB/s on our hardware is common). Moreover, multiple regexps can be combined, making it possible to check the text against approximately 10 regexps in a single pass while maintaining the same speed.

Since Pire examines each character only once, without any lookaheads or rollbacks, spending about five machine instructions per character, it can be used even in realtime tasks.

On the other hand, Pire has very limited functionality compared to other regexp libraries. Pire does not have any Perlish conditional regexps, lookaheads and backtracking, or greedy/nongreedy matches; neither does it have any capturing facilities.

Pire was developed at Yandex (http://company.yandex.ru/) as a part of its web crawler.

More information can be found in README.ru (in Russian), which is yet to be translated.

Please report bugs to [email protected] or [email protected].
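To make the single-pass idea above more concrete, here is a minimal, self-contained sketch of a table-driven automaton that tracks two patterns at once and looks at each byte exactly once. It is not Pire's code and does not use Pire's API: the names ToyDfa, MakeToyDfa and ScanOnce are hypothetical, and the two patterns ("contains a decimal digit" and "contains the substring ab") are chosen only for illustration.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <vector>

// Toy illustration only (hypothetical names, not part of Pire).
// A table-driven DFA: one table lookup per input byte, no backtracking.
struct ToyDfa {
    std::vector<std::array<std::uint16_t, 256>> next;  // next[state][byte]
    std::vector<std::uint32_t> accepts;                // patterns matched in each state
};

// Builds a DFA that tracks two patterns simultaneously:
//   bit 0 (value 1): the text contains a decimal digit
//   bit 1 (value 2): the text contains the substring "ab"
// A state id encodes both facts: id = digitSeen * 3 + abProgress, where
// abProgress is 0 (no progress), 1 (last byte was 'a'), 2 ("ab" was found).
ToyDfa MakeToyDfa() {
    ToyDfa dfa;
    dfa.next.resize(6);
    dfa.accepts.resize(6);
    for (unsigned digit = 0; digit < 2; ++digit) {
        for (unsigned ab = 0; ab < 3; ++ab) {
            const unsigned state = digit * 3 + ab;
            dfa.accepts[state] = (digit ? 1u : 0u) | (ab == 2 ? 2u : 0u);
            for (unsigned c = 0; c < 256; ++c) {
                const bool isDigit = (c >= '0' && c <= '9');
                const unsigned nextDigit = (digit || isDigit) ? 1u : 0u;
                unsigned nextAb;
                if (ab == 2)                   nextAb = 2;  // already matched, stay matched
                else if (ab == 1 && c == 'b')  nextAb = 2;  // 'a' followed by 'b'
                else if (c == 'a')             nextAb = 1;  // possible start of "ab"
                else                           nextAb = 0;
                dfa.next[state][c] =
                    static_cast<std::uint16_t>(nextDigit * 3 + nextAb);
            }
        }
    }
    return dfa;
}

// Runs the automaton over the text, touching each byte exactly once,
// and returns the bitmask of patterns that matched.
std::uint32_t ScanOnce(const ToyDfa& dfa, const std::string& text) {
    std::uint16_t state = 0;
    for (unsigned char c : text)
        state = dfa.next[state][c];
    return dfa.accepts[state];
}
```

Calling ScanOnce(MakeToyDfa(), text) returns a bitmask saying which of the two toy patterns matched. The cost per byte is one table lookup regardless of how many patterns were folded into the table, which is the property that lets a combined automaton keep roughly the same speed as a single one.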
Quick Start
=============

    #include <stdio.h>
    #include <string.h>     // strlen
    #include <iterator>     // std::back_inserter
    #include <vector>
    #include <pire/pire.h>

    Pire::NonrelocScanner CompileRegexp(const char* pattern)
    {
        // Transform the pattern from UTF-8 into UCS4
        std::vector<Pire::wchar32> ucs4;
        Pire::Encodings::Utf8().FromLocal(pattern, pattern + strlen(pattern), std::back_inserter(ucs4));

        return Pire::Lexer(ucs4.begin(), ucs4.end())
            .AddFeature(Pire::Features::CaseInsensitive())  // enable case insensitivity
            .SetEncoding(Pire::Encodings::Utf8())           // set input text encoding
            .Parse()                                        // create an FSM
            .Surround()                                     // PCRE_ANCHORED behavior
            .Compile<Pire::NonrelocScanner>();              // compile the FSM
    }

    bool Matches(const Pire::NonrelocScanner& scanner, const char* ptr, size_t len)
    {
        return Pire::Runner(scanner)
            .Begin()        // '^'
            .Run(ptr, len)  // the text
            .End();         // '$'
            // implicitly cast to bool
    }

    int main()
    {
        char re[] = "hello\\s+w.+d$";
        char str[] = "Hello world";

        Pire::NonrelocScanner sc = CompileRegexp(re);
        bool res = Matches(sc, str, strlen(str));

        printf("String \"%s\" %s \"%s\"\n", str, (res ? "matches" : "doesn't match"), re);

        return 0;
    }