oniguruma's Introduction

Oniguruma

Oniguruma is the only open source software attacked on Google search. (Issue #234)

https://github.com/kkos/oniguruma

Oniguruma is a modern and flexible regular expressions library. It encompasses features from different regular expression implementations that traditionally exist in different languages.

Character encoding can be specified per regular expression object.

Supported character encodings:

ASCII, UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE, EUC-JP, EUC-TW, EUC-KR, EUC-CN, Shift_JIS, Big5, GB18030, KOI8-R, CP1251, ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5, ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10, ISO-8859-11, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16

  • GB18030: contributed by KUBO Takehiro
  • CP1251: contributed by Byte
  • doc/SYNTAX.md: contributed by seanofw

Notice (from 6.9.6)

When using the configure script: if you had the POSIX API enabled in an earlier version (it is disabled by default in 6.9.5) and you need application binary compatibility with the POSIX API, specify "--enable-binary-compatible-posix-api=yes" instead of "--enable-posix-api=yes". Starting in 6.9.6, "--enable-posix-api=yes" provides only source-level compatibility with the POSIX API of 6.9.5 and earlier. (Issue #210)

Version 6.9.9

  • Update Unicode version 15.1.0
  • NEW API: ONIG_OPTION_MATCH_WHOLE_STRING
  • Fixed: (?I) option was not enabled for character classes (Issue #264).
  • Changed specification to check for incorrect POSIX bracket (Issue #253).
  • Changed [[:punct:]] in Unicode encodings to be compatible with POSIX definition. (Issue #268)
  • Fixed: ONIG_OPTION_FIND_LONGEST behavior

Version 6.9.8

  • Update Unicode version 14.0.0
  • Whole options
    • (?C) : ONIG_OPTION_DONT_CAPTURE_GROUP
    • (?I) : ONIG_OPTION_IGNORECASE_IS_ASCII
    • (?L) : ONIG_OPTION_FIND_LONGEST
  • Fixed some problems found by OSS-Fuzz

Version 6.9.7

  • NEW API: ONIG_OPTION_CALLBACK_EACH_MATCH
  • NEW API: ONIG_OPTION_IGNORECASE_IS_ASCII
  • NEW API: ONIG_SYNTAX_PYTHON
  • Fixed some problems found by OSS-Fuzz

Version 6.9.6

  • NEW: configure option --enable-binary-compatible-posix-api=[yes/no]
  • NEW API: Limiting the maximum number of calls of subexp-call
  • NEW API: ONIG_OPTION_NOT_BEGIN_STRING / NOT_END_STRING / NOT_BEGIN_POSITION
  • Fixed behavior of ONIG_OPTION_NOTBOL / NOTEOL
  • Fixed many problems found by OSS-Fuzz
  • Fixed many problems found by Coverity
  • Fixed CVE-2020-26159 (This turned out not to be a problem later. #221)
  • Under cygwin and mingw, generate and install the libonig.def file (Issue #220)

License

BSD license.

Install

Case 1: Linux distribution packages

  • Fedora: dnf install oniguruma-devel
  • RHEL/CentOS: yum install oniguruma
  • Debian/Ubuntu: apt install libonig5
  • Arch: pacman -S oniguruma
  • openSUSE: zypper install oniguruma

Case 2: Manual compilation on Linux, Unix, and Cygwin platform

  1. autoreconf -vfi (* case: configure script is not found.)

  2. ./configure

  3. make

  4. make install

  • uninstall

    make uninstall

  • configuration check

    onig-config --cflags
    onig-config --libs
    onig-config --prefix
    onig-config --exec-prefix

Case 3: Windows 64/32bit platform (Visual Studio)

  • build library

    .\make_win.bat

    onig_s.lib: static link library
    onig.dll: dynamic link library

  • make test programs

    .\make_win.bat all-test

Alternatively, you can build and install oniguruma using vcpkg dependency manager:

  1. git clone https://github.com/Microsoft/vcpkg.git
  2. cd vcpkg
  3. ./bootstrap-vcpkg.bat
  4. ./vcpkg integrate install
  5. ./vcpkg install oniguruma

The oniguruma port in vcpkg is kept up to date by Microsoft team members and community contributors. If the version is out of date, please create an issue or pull request on the vcpkg repository.

Regular Expressions

See doc/RE, or doc/RE.ja for Japanese.

Usage

Include oniguruma.h in your program. (Oniguruma API) See doc/API for Oniguruma API.

If you want to disable UChar type (== unsigned char) definition in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION and then include oniguruma.h.

If you want to disable regex_t type definition in oniguruma.h, define ONIG_ESCAPE_REGEX_T_COLLISION and then include oniguruma.h.

Example of the compiling/linking command line in Unix or Cygwin, (prefix == /usr/local case)

cc sample.c -L/usr/local/lib -lonig

If you want to use the static link library (onig_s.lib) in Win32, add the option -DONIG_EXTERN=extern to the C compiler command line.

Sample Programs

File                    Description
sample/callout.c        example of callouts
sample/count.c          example of built-in callout *COUNT
sample/echo.c           example of user defined callouts of name
sample/encode.c         example of some encodings
sample/listcap.c        example of the capture history
sample/names.c          example of the named group callback
sample/posix.c          POSIX API sample
sample/regset.c         example of using RegSet API
sample/scan.c           example of using onig_scan()
sample/simple.c         example of the minimum (Oniguruma API)
sample/sql.c            example of the variable meta characters
sample/user_property.c  example of user defined Unicode property

Test Programs

File             Description
sample/syntax.c  Perl, Java and ASIS syntax test
sample/crnl.c    --enable-crnl-as-line-terminator test

Source Files

File                  Description
oniguruma.h           Oniguruma API header file (public)
onig-config.in        configuration check program template
regenc.h              character encodings framework header file
regint.h              internal definitions
regparse.h            internal definitions for regparse.c and regcomp.c
regcomp.c             compiling and optimization functions
regenc.c              character encodings framework
regerror.c            error message function
regext.c              extended API functions (deluxe version API)
regexec.c             search and match functions
regparse.c            parsing functions
regsyntax.c           pattern syntax functions and built-in syntax definitions
regtrav.c             capture history tree data traverse functions
regversion.c          version info function
st.h                  hash table functions header file
st.c                  hash table functions
oniggnu.h             GNU regex API header file (public)
reggnu.c              GNU regex API functions
onigposix.h           POSIX API header file (public)
regposerr.c           POSIX error message function
regposix.c            POSIX API functions
mktable.c             character type table generator
ascii.c               ASCII encoding
euc_jp.c              EUC-JP encoding
euc_tw.c              EUC-TW encoding
euc_kr.c              EUC-KR, EUC-CN encoding
sjis.c                Shift_JIS encoding
big5.c                Big5 encoding
gb18030.c             GB18030 encoding
koi8.c                KOI8 encoding
koi8_r.c              KOI8-R encoding
cp1251.c              CP1251 encoding
iso8859_1.c           ISO-8859-1 (Latin-1)
iso8859_2.c           ISO-8859-2 (Latin-2)
iso8859_3.c           ISO-8859-3 (Latin-3)
iso8859_4.c           ISO-8859-4 (Latin-4)
iso8859_5.c           ISO-8859-5 (Cyrillic)
iso8859_6.c           ISO-8859-6 (Arabic)
iso8859_7.c           ISO-8859-7 (Greek)
iso8859_8.c           ISO-8859-8 (Hebrew)
iso8859_9.c           ISO-8859-9 (Latin-5 or Turkish)
iso8859_10.c          ISO-8859-10 (Latin-6 or Nordic)
iso8859_11.c          ISO-8859-11 (Thai)
iso8859_13.c          ISO-8859-13 (Latin-7 or Baltic Rim)
iso8859_14.c          ISO-8859-14 (Latin-8 or Celtic)
iso8859_15.c          ISO-8859-15 (Latin-9 or West European with Euro)
iso8859_16.c          ISO-8859-16 (Latin-10)
utf8.c                UTF-8 encoding
utf16_be.c            UTF-16BE encoding
utf16_le.c            UTF-16LE encoding
utf32_be.c            UTF-32BE encoding
utf32_le.c            UTF-32LE encoding
unicode.c             common codes of Unicode encoding
unicode_fold_data.c   Unicode folding data
windows/testc.c       Test program for Windows (VC++)

oniguruma's People

Contributors

az13js, contextnerror, cotequeiroz, data-man, diizzyy, efiop, hediyi, ingramz, isaachier, itchyny, iwillspeak, jakub-zwolakowski, k-takata, kkos, kornelski, lgtm-migrator, lopopolo, maflcko, maxnordlund, mhei, nikic, petk, phoebehui, seanofw, shenglei10, swordow, syohex, timgates42, xcorail, zmatsuo

oniguruma's Issues

Curly braces with Oniguruma 6

The pattern %{(.*?)} yields an error "unmatched close parenthesis". This pattern is supposed to match Grok expressions, like %{HOSTNAME}. It worked with Oniguruma 5.9.6.

I am using ONIG_OPTION_DEFAULT and ONIG_SYNTAX_DEFAULT.

Is there any way to make it work with Oniguruma 6?

out of bounds read in mbc_enc_len / utf8.c

The string "\g" will cause an out of bounds read in mbc_enc_len. Code:

#include <oniguruma.h>
int main()
{
    regex_t *reg;
    unsigned char inp[2] = {'\\', 'g' };

    onig_new(&reg, inp, inp + 2, ONIG_OPTION_DEFAULT,
         ONIG_ENCODING_UTF8, ONIG_SYNTAX_DEFAULT, 0);
}

As usual: tested with develop tree, found with libfuzzer and asan.

asan error:

==10986==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffe31c2c742 at pc 0x000000561b46 bp 0x7ffe31c2bc40 sp 0x7ffe31c2bc38
READ of size 1 at 0x7ffe31c2c742 thread T0
    #0 0x561b45 in mbc_enc_len /mnt/ram/oniguruma1/src/utf8.c:65:22
    #1 0x559574 in onigenc_mbc_enc_len_end /mnt/ram/oniguruma1/src/regenc.c:117:9
    #2 0x561c4d in mbc_to_code /mnt/ram/oniguruma1/src/utf8.c:99:9
    #3 0x4fbfbe in fetch_token /mnt/ram/oniguruma1/src/regparse.c:3443:9
    #4 0x4f6f7e in parse_regexp /mnt/ram/oniguruma1/src/regparse.c:5297:7
    #5 0x4f6f7e in onig_parse_make_tree /mnt/ram/oniguruma1/src/regparse.c:5351
    #6 0x51e543 in onig_compile /mnt/ram/oniguruma1/src/regcomp.c:5279:7
    #7 0x544468 in onig_new /mnt/ram/oniguruma1/src/regcomp.c:5518:7
    #8 0x4f233f in main /mnt/ram/oniguruma1/oob-mbc_enc_len.c:7:5
    #9 0x7f57bc43378f in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r2/work/glibc-2.23/csu/../csu/libc-start.c:289
    #10 0x4198f8 in _start (/mnt/ram/oniguruma1/a.out+0x4198f8)

Address 0x7ffe31c2c742 is located in stack of thread T0 at offset 66 in frame
    #0 0x4f224f in main /mnt/ram/oniguruma1/oob-mbc_enc_len.c:3

  This frame has 2 object(s):
    [32, 40) 'reg'
    [64, 66) 'inp' <== Memory access at offset 66 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /mnt/ram/oniguruma1/src/utf8.c:65:22 in mbc_enc_len
Shadow bytes around the buggy address:
  0x10004637d890: 00 00 00 00 00 00 00 00 00 00 00 f2 f2 f2 f2 f2
  0x10004637d8a0: f2 f2 f2 f2 00 00 f3 f3 00 00 00 00 00 00 00 00
  0x10004637d8b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004637d8c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004637d8d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x10004637d8e0: f1 f1 f1 f1 00 f2 f2 f2[02]f3 f3 f3 00 00 00 00
  0x10004637d8f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004637d900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004637d910: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004637d920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10004637d930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==10986==ABORTING

segfault / null pointer access in next_state_val()

Passing the byte sequence 0x5b, 0xff, 0x30 to onig_new() crashes with a null pointer access in next_state_val() (regparse.c). Tested with latest develop branch and libfuzzer.

Test code:

#include <oniguruma.h>
int main()
{
    regex_t *reg;
    unsigned char inp[3] = { 0x5b, 0xff, 0x30 };

    onig_new(&reg, inp, inp + 3, ONIG_OPTION_DEFAULT,
         ONIG_ENCODING_UTF8, ONIG_SYNTAX_DEFAULT, 0);
}

Asan stack trace:

==18989==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000534603 bp 0x7fffa020ff10 sp 0x7fffa020fc90 T0)
    #0 0x534602 in next_state_val /mnt/ram/oniguruma/src/regparse.c:4005:7
    #1 0x51f529 in parse_char_class /mnt/ram/oniguruma/src/regparse.c:4222:11
    #2 0x513836 in parse_exp /mnt/ram/oniguruma/src/regparse.c:5056:11
    #3 0x5106fb in parse_branch /mnt/ram/oniguruma/src/regparse.c:5221:7
    #4 0x5072bd in parse_subexp /mnt/ram/oniguruma/src/regparse.c:5258:7
    #5 0x4faebf in parse_regexp /mnt/ram/oniguruma/src/regparse.c:5303:7
    #6 0x4fa704 in onig_parse_make_tree /mnt/ram/oniguruma/src/regparse.c:5339:7
    #7 0x53e4ef in onig_compile /mnt/ram/oniguruma/src/regcomp.c:5279:7
    #8 0x54e9c2 in onig_new /mnt/ram/oniguruma/src/regcomp.c:5518:7
    #9 0x4f21b4 in main (/mnt/ram/oniguruma/a.out+0x4f21b4)
    #10 0x7f67975e478f in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r2/work/glibc-2.23/csu/../csu/libc-start.c:289
    #11 0x419708 in _start (/mnt/ram/oniguruma/a.out+0x419708)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /mnt/ram/oniguruma/src/regparse.c:4005:7 in next_state_val

out of bounds read in next_state_val / regparse.c

Sample code:

#include <oniguruma.h>
int main()
{
    regex_t *reg;
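    /* Cast needed: a string literal has type char[], not unsigned char[]. */
    const unsigned char *inp = (const unsigned char *)"[0-0-\xe2  ";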

    onig_new(&reg, inp, inp + 8, ONIG_OPTION_DEFAULT,
         ONIG_ENCODING_UTF8, ONIG_SYNTAX_DEFAULT, 0);
}

asan error:

==11115==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60600000f4cc at pc 0x00000051aa28 bp 0x7fff5dbf3f50 sp 0x7fff5dbf3f48
READ of size 4 at 0x60600000f4cc thread T0
    #0 0x51aa27 in next_state_val /mnt/ram/oniguruma1/src/regparse.c:4001:7
    #1 0x5105bf in parse_char_class /mnt/ram/oniguruma1/src/regparse.c:4218:11
    #2 0x504720 in parse_exp /mnt/ram/oniguruma1/src/regparse.c:5052:11
    #3 0x5028cb in parse_branch /mnt/ram/oniguruma1/src/regparse.c:5217:7
    #4 0x4ffe2d in parse_subexp /mnt/ram/oniguruma1/src/regparse.c:5254:7
    #5 0x4f6f98 in parse_regexp /mnt/ram/oniguruma1/src/regparse.c:5299:7
    #6 0x4f6f98 in onig_parse_make_tree /mnt/ram/oniguruma1/src/regparse.c:5351
    #7 0x51e523 in onig_compile /mnt/ram/oniguruma1/src/regcomp.c:5279:7
    #8 0x544448 in onig_new /mnt/ram/oniguruma1/src/regcomp.c:5518:7
    #9 0x4f2317 in main /mnt/ram/oniguruma1/test.c:7:5
    #10 0x7f62e128c78f in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r2/work/glibc-2.23/csu/../csu/libc-start.c:289
    #11 0x4198f8 in _start (/mnt/ram/oniguruma1/a.out+0x4198f8)

AddressSanitizer can not describe address in more detail (wild memory access suspected).
SUMMARY: AddressSanitizer: heap-buffer-overflow /mnt/ram/oniguruma1/src/regparse.c:4001:7 in next_state_val
Shadow bytes around the buggy address:
  0x0c0c7fff9e40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e50: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c0c7fff9e90: fa fa fa fa fa fa fa fa fa[fa]fa fa fa fa fa fa
  0x0c0c7fff9ea0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9eb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9ec0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9ed0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9ee0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==11115==ABORTING

Onig_sd.lib

Hello,

I downloaded a project that uses the Oniguruma library. It links against something like onig_sh.lib, and I don't know how to build that file. Our Oniguruma build only has onig_s.lib. Could you help me with this?

Best Regards,
Nana

Bug in mbc_enc_len

The buffer overflow is found with the following code:

#include <stdio.h>
#include "oniguruma.h"

static int
search(regex_t* reg, unsigned char* str, unsigned char* end)
{
  int r;
  unsigned char *start, *range;
  OnigRegion *region;

  region = onig_region_new();

  start = str;
  range = end;
  r = onig_search(reg, str, end, start, range, region, ONIG_OPTION_NONE);
  if (r >= 0) {
    int i;

    fprintf(stderr, "match at %d  (%s)\n", r,
            ONIGENC_NAME(onig_get_encoding(reg)));
    for (i = 0; i < region->num_regs; i++) {
      fprintf(stderr, "%d: (%d-%d)\n", i, region->beg[i], region->end[i]);
    }
  }
  else if (r == ONIG_MISMATCH) {
    fprintf(stderr, "search fail (%s)\n",
            ONIGENC_NAME(onig_get_encoding(reg)));
  }
  else { /* error */
    char s[ONIG_MAX_ERROR_MESSAGE_LEN];
    onig_error_code_to_str(s, r);
    fprintf(stderr, "ERROR: %s\n", s);
    fprintf(stderr, "  (%s)\n", ONIGENC_NAME(onig_get_encoding(reg)));
    return -1;
  }

  onig_region_free(region, 1 /* 1:free self, 0:free contents only */);
  return 0;
}
static int
exec(OnigEncoding enc, OnigOptionType options,
     char* apattern, char* apttern_end, char* astr, char* end)
{
  int r;
  regex_t* reg;
  OnigErrorInfo einfo;
  UChar* pattern = (UChar* )apattern;
  UChar* str     = (UChar* )astr;

  onig_initialize(&enc, 1);

  r = onig_new(&reg, pattern,
               apttern_end,
               options, enc, ONIG_SYNTAX_DEFAULT, &einfo);
  if (r != ONIG_NORMAL) {
    char s[ONIG_MAX_ERROR_MESSAGE_LEN];
    onig_error_code_to_str(s, r, &einfo);
    fprintf(stderr, "ERROR: %s\n", s);
    return -1;
  }

  r = search(reg, str, end);

  onig_free(reg);
  onig_end();
  return 0;
}
int main() {
    regex_t *reg;
    unsigned char str[] = { 0xc7, 0xd6, 0xfe, 0xea, 0xe0, 0xe2, 0x00 };
    unsigned char input[] = { 0x6a, 0x00, 0x01, 0x6c, 0x7b, 0x00, 0x01, 0x6c, 0x7b, 0x32, 0x32, 0x7d, 0x7b, 0x32,
        0x32, 0x7d, 0x7b, 0x32, 0x32, 0x7d, 0x7b, 0x32, 0x32, 0x7d, 0x7b, 0x32, 0x32, 0x7d, 0x7b,
        0x32, 0x32, 0x7d, 0x7b, 0x32, 0x32, 0x7d, 0x7b, 0x32, 0x32, 0x6a, 0x00, 0x01, 0x6c, 0x7b,
        0x1d, 0x1d, 0x1d, 0x1d, 0x1d, 0x1d, 0x1d, 0x1d, 0x1d, 0x1d, 0x1d, 0x1d, 0x1d, 0x1d, 0x1d,
        0x1d, 0x1d, 0x1d, 0x32, 0x32, 0x7d, 0x7b, 0x32, 0x32, 0x7d, 0x7b, 0x32, 0x32, 0x7d, 0x7b,
        0x32, 0x32, 0x7d, 0x7b, 0x32, 0x32, 0x7d, 0x7b, 0x32, 0x7d, 0x7b, 0x32, 0x32, 0x7d, 0x7b,
        0x32, 0x32, 0x7d, 0x7b, 0x32, 0x7d};
    /* Cast every pointer to char* so the arguments match exec()'s parameter types. */
    exec(ONIG_ENCODING_UTF8, ONIG_OPTION_IGNORECASE,
         (char*)input, (char*)input + 95, (char*)str, (char*)str + 7);
    return 0;
}

The resulting asan error:

==7085==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffed35e53c7 at pc 0x00000045e958 bp 0x7ffed35e4df0 sp 0x7ffed35e4de0
READ of size 1 at 0x7ffed35e53c7 thread T0
    #0 0x45e957 in mbc_enc_len /home/xie/Downloads/oni/oni-asan-dev/src/utf8.c:66
    #1 0x455c8d in forward_search_range /home/xie/Downloads/oni/oni-asan-dev/src/regexec.c:3162
    #2 0x457c56 in onig_search /home/xie/Downloads/oni/oni-asan-dev/src/regexec.c:3614
    #3 0x401138 in search /home/xie/Downloads/oni/oni-asan-dev/test/testc.c:18
    #4 0x401776 in exec /home/xie/Downloads/oni/oni-asan-dev/test/testc.c:65
    #5 0x401b41 in main /home/xie/Downloads/oni/oni-asan-dev/test/testc.c:81
    #6 0x7f6fd15bd82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #7 0x400f58 in _start (/home/xie/Downloads/oni/oni-asan-dev/test/testc+0x400f58)

Address 0x7ffed35e53c7 is located in stack of thread T0 at offset 39 in frame
    #0 0x40185f in main /home/xie/Downloads/oni/oni-asan-dev/test/testc.c:71

  This frame has 2 object(s):
    [32, 39) 'str' <== Memory access at offset 39 overflows this variable
    [96, 191) 'input'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/xie/Downloads/oni/oni-asan-dev/src/utf8.c:66 mbc_enc_len
Shadow bytes around the buggy address:
  0x10005a6b4a20: 00 00 00 00 00 00 00 00 00 00 00 02 f3 f3 f3 f3
  0x10005a6b4a30: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x10005a6b4a40: 00 00 00 00 f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2
  0x10005a6b4a50: 00 00 00 f4 f2 f2 f2 f2 00 00 00 00 00 00 00 00
  0x10005a6b4a60: 00 00 00 02 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00
=>0x10005a6b4a70: 00 00 00 00 f1 f1 f1 f1[07]f4 f4 f4 f2 f2 f2 f2
  0x10005a6b4a80: 00 00 00 00 00 00 00 00 00 00 00 07 f3 f3 f3 f3
  0x10005a6b4a90: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x10005a6b4aa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10005a6b4ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10005a6b4ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
==7085==ABORTING

ABI version bumps?

Question:
Were the last two ABI version number bumps really needed?

LTVERSION="2:0:0" -> cygonig-2.dll
LTVERSION="3:0:0" -> cygonig-3.dll
LTVERSION="4:0:0" -> cygonig-4.dll

I saw from HISTORY that functions were added, but I don't see any removal or change of the interfaces
of existing functions, so the changes could be compatible with old binaries.
The expected guideline is
https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html

Would incrementing age have been possible?
LTVERSION="3:0:1" -> cygonig-2.dll
LTVERSION="4:0:2" -> cygonig-2.dll

For me as the Cygwin package maintainer, ABI version bumps always mean additional work.

Match regression

(?:^|\G)(#{1,6})\s*(?=[\S[^#]]) fails to match ## in the string "## This is *great* stuff."

This appears to be a regression, as the same expression matches when run against the Atom Node Oniguruma version of the Oniguruma library.

I am not sure of the root cause, but it appears to be related to the (?=[\S[^#]]) portion of the regex: removing it makes the regex match again.

[feature request] Please add a function that returns the text of the regular expression from OnigRegex

This would be convenient for debugging. For example, I have a system that uses many large precompiled regular expressions. When something goes wrong, I need to read the regular expressions back to see what is wrong. One way is to keep their texts alongside the OnigRegex objects, but that consumes memory just for debugging. The expressions are already there, in the OnigRegex objects, but I can't read them back.

Please add an API function that returns the text of the expression as a newly allocated char* string. This would be useful primarily for debugging.

[Bug] ^ does not work after ?<=

I use Sublime Text, which uses Oniguruma's regular expression syntax in its syntax definitions. I ran into a bug there and reported it on the Sublime Text issue tracker. Will Bond, a developer of Sublime Text, replied:

The oniguruma regex engine used when lookbehinds are present does not seem to support anchors in lookbehinds

My problem:


I wrote my own small syntax for tables.

My syntax: https://gist.github.com/Kristinita/4105b1b48fb3dfc7ba13a0c09f9fb216
My color scheme for syntax: https://github.com/Kristinita/SashaSublime/blob/master/SashaSublime.tmTheme
Example file that uses this syntax: https://gist.github.com/Kristinita/39286e4165fa74ea54b52aa41b14ef81

In this syntax the following pattern does not work, although it works correctly on the regex101.com site:

Pattern: (?<=^\| )[0-9]{1,2}(?= )
regex101: https://regex101.com/r/gR8lV6/1

Screen1

I want to consume the digits between | and a space only when | is at the beginning of a line. If I leave out the ^ symbol, digits are also consumed in other places; see https://regex101.com/r/gR8lV6/2:

Screen2

I do not want to consume digits in other places, as happens in screen 2. But in Sublime Text, with the pattern

- match: (?<=^\| )[0-9]{1,2}(?= )
  scope: kira.numbers

no digits are highlighted at all. The pattern (?<=\| )[0-9]{1,2}(?= ), without the ^ anchor, does work for me in Sublime Text.


Environment

Operating system and version:
Windows 10.0.14393
Sublime Text:
Build 3126

Thanks.

Please fix security issues CVE-2017-9224, CVE-2017-9225, CVE-2017-9226, CVE-2017-9227, CVE-2017-9228, CVE-2017-9229

Buffer Overflow in onigenc_unicode_get_case_fold_codes_by_str()

This buffer overflow is found in the latest develop branch with the code:

#include <oniguruma.h>
int main() {
    regex_t *reg;
    const OnigUChar* inp = (const OnigUChar*)"\x3f\xff\x63\x7f\xff\xff\xff\xff\x4d\x22\x00\x00";
    if (onig_new
        (&reg, inp, inp+12,ONIG_OPTION_IGNORECASE , ONIG_ENCODING_UTF32_BE,
         ONIG_SYNTAX_DEFAULT, 0) == 0)
        onig_free(reg);
    return 0;
}

Error reported in asan:

 =================================================================
==1944==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffc56669ea4 at pc 0x0000004483f9 bp 0x7ffc56669aa0 sp 0x7ffc56669a90
WRITE of size 4 at 0x7ffc56669ea4 thread T0
    #0 0x4483f8 in onigenc_unicode_get_case_fold_codes_by_str /home/xie/Downloads/oni/oni-asan-dev/src/unicode.c:553
    #1 0x44354b in utf32be_get_case_fold_codes_by_str /home/xie/Downloads/oni/oni-asan-dev/src/utf32_be.c:170
    #2 0x4318c5 in expand_case_fold_string /home/xie/Downloads/oni/oni-asan-dev/src/regcomp.c:3431
    #3 0x432279 in setup_tree /home/xie/Downloads/oni/oni-asan-dev/src/regcomp.c:3733
    #4 0x43c41f in onig_compile /home/xie/Downloads/oni/oni-asan-dev/src/regcomp.c:5361
    #5 0x43d092 in onig_new /home/xie/Downloads/oni/oni-asan-dev/src/regcomp.c:5565
    #6 0x401082 in main /home/xie/Downloads/oni/oni-asan-dev/test/testc.c:5
    #7 0x7f4ca3cc682f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #8 0x400ec8 in _start (/home/xie/Downloads/oni/oni-asan-dev/test/testc+0x400ec8)

Address 0x7ffc56669ea4 is located in stack of thread T0 at offset 420 in frame
    #0 0x43160b in expand_case_fold_string /home/xie/Downloads/oni/oni-asan-dev/src/regcomp.c:3411

  This frame has 3 object(s):
    [32, 40) 'prev_node'
    [96, 104) 'srem'
    [160, 420) 'items' <== Memory access at offset 420 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/xie/Downloads/oni/oni-asan-dev/src/unicode.c:553 onigenc_unicode_get_case_fold_codes_by_str
Shadow bytes around the buggy address:
  0x10000acc5380: 00 00 f4 f4 f3 f3 f3 f3 00 00 00 00 00 00 00 00
  0x10000acc5390: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000acc53a0: f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 f4 f4 f4
  0x10000acc53b0: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000acc53c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x10000acc53d0: 00 00 00 00[04]f4 f4 f4 f3 f3 f3 f3 00 00 00 00
  0x10000acc53e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000acc53f0: 00 00 00 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3
  0x10000acc5400: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
  0x10000acc5410: f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 00 f4 f4 f2 f2
  0x10000acc5420: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
==1944==ABORTING

emoji characters

It seems that oniguruma has some problems working with emoji Unicode characters. Unit tests in MagicPython are built on top of the first-mate library, which uses node-oniguruma, which in turn uses oniguruma v0.9.6. When we tried to investigate this issue, we discovered that oniguruma consumes the "🕐" character (\xf0\x9f\x95\x90 in UTF-8) with \w or ., but then enters some strange state. It appears that there are some bugs in the Atom editor that might be related to this issue.

Heap corruption in next_state_val() in 15 encodings

This heap corruption has a different cause than issues #18 and #21; it was found after applying the patches for both. It affects the latest PHP 5/7 installations with mbstring enabled. When the regular expression comes from the network, this can be considered a security issue.

Tested on a 32-bit ASAN build; any one of the 15 encodings below causes an out-of-bounds write:

$ cat mb_regex_min.php

<?php 
if (!extension_loaded('mbstring')) print "mbstring not loaded.\n";
if (!function_exists('mb_regex_encoding')) print "mb_regex_encoding() is not available\n";
if (!function_exists('mb_ereg_search_pos')) print "mb_ereg_search_pos() is not available\n";
if (!function_exists('mb_ereg_search_init')) print "mb_ereg_search_init() is not available\n";
$encoding = array(
                  'ASCII',
                  'ISO-8859-1',
                  'ISO-8859-2',
                  'ISO-8859-3',	
                  'ISO-8859-4',
                  'ISO-8859-5',
                  'ISO-8859-6',
                  'ISO-8859-7',
                  'ISO-8859-8',	
                  'ISO-8859-9',
                  'ISO-8859-10',
                  'ISO-8859-13',
                  'ISO-8859-14',
                  'ISO-8859-15',
                  'KOI8-R');

$enc_id = rand(0, count($encoding) - 1);
echo "*** testing encoding " . $encoding[$enc_id] . " ***\n";
mb_regex_encoding($encoding[$enc_id]);
if(mb_ereg_search_init("a")) {
    var_dump(mb_ereg_search_pos('[\\6000'));
}
?>

$ bin/php mb_regex_min.php

=================================================================
==20048== ERROR: AddressSanitizer: heap-buffer-overflow on address 0xb540d948 at pc 0x9e384dc bp 0xbfbc8948 sp 0xbfbc893c
READ of size 4 at 0xb540d948 thread T0
    #0 0x9e384db in next_state_val /home/xie/php-7.1.5/ext/mbstring/oniguruma/regparse.c:4087
    #1 0x9e47185 in parse_char_class /home/xie/php-7.1.5/ext/mbstring/oniguruma/regparse.c:4306
    #2 0x9e85627 in parse_exp /home/xie/php-7.1.5/ext/mbstring/oniguruma/regparse.c:5272
    #3 0x9ea4210 in parse_branch /home/xie/php-7.1.5/ext/mbstring/oniguruma/regparse.c:5437
    #4 0x9ea57ee in parse_subexp /home/xie/php-7.1.5/ext/mbstring/oniguruma/regparse.c:5474
    #5 0x9ea7ed2 in parse_regexp /home/xie/php-7.1.5/ext/mbstring/oniguruma/regparse.c:5518
    #6 0x9ea7ed2 in onig_parse_make_tree /home/xie/php-7.1.5/ext/mbstring/oniguruma/regparse.c:5545
    #7 0x9d62da8 in onig_compile /home/xie/php-7.1.5/ext/mbstring/oniguruma/regcomp.c:5305
    #8 0x9d75ba1 in onig_new /home/xie/php-7.1.5/ext/mbstring/oniguruma/regcomp.c:5550
    #9 0xa1b197b in php_mbregex_compile_pattern /home/xie/php-7.1.5/ext/mbstring/php_mbregex.c:456
    #10 0xa1d70a4 in _php_mb_regex_ereg_search_exec /home/xie/php-7.1.5/ext/mbstring/php_mbregex.c:1241
    #11 0xa1d70a4 in zif_mb_ereg_search_pos /home/xie/php-7.1.5/ext/mbstring/php_mbregex.c:1331
    #12 0xc540ded in ZEND_DO_ICALL_SPEC_RETVAL_USED_HANDLER /home/xie/php-7.1.5/Zend/zend_vm_execute.h:675
    #13 0xc3afba1 in execute_ex /home/xie/php-7.1.5/Zend/zend_vm_execute.h:429
    #14 0xcf50612 in zend_execute /home/xie/php-7.1.5/Zend/zend_vm_execute.h:474
    #15 0xbcb52df in zend_execute_scripts /home/xie/php-7.1.5/Zend/zend.c:1476
    #16 0xb3fd433 in php_execute_script /home/xie/php-7.1.5/main/main.c:2537
    #17 0xcf6e3b1 in do_cli /home/xie/php-7.1.5/sapi/cli/php_cli.c:993
    #18 0x811a9f6 in main /home/xie/php-7.1.5/sapi/cli/php_cli.c:1381
    #19 0xb5fbba82 (/lib/i386-linux-gnu/libc.so.6+0x19a82)
    #20 0x811c8b0 in _start (/home/xie/php_fuzz/bin/php+0x811c8b0)
0xb540d948 is located 8 bytes to the left of 44-byte region [0xb540d950,0xb540d97c)
allocated by thread T0 here:
    #0 0xb61ca854 (/usr/lib/i386-linux-gnu/libasan.so.0+0x16854)
    #1 0x9ea61a3 in node_new /home/xie/php-7.1.5/ext/mbstring/oniguruma/regparse.c:1120
    #2 0x9ea61a3 in onig_node_new_alt /home/xie/php-7.1.5/ext/mbstring/oniguruma/regparse.c:1257
    #3 0x9ea61a3 in parse_subexp /home/xie/php-7.1.5/ext/mbstring/oniguruma/regparse.c:5492
    #4 0x3fffffff (+0x17ffffff)
SUMMARY: AddressSanitizer: heap-buffer-overflow /home/xie/php-7.1.5/ext/mbstring/oniguruma/regparse.c:4118 next_state_val
Shadow bytes around the buggy address:
  0x36a81ad0: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x36a81ae0: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x36a81af0: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x36a81b00: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x36a81b10: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
=>0x36a81b20: fa fa 00 00 00 00 00 04 fa[fa]00 00 00 00 00 04
  0x36a81b30: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x36a81b40: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 04
  0x36a81b50: fa fa 00 00 00 00 00 04 fa fa 00 00 00 00 00 00
  0x36a81b60: fa fa 00 00 00 00 00 00 fa fa fd fd fd fd fd fa
  0x36a81b70: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:     fa
  Heap righ redzone:     fb
  Freed Heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==20048== ABORTING
Aborted

Libfuzzer stub to use clang's fuzz testing functionality

All the memory safety bugs I reported recently were found with libfuzzer.
I'm attaching stub code here that can be used to test oniguruma with libFuzzer. You may want to include it in the source tree; if not, it can just stay here in the bug tracker.

I've added some usage instructions as a comment. Feel free to reuse under whatever license you feel suitable.
libfuzzer-onig.zip

SIGSEGV in left_adjust_char_head() due to bad dereference

Test code:

#include <stdio.h>
#include "oniguruma.h"

static int
search(regex_t* reg, unsigned char* str, unsigned char* end)
{
  int r;
  unsigned char *start, *range;
  OnigRegion *region;

  region = onig_region_new();

  start = str;
  range = end;
  r = onig_search(reg, str, end, start, range, region, ONIG_OPTION_NONE);
  if (r >= 0) {
    int i;

    fprintf(stderr, "match at %d  (%s)\n", r,
            ONIGENC_NAME(onig_get_encoding(reg)));
    for (i = 0; i < region->num_regs; i++) {
      fprintf(stderr, "%d: (%d-%d)\n", i, region->beg[i], region->end[i]);
    }
  }
  else if (r == ONIG_MISMATCH) {
    fprintf(stderr, "search fail (%s)\n",
            ONIGENC_NAME(onig_get_encoding(reg)));
  }
  else { /* error */
    char s[ONIG_MAX_ERROR_MESSAGE_LEN];
    onig_error_code_to_str(s, r);
    fprintf(stderr, "ERROR: %s\n", s);
    fprintf(stderr, "  (%s)\n", ONIGENC_NAME(onig_get_encoding(reg)));
    return -1;
  }

  onig_region_free(region, 1 /* 1:free self, 0:free contents only */);
  return 0;
}
static int
exec(OnigEncoding enc, OnigOptionType options,
     char* apattern, char* apattern_end, char* astr, char* end)
{
  int r;
  regex_t* reg;
  OnigErrorInfo einfo;
  UChar* pattern = (UChar* )apattern;
  UChar* str     = (UChar* )astr;

  onig_initialize(&enc, 1);

  r = onig_new(&reg, pattern,
               (UChar* )apattern_end,
               options, enc, ONIG_SYNTAX_DEFAULT, &einfo);
  if (r != ONIG_NORMAL) {
    char s[ONIG_MAX_ERROR_MESSAGE_LEN];
    onig_error_code_to_str(s, r, &einfo);
    fprintf(stderr, "ERROR: %s\n", s);
    return -1;
  }

  r = search(reg, str, (UChar* )end);

  onig_free(reg);
  onig_end();
  return 0;
}

int main() {
    static unsigned char str[] = { 0xc7, 0xd6, 0xfe, 0xea, 0xe0, 0xe2, 0x00 };
    /* 20-byte pattern; it starts with a NUL byte, so the end pointer is given explicitly */
    static char inp[] = "\x00\x7c\x2e\x7b\x39\x7d\x7b\x39\x30\x7d\x7b\x39\x7d\x7b\x2c\x39\x30\x30\x7d\x30";
    int r = exec(ONIG_ENCODING_EUC_JP, ONIG_OPTION_NONE, inp, inp + 20,
                 (char* )str, (char* )(str + 7));
    return 0;
}

ASAN output:

=================================================================
==26842==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x00000045eef4 bp 0x7ffe5a633330 sp 0x7ffe5a633310 T0)
    #0 0x45eef3 in left_adjust_char_head /home/xie/Downloads/oni/oni-asan-develop/src/euc_jp.c:194
    #1 0x45914a in onigenc_get_right_adjust_char_head_with_prev /home/xie/Downloads/oni/oni-asan-develop/src/regenc.c:78
    #2 0x4561ba in forward_search_range /home/xie/Downloads/oni/oni-asan-develop/src/regexec.c:3240
    #3 0x457930 in onig_search /home/xie/Downloads/oni/oni-asan-develop/src/regexec.c:3611
    #4 0x401148 in search /home/xie/Downloads/oni/oni-asan-develop/test/testc.c:15
    #5 0x401786 in exec /home/xie/Downloads/oni/oni-asan-develop/test/testc.c:62
    #6 0x40189e in main /home/xie/Downloads/oni/oni-asan-develop/test/testc.c:71
    #7 0x7fd6ce5c682f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #8 0x400f68 in _start (/home/xie/Downloads/oni/oni-asan-develop/test/testc+0x400f68)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/xie/Downloads/oni/oni-asan-develop/src/euc_jp.c:194 left_adjust_char_head
==26842==ABORTING

In regcomp.c:4995

static void
set_optimize_map_info(regex_t* reg, OptMapInfo* m)
{
  int i;

  for (i = 0; i < ONIG_CHAR_TABLE_SIZE; i++)
    reg->map[i] = m->map[i];

  reg->optimize   = ONIG_OPTIMIZE_MAP;
  reg->dmin       = m->mmd.min;
  reg->dmax       = m->mmd.max;  /* 19683000 for this pattern */

  if (reg->dmin != ONIG_INFINITE_DISTANCE) {
    reg->threshold_len = reg->dmin + 1;
  }
}

Later, reg->dmax is used in pointer arithmetic in forward_search_range(), which produces an invalid pointer at regexec.c:3238:

      if (reg->dmax != ONIG_INFINITE_DISTANCE) {
        *low = p - reg->dmax;
        if (*low > s) {
          *low = onigenc_get_right_adjust_char_head_with_prev(reg->enc, s,
                                          *low, (const UChar** )low_prev);
          if (low_prev && IS_NULL(*low_prev))
            *low_prev = onigenc_get_prev_char_head(reg->enc,
                                                   (pprev ? pprev : s), *low);
        }

Bad dereference:

(gdb) r
Starting program: /home/xie/Downloads/oni/oni-asan-develop/test/testc 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x000000000045eef4 in left_adjust_char_head (start=0x66b140 <str> "\307\326\376\352\340", <incomplete sequence \342>, s=0xffffffffff3a5a8e <error: Cannot access memory at address 0xffffffffff3a5a8e>)
    at euc_jp.c:194
194	  while (!eucjp_islead(*p) && p > start) p--;
(gdb) 
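One way to avoid forming the invalid pointer seen above is to compare the distance against the bytes actually available before subtracting. This is only a sketch of the idea, not the fix applied upstream; `clamp_low` is a hypothetical helper, and the `unsigned int` distance type is an assumption.

```c
#include <stddef.h>

/* Hypothetical helper, not part of oniguruma.
   Computes p - dmax without ever forming a pointer below the start
   of the buffer. Requires start <= p (both inside the same buffer). */
static const unsigned char*
clamp_low(const unsigned char* start, const unsigned char* p,
          unsigned int dmax)
{
  size_t avail = (size_t)(p - start);  /* bytes between start and p */

  if ((size_t)dmax > avail)
    return start;                      /* p - dmax would leave the buffer */

  return p - dmax;
}
```

With the values from this report (a 7-byte subject string and a distance of 19683000), the helper returns the start of the string instead of an address far below it.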

Does not build with node7 on windows

$ if not defined npm_config_node_gyp (node "c:\Program Files\nodejs\node_modules\npm\bin\node-gyp-bin\\..\..\node_modules\node-gyp\bin\node-gyp.js" rebuild )  else (node "" rebuild )
gyp ERR! UNCAUGHT EXCEPTION
gyp ERR! stack Error: spawn msbuild ENOENT
gyp ERR! stack     at exports._errnoException (util.js:1026:11)
gyp ERR! stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:193:32)
gyp ERR! stack     at onErrorNT (internal/child_process.js:359:16)
gyp ERR! stack     at _combinedTickCallback (internal/process/next_tick.js:74:11)
gyp ERR! stack     at process._tickCallback (internal/process/next_tick.js:98:9)
gyp ERR! System Windows_NT 10.0.14393
gyp ERR! command "c:\\program files\\nodejs\\node.exe" "c:\\Program Files\\nodejs\\node_modules\\npm\\node_modules\\node-gyp\\bin\\node-gyp.js" "rebuild"
gyp ERR! cwd D:\github\vscode\node_modules\oniguruma
gyp ERR! node -v v7.0.0
gyp ERR! node-gyp -v v3.4.0
gyp ERR! This is a bug in `node-gyp`.
gyp ERR! Try to update node-gyp and file an Issue if it does not help:
gyp ERR!     <https://github.com/nodejs/node-gyp/issues>
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@github:bpasero/fsevents#vscode (node_modules\chokidar\node_modules\fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"win32","arch":"x64"})
npm ERR! Windows_NT 10.0.14393
npm ERR! argv "c:\\Program Files\\nodejs\\node.exe" "c:\\Program Files\\nodejs\\node_modules\\npm\\bin\\npm-cli.js" "install" "oniguruma"
npm ERR! node v7.0.0
npm ERR! npm  v3.10.8
npm ERR! code ELIFECYCLE

npm ERR! [email protected] install: `node-gyp rebuild`
npm ERR! Exit status 7
npm ERR!
npm ERR! Failed at the [email protected] install script 'node-gyp rebuild'.
npm ERR! Make sure you have the latest version of node.js and npm installed.
npm ERR! If you do, this is most likely a problem with the oniguruma package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR!     node-gyp rebuild
npm ERR! You can get information on how to open an issue for this project with:
npm ERR!     npm bugs oniguruma
npm ERR! Or if that isn't available, you can get their info via:
npm ERR!     npm owner ls oniguruma
npm ERR! There is likely additional logging output above.

npm ERR! Please include the following file with any support request:
npm ERR!     D:\github\vscode\npm-debug.log

Use after free when parsing "(|"

The current develop branch has a use after free bug when parsing the regexp "(|". Here's some example code to trigger this:

#include <oniguruma.h>
int main() {
    regex_t *reg;
    const OnigUChar * inp = (const OnigUChar * )"(|";
    if (onig_new
        (&reg, inp, inp+2, ONIG_OPTION_DEFAULT, ONIG_ENCODING_UTF8,
         ONIG_SYNTAX_DEFAULT, 0) == 0)
        onig_free(reg);
    return 0;
}

I suspect that this got introduced when fixing issue #26.

Here's the stack trace from address sanitizer:

==3544==ERROR: AddressSanitizer: heap-use-after-free on address 0x60600000eea0 at pc 0x0000004f4365 bp 0x7ffd635e32f0 sp 0x7ffd635e32e8
READ of size 4 at 0x60600000eea0 thread T0
    #0 0x4f4364 in onig_node_free /mnt/ram/x/oniguruma1/src/regparse.c:1020:11
    #1 0x4f4314 in onig_node_free /mnt/ram/x/oniguruma1/src/regparse.c:1030:5
    #2 0x508bf1 in parse_enclose /mnt/ram/x/oniguruma1/src/regparse.c:4664:5
    #3 0x508bf1 in parse_exp /mnt/ram/x/oniguruma1/src/regparse.c:4915
    #4 0x502fcf in parse_branch /mnt/ram/x/oniguruma1/src/regparse.c:5241:7
    #5 0x50021d in parse_subexp /mnt/ram/x/oniguruma1/src/regparse.c:5284:7
    #6 0x4f6f78 in parse_regexp /mnt/ram/x/oniguruma1/src/regparse.c:5331:7
    #7 0x4f6f78 in onig_parse_make_tree /mnt/ram/x/oniguruma1/src/regparse.c:5362
    #8 0x51ed42 in onig_compile /mnt/ram/x/oniguruma1/src/regcomp.c:5283:7
    #9 0x544ac8 in onig_new /mnt/ram/x/oniguruma1/src/regcomp.c:5522:7
    #10 0x4f2320 in main /mnt/ram/x/oniguruma1/onig-useafterfree-onig_node_free.c:8:6
    #11 0x7f2b6da246ff in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r2/work/glibc-2.23/csu/../csu/libc-start.c:289
    #12 0x4198f8 in _start (/mnt/ram/x/oniguruma1/a.out+0x4198f8)

0x60600000eea0 is located 0 bytes inside of 56-byte region [0x60600000eea0,0x60600000eed8)
freed by thread T0 here:
    #0 0x4c1510 in __interceptor_free (/mnt/ram/x/oniguruma1/a.out+0x4c1510)
    #1 0x4f4712 in onig_node_free /mnt/ram/x/oniguruma1/src/regparse.c:1071:3
    #2 0x508ab6 in parse_enclose /mnt/ram/x/oniguruma1/src/regparse.c:4662:7
    #3 0x508ab6 in parse_exp /mnt/ram/x/oniguruma1/src/regparse.c:4915
    #4 0x502fcf in parse_branch /mnt/ram/x/oniguruma1/src/regparse.c:5241:7
    #5 0x50021d in parse_subexp /mnt/ram/x/oniguruma1/src/regparse.c:5284:7
    #6 0x4f6f78 in parse_regexp /mnt/ram/x/oniguruma1/src/regparse.c:5331:7
    #7 0x4f6f78 in onig_parse_make_tree /mnt/ram/x/oniguruma1/src/regparse.c:5362
    #8 0x51ed42 in onig_compile /mnt/ram/x/oniguruma1/src/regcomp.c:5283:7
    #9 0x544ac8 in onig_new /mnt/ram/x/oniguruma1/src/regcomp.c:5522:7
    #10 0x0  (<unknown module>)

previously allocated by thread T0 here:
    #0 0x4c1818 in __interceptor_malloc (/mnt/ram/x/oniguruma1/a.out+0x4c1818)
    #1 0x503823 in node_new /mnt/ram/x/oniguruma1/src/regparse.c:1079:18
    #2 0x503823 in node_new_str /mnt/ram/x/oniguruma1/src/regparse.c:1410
    #3 0x503823 in node_new_empty /mnt/ram/x/oniguruma1/src/regparse.c:1442
    #4 0x503823 in parse_exp /mnt/ram/x/oniguruma1/src/regparse.c:4910
    #5 0x502fcf in parse_branch /mnt/ram/x/oniguruma1/src/regparse.c:5241:7
    #6 0x5004b5 in parse_subexp /mnt/ram/x/oniguruma1/src/regparse.c:5299:11
    #7 0x508ab6 in parse_enclose /mnt/ram/x/oniguruma1/src/regparse.c:4662:7
    #8 0x508ab6 in parse_exp /mnt/ram/x/oniguruma1/src/regparse.c:4915
    #9 0x502fcf in parse_branch /mnt/ram/x/oniguruma1/src/regparse.c:5241:7
    #10 0x50021d in parse_subexp /mnt/ram/x/oniguruma1/src/regparse.c:5284:7
    #11 0x4f6f78 in parse_regexp /mnt/ram/x/oniguruma1/src/regparse.c:5331:7
    #12 0x4f6f78 in onig_parse_make_tree /mnt/ram/x/oniguruma1/src/regparse.c:5362
    #13 0x51ed42 in onig_compile /mnt/ram/x/oniguruma1/src/regcomp.c:5283:7
    #14 0x544ac8 in onig_new /mnt/ram/x/oniguruma1/src/regcomp.c:5522:7
    #15 0x0  (<unknown module>)

SUMMARY: AddressSanitizer: heap-use-after-free /mnt/ram/x/oniguruma1/src/regparse.c:1020:11 in onig_node_free
Shadow bytes around the buggy address:
  0x0c0c7fff9d80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9d90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9da0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9db0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9dc0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 fa
=>0x0c0c7fff9dd0: fa fa fa fa[fd]fd fd fd fd fd fd fa fa fa fa fa
  0x0c0c7fff9de0: fd fd fd fd fd fd fd fa fa fa fa fa fd fd fd fd
  0x0c0c7fff9df0: fd fd fd fa fa fa fa fa 00 00 00 00 00 00 00 fa
  0x0c0c7fff9e00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb

Heap corruption in next_state_val() due to uninitialized local variable

The following code triggers non-deterministic behavior. With ASAN enabled on a 64-bit platform, the crash reproduces within 3-10 runs.

#include <stdio.h>
#include "oniguruma.h"

static int
exec(OnigEncoding enc, OnigOptionType options,
     char* apattern, char* astr,  int pattern_len,
     unsigned char *end, OnigSyntaxType* syntax)
{
    int r;
    regex_t* reg;
    OnigErrorInfo einfo;
    UChar* pattern = (UChar* )apattern;
    UChar* str     = (UChar* )astr;

    onig_initialize(&enc, 1);

    r = onig_new(&reg, pattern,
                 pattern + pattern_len,
                 options, enc, syntax, &einfo);
    if (r != ONIG_NORMAL) {
        char s[ONIG_MAX_ERROR_MESSAGE_LEN];
        onig_error_code_to_str(s, r, &einfo);
        fprintf(stderr, "ERROR: %s\n", s);
        return -1;
    }

    onig_free(reg);
    onig_end();
    return 0;
}

extern int main(int argc, char* argv[])
{
    int r;
    /* ISO 8859-1 test */
    static unsigned char str[] = { 0xc7, 0xd6, 0xfe, 0xea, 0xe0, 0xe2, 0x00 };

    char* pattern = "\x5b\x5c\x48\x2d\xb0\x30\x8d\x30\x2a\x5b\x5d\x20\x20\x5d"
        "\xf9\x54\x00\x7f\x5c\x63\xef\xef\xef\xef\x52\xf7\xf7\x52"
        "\xf7\xeb\xeb\x70\x2b\xf7\x7b\x30\x2c\x32\x7d";

    r = exec(ONIG_ENCODING_GB18030, ONIG_OPTION_IGNORECASE,pattern, (char*) str, 39, str + 7, ONIG_SYNTAX_DEFAULT);
    r = exec(ONIG_ENCODING_GB18030, ONIG_OPTION_IGNORECASE,pattern, (char*) str, 39, str + 7, ONIG_SYNTAX_DEFAULT);

    return r;
}

With some debug output added:

static int
parse_char_class(Node** np, OnigToken* tok, UChar** src, UChar* end,
		 ScanEnv* env)
{
  int r, neg, len, fetched, and_start;
  OnigCodePoint v, vs;
  UChar *p;
  Node* node;
  CClassNode *cc, *prev_cc;
  CClassNode work_cc;

  printf("*vs (init) = %lu\n", (unsigned long)vs);
  /* ... */

static void
bitset_set_range(BitSetRef bs, int from, int to)
{
  int i;
  for (i = from; i <= to && i < SINGLE_BYTE_SIZE; i++) {
    fprintf(stderr, "bs=%p, i=%lu\n", (unsigned int*)bs, i);
    BITSET_SET_BIT(bs, i);
  }
}

ASAN report:

*vs (init) = 2953750392
*vs (from) = 2953750392, v=2955971888
bs=0x60600000ef08, i=2953750392
ASAN:SIGSEGV
=================================================================
==22101==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x0000004030d0 bp 0x7ffe925592c0 sp 0x7ffe925592a0 T0)
    #0 0x4030cf in bitset_set_range /home/xie/Downloads/oni/onig-test-develop/src/regparse.c:203
    #1 0x419b72 in next_state_val /home/xie/Downloads/oni/onig-test-develop/src/regparse.c:4142
    #2 0x41ab77 in parse_char_class /home/xie/Downloads/oni/onig-test-develop/src/regparse.c:4329
    #3 0x41f7b6 in parse_exp /home/xie/Downloads/oni/onig-test-develop/src/regparse.c:5174
    #4 0x420728 in parse_branch /home/xie/Downloads/oni/onig-test-develop/src/regparse.c:5339
    #5 0x420bae in parse_subexp /home/xie/Downloads/oni/onig-test-develop/src/regparse.c:5385
    #6 0x420fc8 in parse_regexp /home/xie/Downloads/oni/onig-test-develop/src/regparse.c:5433
    #7 0x421454 in onig_parse_make_tree /home/xie/Downloads/oni/onig-test-develop/src/regparse.c:5464
    #8 0x43de89 in onig_compile /home/xie/Downloads/oni/onig-test-develop/src/regcomp.c:5326
    #9 0x43ed8b in onig_new /home/xie/Downloads/oni/onig-test-develop/src/regcomp.c:5565
    #10 0x4011d8 in exec /home/xie/Downloads/oni/onig-test-develop/test/testc.c:17
    #11 0x4013f6 in main /home/xie/Downloads/oni/onig-test-develop/test/testc.c:43
    #12 0x7f62aebd082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #13 0x400f98 in _start (/home/xie/Downloads/oni/onig-test-develop/test/testc+0x400f98)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/xie/Downloads/oni/onig-test-develop/src/regparse.c:203 bitset_set_range
==22101==ABORTING

The reproducer is probabilistic: it triggers a heap out-of-bounds write when the local variable OnigCodePoint vs in parse_char_class() is left uninitialized. The call chain is:

parse_char_class() -> next_state_val() -> bitset_set_range()

which results in the crash shown above.

Note that both calls to exec() are currently necessary to trigger the crash.
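A defensive pattern for this kind of bug is to initialize the range start at its declaration and to have the bitset setter validate its arguments as unsigned values, so a garbage code point cannot turn into a negative index. The following is a sketch under those assumptions, not the upstream fix; `ByteSet`, `byteset_set_range`, and `byteset_test` are hypothetical names.

```c
#define SINGLE_BYTE_SIZE 256

/* Hypothetical 256-bit set, one bit per single-byte code. */
typedef struct {
  unsigned char bits[SINGLE_BYTE_SIZE / 8];
} ByteSet;

/* Sets bits from..to (inclusive), clamped to the single-byte range.
   Arguments are unsigned, so huge values stay huge instead of
   becoming negative indexes. Returns -1 for an empty range. */
static int
byteset_set_range(ByteSet* bs, unsigned long from, unsigned long to)
{
  unsigned long i;

  if (from > to) return -1;
  if (from >= SINGLE_BYTE_SIZE) return 0;  /* nothing to set here */
  if (to >= SINGLE_BYTE_SIZE) to = SINGLE_BYTE_SIZE - 1;

  for (i = from; i <= to; i++)
    bs->bits[i / 8] |= (unsigned char)(1u << (i % 8));

  return 0;
}

/* Returns 1 if bit i is set, 0 otherwise (including out of range). */
static int
byteset_test(const ByteSet* bs, unsigned int i)
{
  return (i < SINGLE_BYTE_SIZE) && ((bs->bits[i / 8] >> (i % 8)) & 1);
}
```

With the values printed in the report (from = 2953750392, to = 2955971888), this version returns without touching memory, because both values lie outside the single-byte range.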

testcases print warnings

testc.c:857:11: warning: illegal character encoding in string literal [-Winvalid-source-encoding]
  x2("a<b><A5>С<BC><A5><B8><A5><E7><A5><F3><A4>Υ<C0><A5><A6><A5><F3><A5><ED><A1><BC><A5><C9><\\/b>", "a<b><A5>С<BC><A5><B8><A5><E7><A5><F3><A4>Υ<C0><A5><A6><A5><F3><A5><ED><A1><BC>...
          ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
testc.c:857:49: warning: illegal character encoding in string literal [-Winvalid-source-encoding]
  ..."a<b><A5>С<BC><A5><B8><A5><E7><A5><F3><A4>Υ<C0><A5><A6><A5><F3><A5><ED><A1><BC><A5><C9></b>", 0, 32);
          ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
testc.c:858:11: warning: illegal character encoding in string literal [-Winvalid-source-encoding]
  x2(".<b><A5>С<BC><A5><B8><A5><E7><A5><F3><A4>Υ<C0><A5><A6><A5><F3><A5><ED><A1><BC><A5><C9><\\/b>", "a<b><A5>С<BC><A5><B8><A5><E7><A5><F3><A4>Υ<C0><A5><A6><A5><F3><A5><ED><A1><BC>...
          ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
testc.c:858:49: warning: illegal character encoding in string literal [-Winvalid-source-encoding]
  ..."a<b><A5>С<BC><A5><B8><A5><E7><A5><F3><A4>Υ<C0><A5><A6><A5><F3><A5><ED><A1><BC><A5><C9></b>", 0, 32);
          ^~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
507 warnings generated.

Found on FreeBSD 11.

Positive lookahead doesn't look beyond the current line?

I want to make sure a character is followed by something that is not necessarily on the current line. When that something is on the next line, using (?=) doesn't work, as in

/(\/)(?=[^\/]*\/[ix]*)/

If /ix doesn't appear on the same line as the previous /, the regexp doesn't match.

If that's really the case, is there a way to make lookahead work across multiple lines?

Memory leaks after onig_new(), onig_free() when parsing regexp "\(("

For some inputs, onig_new() allocates memory that is never freed, even if one subsequently calls onig_free() on the created regexp. This makes fuzzing oniguruma harder, because a fuzzing process driven by a tool like libFuzzer constantly grows in memory usage due to the leaks.

Example code:

#include <stdio.h>
#include <oniguruma.h>

int main()
{
    regex_t *reg;
    char foo[4] = "\\((";  /* the three-byte pattern \(( plus the terminating NUL */
    if (onig_new
        (&reg, (const OnigUChar *)foo, (const OnigUChar *)(foo+3), ONIG_OPTION_DEFAULT, ONIG_ENCODING_UTF8,
         ONIG_SYNTAX_DEFAULT, 0) == 0) {
        printf("success, freeing regexp\n");
        onig_free(reg);
    }
    return 0;
}

If you run this with a memory leak detector (e.g. valgrind or newer versions of address sanitizer) it'll report leaked memory.

Here's the debugging output from ASan:

==11769==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x4c1678 in __interceptor_malloc (/mnt/ram/a.out+0x4c1678)
    #1 0x4fe15a in node_new /mnt/ram/oniguruma/src/regparse.c:1072:18
    #2 0x4fe15a in node_new_enclose /mnt/ram/oniguruma/src/regparse.c:1278
    #3 0x4fe15a in node_new_enclose_memory /mnt/ram/oniguruma/src/regparse.c:1301
    #4 0x4fe15a in parse_enclose /mnt/ram/oniguruma/src/regparse.c:4639
    #5 0x4fe15a in parse_exp /mnt/ram/oniguruma/src/regparse.c:4899
    #6 0x4fd869 in parse_branch /mnt/ram/oniguruma/src/regparse.c:5222:7
    #7 0x4fb683 in parse_subexp /mnt/ram/oniguruma/src/regparse.c:5259:7
    #8 0x4f5349 in parse_regexp /mnt/ram/oniguruma/src/regparse.c:5304:7
    #9 0x4f5349 in onig_parse_make_tree /mnt/ram/oniguruma/src/regparse.c:5335
    #10 0x50f8c9 in onig_compile /mnt/ram/oniguruma/src/regcomp.c:5279:7
    #11 0x52a083 in onig_new /mnt/ram/oniguruma/src/regcomp.c:5518:7

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x4c1678 in __interceptor_malloc (/mnt/ram/a.out+0x4c1678)
    #1 0x4fe15a in node_new /mnt/ram/oniguruma/src/regparse.c:1072:18
    #2 0x4fe15a in node_new_enclose /mnt/ram/oniguruma/src/regparse.c:1278
    #3 0x4fe15a in node_new_enclose_memory /mnt/ram/oniguruma/src/regparse.c:1301
    #4 0x4fe15a in parse_enclose /mnt/ram/oniguruma/src/regparse.c:4639
    #5 0x4fe15a in parse_exp /mnt/ram/oniguruma/src/regparse.c:4899
    #6 0x4fd869 in parse_branch /mnt/ram/oniguruma/src/regparse.c:5222:7
    #7 0x4fb683 in parse_subexp /mnt/ram/oniguruma/src/regparse.c:5259:7
    #8 0x4fe326 in parse_enclose /mnt/ram/oniguruma/src/regparse.c:4649:7
    #9 0x4fe326 in parse_exp /mnt/ram/oniguruma/src/regparse.c:4899
    #10 0x4fd869 in parse_branch /mnt/ram/oniguruma/src/regparse.c:5222:7
    #11 0x4fb683 in parse_subexp /mnt/ram/oniguruma/src/regparse.c:5259:7
    #12 0x4f5349 in parse_regexp /mnt/ram/oniguruma/src/regparse.c:5304:7
    #13 0x4f5349 in onig_parse_make_tree /mnt/ram/oniguruma/src/regparse.c:5335
    #14 0x50f8c9 in onig_compile /mnt/ram/oniguruma/src/regcomp.c:5279:7
    #15 0x52a083 in onig_new /mnt/ram/oniguruma/src/regcomp.c:5518:7

SUMMARY: AddressSanitizer: 112 byte(s) leaked in 2 allocation(s).
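Both leaked nodes in the trace above are allocated while parsing a group, and the pattern is missing its closing parenthesis, so the leak appears to sit on a parse error path. The usual remedy is to free the partially built node before returning the error and to clear the caller's pointer so it cannot be freed twice. The sketch below shows that pattern with a toy node type; `Node`, `node_new`, `node_free`, and `parse_group` are hypothetical stand-ins, not oniguruma's real functions.

```c
#include <stdlib.h>

/* Toy stand-in for a parse tree node. */
typedef struct Node {
  struct Node* child;
} Node;

static int live_nodes = 0;  /* allocation counter, for the demo only */

static Node*
node_new(void)
{
  Node* n = (Node* )calloc(1, sizeof(*n));
  if (n) live_nodes++;
  return n;
}

static void
node_free(Node* n)
{
  if (!n) return;
  node_free(n->child);
  free(n);
  live_nodes--;
}

/* Hypothetical parser step: allocate a group node, then parse its body.
   body_ok == 0 simulates a parse error such as a missing ')'.
   On any failure the group node is released here and *out stays NULL. */
static int
parse_group(Node** out, int body_ok)
{
  Node* group;

  *out = NULL;
  group = node_new();
  if (!group) return -1;

  if (!body_ok) {
    node_free(group);        /* error path: do not leak the group node */
    return -2;
  }

  group->child = node_new();
  if (!group->child) {
    node_free(group);
    return -1;
  }

  *out = group;              /* success: ownership moves to the caller */
  return 0;
}
```

After a failed call the counter is back at zero and the output pointer is NULL, which is the property a leak detector checks for on the real error path.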

use after free for regexp ()(?!(?'a')\1)

Example code:

#include <oniguruma.h>
int main() {
    regex_t *reg;
    const OnigUChar* inp = (const OnigUChar*)"()(?!(?'a')\\1)";
    if (onig_new
        (&reg, inp, inp+14, ONIG_OPTION_DEFAULT, ONIG_ENCODING_UTF8,
         ONIG_SYNTAX_DEFAULT, 0) == 0)
        onig_free(reg);
    return 0;
}

Compiling with asan will show a use after free error, see below. Latest develop branch, found with libfuzzer.

asan error:

==11794==ERROR: AddressSanitizer: heap-use-after-free on address 0x60600000efc4 at pc 0x00000052f459 bp 0x7ffd5bb5a110 sp 0x7ffd5bb5a108
READ of size 4 at 0x60600000efc4 thread T0
    #0 0x52f458 in setup_tree /mnt/ram/oniguruma-develop-afl/src/regcomp.c:3731:9
    #1 0x52ade9 in setup_tree /mnt/ram/oniguruma-develop-afl/src/regcomp.c:3682:13
    #2 0x52e0e1 in setup_tree /mnt/ram/oniguruma-develop-afl/src/regcomp.c:3863:13
    #3 0x52ade9 in setup_tree /mnt/ram/oniguruma-develop-afl/src/regcomp.c:3682:13
    #4 0x5224ad in onig_compile /mnt/ram/oniguruma-develop-afl/src/regcomp.c:5318:7
    #5 0x547d38 in onig_new /mnt/ram/oniguruma-develop-afl/src/regcomp.c:5522:7
    #6 0x4f55a0 in main /mnt/ram/oniguruma-develop-afl/uaf.c:5:9
    #7 0x7f21b6f336ff in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r2/work/glibc-2.23/csu/../csu/libc-start.c:289
    #8 0x419df8 in _start (/mnt/ram/oniguruma-develop-afl/a.out+0x419df8)

0x60600000efc4 is located 4 bytes inside of 56-byte region [0x60600000efc0,0x60600000eff8)
freed by thread T0 here:
    #0 0x4c1a10 in __interceptor_free (/mnt/ram/oniguruma-develop-afl/a.out+0x4c1a10)
    #1 0x4f7992 in onig_node_free /mnt/ram/oniguruma-develop-afl/src/regparse.c:1071:3
    #2 0x522165 in onig_compile /mnt/ram/oniguruma-develop-afl/src/regcomp.c:5292:11
    #3 0x547d38 in onig_new /mnt/ram/oniguruma-develop-afl/src/regcomp.c:5522:7
    #4 0x0  (<unknown module>)

previously allocated by thread T0 here:
    #0 0x4c1d18 in __interceptor_malloc (/mnt/ram/oniguruma-develop-afl/a.out+0x4c1d18)
    #1 0x507101 in node_new /mnt/ram/oniguruma-develop-afl/src/regparse.c:1079:18
    #2 0x507101 in node_new_enclose /mnt/ram/oniguruma-develop-afl/src/regparse.c:1288
    #3 0x507101 in node_new_enclose_memory /mnt/ram/oniguruma-develop-afl/src/regparse.c:1311
    #4 0x507101 in parse_enclose /mnt/ram/oniguruma-develop-afl/src/regparse.c:4652
    #5 0x507101 in parse_exp /mnt/ram/oniguruma-develop-afl/src/regparse.c:4915
    #6 0x50625f in parse_branch /mnt/ram/oniguruma-develop-afl/src/regparse.c:5241:7
    #7 0x5034a2 in parse_subexp /mnt/ram/oniguruma-develop-afl/src/regparse.c:5284:7
    #8 0x4fa1f8 in parse_regexp /mnt/ram/oniguruma-develop-afl/src/regparse.c:5331:7
    #9 0x4fa1f8 in onig_parse_make_tree /mnt/ram/oniguruma-develop-afl/src/regparse.c:5362
    #10 0x521fd2 in onig_compile /mnt/ram/oniguruma-develop-afl/src/regcomp.c:5283:7
    #11 0x547d38 in onig_new /mnt/ram/oniguruma-develop-afl/src/regcomp.c:5522:7
    #12 0x0  (<unknown module>)

SUMMARY: AddressSanitizer: heap-use-after-free /mnt/ram/oniguruma-develop-afl/src/regcomp.c:3731:9 in setup_tree
Shadow bytes around the buggy address:
  0x0c0c7fff9da0: fa fa fa fa 00 00 00 00 00 00 00 fa fa fa fa fa
  0x0c0c7fff9db0: 00 00 00 00 00 00 00 fa fa fa fa fa 00 00 00 00
  0x0c0c7fff9dc0: 00 00 00 fa fa fa fa fa 00 00 00 00 00 00 00 fa
  0x0c0c7fff9dd0: fa fa fa fa 00 00 00 00 00 00 00 fa fa fa fa fa
  0x0c0c7fff9de0: 00 00 00 00 00 00 00 fa fa fa fa fa 00 00 00 00
=>0x0c0c7fff9df0: 00 00 00 fa fa fa fa fa[fd]fd fd fd fd fd fd fa
  0x0c0c7fff9e00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==11794==ABORTING

Buffer Overflow in match_at()

The buffer overflow can be reproduced with the following code:

#include <stdio.h>
#include "oniguruma.h"

static int
search(regex_t* reg, unsigned char* str, unsigned char* end)
{
  int r;
  unsigned char *start, *range;
  OnigRegion *region;

  region = onig_region_new();

  start = str;
  range = end;
  r = onig_search(reg, str, end, start, range, region, ONIG_OPTION_NONE);
  if (r >= 0) {
    int i;

    fprintf(stderr, "match at %d  (%s)\n", r,
            ONIGENC_NAME(onig_get_encoding(reg)));
    for (i = 0; i < region->num_regs; i++) {
      fprintf(stderr, "%d: (%d-%d)\n", i, region->beg[i], region->end[i]);
    }
  }
  else if (r == ONIG_MISMATCH) {
    fprintf(stderr, "search fail (%s)\n",
            ONIGENC_NAME(onig_get_encoding(reg)));
  }
  else { /* error */
    char s[ONIG_MAX_ERROR_MESSAGE_LEN];
    onig_error_code_to_str((UChar* )s, r);
    fprintf(stderr, "ERROR: %s\n", s);
    fprintf(stderr, "  (%s)\n", ONIGENC_NAME(onig_get_encoding(reg)));
    onig_region_free(region, 1); /* release the region on the error path too */
    return -1;
  }

  onig_region_free(region, 1 /* 1:free self, 0:free contents only */);
  return 0;
}

static int
exec(OnigEncoding enc, OnigOptionType options,
     char* apattern, char* apattern_end, char* astr, char* aend)
{
  int r;
  regex_t* reg;
  OnigErrorInfo einfo;
  /* Oniguruma works on UChar (unsigned char) pointers */
  UChar* pattern     = (UChar* )apattern;
  UChar* pattern_end = (UChar* )apattern_end;
  UChar* str         = (UChar* )astr;
  UChar* end         = (UChar* )aend;

  onig_initialize(&enc, 1);

  r = onig_new(&reg, pattern, pattern_end,
               options, enc, ONIG_SYNTAX_DEFAULT, &einfo);
  if (r != ONIG_NORMAL) {
    char s[ONIG_MAX_ERROR_MESSAGE_LEN];
    onig_error_code_to_str((UChar* )s, r, &einfo);
    fprintf(stderr, "ERROR: %s\n", s);
    onig_end();
    return -1;
  }

  r = search(reg, str, end);

  onig_free(reg);
  onig_end();
  return r;
}

int main(void)
{
  /* Subject string: 7 bytes, searched as [str, str + 7). */
  unsigned char str[] = { 0xc7, 0xd6, 0xfe, 0xea, 0xe0, 0xe2, 0x00 };
  /* Pattern: 39 bytes, compiled as Shift_JIS with ONIG_OPTION_IGNORECASE. */
  unsigned char input[] = {
    0xf1, 0x5c, 0x69, 0x53, 0x53, 0x53, 0x53, 0x3c, 0x30, 0x53,
    0x59, 0x54, 0x52, 0x33, 0x7c, 0x2e, 0x5c, 0xe2, 0x48, 0x5c, 0x7a, 0x53, 0x00,
    0x06, 0x00, 0x00, 0x27, 0x19, 0x00, 0x54, 0x52, 0x54, 0x52, 0x33, 0x7c, 0x2e, 0x53, 0xe2, 0x48
  };

  exec(ONIG_ENCODING_SJIS, ONIG_OPTION_IGNORECASE,
       (char* )input, (char* )(input + 39), (char* )str, (char* )(str + 7));
  return 0;
}

The ASan error report is as follows:

==4064==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdd18d5447 at pc 0x00000044434c bp 0x7ffdd18d4910 sp 0x7ffdd18d4900
READ of size 1 at 0x7ffdd18d5447 thread T0
    #0 0x44434b in match_at /home/xie/Downloads/oni/oni-asan-dev/src/regexec.c:1481
    #1 0x457fdf in onig_search /home/xie/Downloads/oni/oni-asan-dev/src/regexec.c:3651
    #2 0x401148 in search /home/xie/Downloads/oni/oni-asan-dev/test/testc.c:18
    #3 0x401786 in exec /home/xie/Downloads/oni/oni-asan-dev/test/testc.c:65
    #4 0x401a02 in main /home/xie/Downloads/oni/oni-asan-dev/test/testc.c:77
    #5 0x7f9bd023582f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #6 0x400f68 in _start (/home/xie/Downloads/oni/oni-asan-dev/test/testc+0x400f68)

Address 0x7ffdd18d5447 is located in stack of thread T0 at offset 39 in frame
    #0 0x40186f in main /home/xie/Downloads/oni/oni-asan-dev/test/testc.c:71

  This frame has 2 object(s):
    [32, 39) 'str' <== Memory access at offset 39 overflows this variable
    [96, 135) 'input'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/xie/Downloads/oni/oni-asan-dev/src/regexec.c:1481 match_at
Shadow bytes around the buggy address:
  0x10003a312a30: 00 00 00 00 00 00 00 00 00 00 00 02 f3 f3 f3 f3
  0x10003a312a40: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x10003a312a50: 00 00 00 00 f1 f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2
  0x10003a312a60: 00 00 00 f4 f2 f2 f2 f2 00 00 00 00 00 00 00 00
  0x10003a312a70: 00 00 00 02 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00
=>0x10003a312a80: 00 00 00 00 f1 f1 f1 f1[07]f4 f4 f4 f2 f2 f2 f2
  0x10003a312a90: 00 00 00 00 07 f4 f4 f4 f3 f3 f3 f3 00 00 00 00
  0x10003a312aa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10003a312ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10003a312ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10003a312ad0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
==4064==ABORTING

Out of bounds read in onig_strcpy()

Passing a single byte with the value 0xD8 as the pattern causes an out-of-bounds read.
This was found with libFuzzer and AddressSanitizer, tested against the develop branch of Oniguruma.

Example code:

#include <oniguruma.h>
int main()
{
    regex_t *reg;
    unsigned char inp[1] = { 0xd8 };

    onig_new(&reg, inp, inp + 1, ONIG_OPTION_DEFAULT,
         ONIG_ENCODING_UTF8, ONIG_SYNTAX_DEFAULT, 0);
}

Address Sanitizer stack trace:

==15218==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffea0eeb01 at pc 0x0000004ab6d5 bp 0x7fffea0edd70 sp 0x7fffea0ed520
READ of size 2 at 0x7fffea0eeb01 thread T0
    #0 0x4ab6d4 in __asan_memcpy (/mnt/ram/a.out+0x4ab6d4)
    #1 0x4f55e5 in onig_strcpy /mnt/ram/oniguruma/src/regparse.c:231:5
    #2 0x4f55e5 in onig_node_str_cat /mnt/ram/oniguruma/src/regparse.c:1350
    #3 0x506ee4 in node_new_str /mnt/ram/oniguruma/src/regparse.c:1409:7
    #4 0x506ee4 in parse_exp /mnt/ram/oniguruma/src/regparse.c:4912
    #5 0x505891 in parse_branch /mnt/ram/oniguruma/src/regparse.c:5206:7
    #6 0x50189a in parse_subexp /mnt/ram/oniguruma/src/regparse.c:5243:7
    #7 0x4f8040 in parse_regexp /mnt/ram/oniguruma/src/regparse.c:5288:7
    #8 0x4f8040 in onig_parse_make_tree /mnt/ram/oniguruma/src/regparse.c:5315
    #9 0x528d41 in onig_compile /mnt/ram/oniguruma/src/regcomp.c:5279:7
    #10 0x551d0f in onig_new /mnt/ram/oniguruma/src/regcomp.c:5518:7
    #11 0x4f21dc in main /mnt/ram/oob.c:7:5
    #12 0x7fa84b0a978f in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r2/work/glibc-2.23/csu/../csu/libc-start.c:289
    #13 0x419708 in _start (/mnt/ram/a.out+0x419708)

Address 0x7fffea0eeb01 is located in stack of thread T0 at offset 65 in frame
    #0 0x4f205f in main /mnt/ram/oob.c:3

  This frame has 2 object(s):
    [32, 40) 'reg'
    [64, 65) 'inp' <== Memory access at offset 65 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (/mnt/ram/a.out+0x4ab6d4) in __asan_memcpy
Shadow bytes around the buggy address:
  0x10007d415d10: 00 00 00 f2 f2 f2 f2 f2 f2 f2 f2 f2 00 00 f3 f3
  0x10007d415d20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007d415d30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007d415d40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007d415d50: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2
=>0x10007d415d60:[01]f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007d415d70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007d415d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007d415d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007d415da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007d415db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==15218==ABORTING

ONIG_OPTION_FIND_LONGEST both interprets regex to be greedy and skips shorter matches, need to only be greedy but not skip

The current behavior of ONIG_OPTION_FIND_LONGEST has two effects:

  • regex is matched in a "greedy" way (the longest possible match is found at every location)
  • shorter matches are skipped by onig_match, so that it returns only the longest matches (multiples are allowed)

I need it to match greedily without skipping anything. Is this currently possible?
If not, I would like to suggest splitting ONIG_OPTION_FIND_LONGEST into two new options, ONIG_OPTION_GREEDY and ONIG_OPTION_ONLY_LONGEST, so that each behavior can be specified separately.

Request backport of the issue #35 fix to 5.9.6

I sincerely appreciate the effort that went into resolving issue #35. Could the fix be backported to 5.9.6? The project that requires Oniguruma is currently configured to use version 5.9.6, but I will let that project's maintainer know that 6.1.3 is out.

Cheers!

Autoconf setup needs help

  • configure.in should be named configure.ac.

  • The files in m4/ should not be committed (but an m4/.whatever placeholder file must be included, since otherwise m4/ would be empty and autoreconf -fi needs the directory to be present).

  • autoreconf -fi fails, but this patch helps:

    -AM_INIT_AUTOMAKE
    +AM_INIT_AUTOMAKE([-Wno-portability 1.14])

  • Makefile.am and sample/Makefile.am use INCLUDES instead of AM_CPPFLAGS.

  • There's some other automake error I haven't yet worked out.

/bin/sh: Syntax error: redirection unexpected (expecting word)

Oniguruma installs fine from the package, but compiling it from source, or installing Python modules that require it, fails.

$ uname -a
FreeBSD microbox 11.0-RELEASE-p1 FreeBSD 11.0-RELEASE-p1 #0 r306420: Thu Sep 29 01:43:23 UTC 2016     [email protected]:/usr/obj/usr/src/sys/GENERIC  amd64
$ gcc --version
gcc (FreeBSD Ports Collection) 4.8.5
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# from https://github.com/kkos/oniguruma/releases/download/v5.9.6/onig-5.9.6.tar.gz
$ ./configure CFLAGS=-fPIC
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... ./install-sh -c -d
checking for gawk... no
checking for mawk... no
checking for nawk... nawk
checking whether make sets $(MAKE)... yes
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking build system type... x86_64-unknown-freebsd11.0
checking host system type... x86_64-unknown-freebsd11.0
checking how to print strings... printf
checking for a sed that does not truncate output... /usr/bin/sed
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for fgrep... /usr/bin/grep -F
checking for ld used by gcc... /usr/local/bin/ld
checking if the linker (/usr/local/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/local/bin/nm -B
checking the name lister (/usr/local/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 196608
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... no
checking how to convert x86_64-unknown-freebsd11.0 file names to x86_64-unknown-freebsd11.0 format... func_convert_file_noop
checking how to convert x86_64-unknown-freebsd11.0 file names to toolchain format... func_convert_file_noop
checking for /usr/local/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @FILE support... no
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/local/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for mt... mt
checking if mt is a manifest tool... no
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/local/bin/ld) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... freebsd11.0 ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... no
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking whether make sets $(MAKE)... (cached) yes
checking for ANSI C header files... (cached) yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking for strings.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for unistd.h... (cached) yes
checking sys/times.h usability... yes
checking sys/times.h presence... yes
checking for sys/times.h... yes
checking size of int... 4
checking size of short... 2
checking size of long... 8
checking for an ANSI C-conforming const... yes
checking whether time.h and sys/time.h may both be included... yes
checking for size_t... yes
checking for working alloca.h... no
checking for alloca... yes
checking for working memcmp... yes
checking for prototypes... yes
checking for variable length prototypes and stdarg.h... yes
configure: creating ./config.status
config.status: creating Makefile
config.status: creating onig-config
config.status: creating sample/Makefile
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands
config.status: executing default commands
$ make
make  all-recursive
Making all in .
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT regerror.lo -MD -MP -MF .deps/regerror.Tpo -c -o regerror.lo regerror.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regerror.lo -MD -MP -MF .deps/regerror.Tpo -c regerror.c  -fPIC -DPIC -o .libs/regerror.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regerror.lo -MD -MP -MF .deps/regerror.Tpo -c regerror.c -o regerror.o >/dev/null 2>&1
mv -f .deps/regerror.Tpo .deps/regerror.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT regparse.lo -MD -MP -MF .deps/regparse.Tpo -c -o regparse.lo regparse.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regparse.lo -MD -MP -MF .deps/regparse.Tpo -c regparse.c  -fPIC -DPIC -o .libs/regparse.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regparse.lo -MD -MP -MF .deps/regparse.Tpo -c regparse.c -o regparse.o >/dev/null 2>&1
mv -f .deps/regparse.Tpo .deps/regparse.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT regext.lo -MD -MP -MF .deps/regext.Tpo -c -o regext.lo regext.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regext.lo -MD -MP -MF .deps/regext.Tpo -c regext.c  -fPIC -DPIC -o .libs/regext.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regext.lo -MD -MP -MF .deps/regext.Tpo -c regext.c -o regext.o >/dev/null 2>&1
mv -f .deps/regext.Tpo .deps/regext.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT regcomp.lo -MD -MP -MF .deps/regcomp.Tpo -c -o regcomp.lo regcomp.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regcomp.lo -MD -MP -MF .deps/regcomp.Tpo -c regcomp.c  -fPIC -DPIC -o .libs/regcomp.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regcomp.lo -MD -MP -MF .deps/regcomp.Tpo -c regcomp.c -o regcomp.o >/dev/null 2>&1
mv -f .deps/regcomp.Tpo .deps/regcomp.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT regexec.lo -MD -MP -MF .deps/regexec.Tpo -c -o regexec.lo regexec.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regexec.lo -MD -MP -MF .deps/regexec.Tpo -c regexec.c  -fPIC -DPIC -o .libs/regexec.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regexec.lo -MD -MP -MF .deps/regexec.Tpo -c regexec.c -o regexec.o >/dev/null 2>&1
mv -f .deps/regexec.Tpo .deps/regexec.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT reggnu.lo -MD -MP -MF .deps/reggnu.Tpo -c -o reggnu.lo reggnu.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT reggnu.lo -MD -MP -MF .deps/reggnu.Tpo -c reggnu.c  -fPIC -DPIC -o .libs/reggnu.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT reggnu.lo -MD -MP -MF .deps/reggnu.Tpo -c reggnu.c -o reggnu.o >/dev/null 2>&1
mv -f .deps/reggnu.Tpo .deps/reggnu.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT regenc.lo -MD -MP -MF .deps/regenc.Tpo -c -o regenc.lo regenc.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regenc.lo -MD -MP -MF .deps/regenc.Tpo -c regenc.c  -fPIC -DPIC -o .libs/regenc.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regenc.lo -MD -MP -MF .deps/regenc.Tpo -c regenc.c -o regenc.o >/dev/null 2>&1
mv -f .deps/regenc.Tpo .deps/regenc.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT regsyntax.lo -MD -MP -MF .deps/regsyntax.Tpo -c -o regsyntax.lo regsyntax.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regsyntax.lo -MD -MP -MF .deps/regsyntax.Tpo -c regsyntax.c  -fPIC -DPIC -o .libs/regsyntax.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regsyntax.lo -MD -MP -MF .deps/regsyntax.Tpo -c regsyntax.c -o regsyntax.o >/dev/null 2>&1
mv -f .deps/regsyntax.Tpo .deps/regsyntax.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT regtrav.lo -MD -MP -MF .deps/regtrav.Tpo -c -o regtrav.lo regtrav.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regtrav.lo -MD -MP -MF .deps/regtrav.Tpo -c regtrav.c  -fPIC -DPIC -o .libs/regtrav.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regtrav.lo -MD -MP -MF .deps/regtrav.Tpo -c regtrav.c -o regtrav.o >/dev/null 2>&1
mv -f .deps/regtrav.Tpo .deps/regtrav.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT regversion.lo -MD -MP -MF .deps/regversion.Tpo -c -o regversion.lo regversion.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regversion.lo -MD -MP -MF .deps/regversion.Tpo -c regversion.c  -fPIC -DPIC -o .libs/regversion.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regversion.lo -MD -MP -MF .deps/regversion.Tpo -c regversion.c -o regversion.o >/dev/null 2>&1
mv -f .deps/regversion.Tpo .deps/regversion.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT st.lo -MD -MP -MF .deps/st.Tpo -c -o st.lo st.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT st.lo -MD -MP -MF .deps/st.Tpo -c st.c  -fPIC -DPIC -o .libs/st.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT st.lo -MD -MP -MF .deps/st.Tpo -c st.c -o st.o >/dev/null 2>&1
mv -f .deps/st.Tpo .deps/st.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT regposix.lo -MD -MP -MF .deps/regposix.Tpo -c -o regposix.lo regposix.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regposix.lo -MD -MP -MF .deps/regposix.Tpo -c regposix.c  -fPIC -DPIC -o .libs/regposix.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regposix.lo -MD -MP -MF .deps/regposix.Tpo -c regposix.c -o regposix.o >/dev/null 2>&1
mv -f .deps/regposix.Tpo .deps/regposix.Plo
/bin/sh ./libtool --tag=CC    --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include     -fPIC -MT regposerr.lo -MD -MP -MF .deps/regposerr.Tpo -c -o regposerr.lo regposerr.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regposerr.lo -MD -MP -MF .deps/regposerr.Tpo -c regposerr.c  -fPIC -DPIC -o .libs/regposerr.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT regposerr.lo -MD -MP -MF .deps/regposerr.Tpo -c regposerr.c -o regposerr.o >/dev/null 2>&1
mv -f .deps/regposerr.Tpo .deps/regposerr.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT unicode.lo -MD -MP -MF .deps/unicode.Tpo -c -o unicode.lo `test -f './enc/unicode.c' || echo './'`./enc/unicode.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT unicode.lo -MD -MP -MF .deps/unicode.Tpo -c ./enc/unicode.c  -fPIC -DPIC -o .libs/unicode.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT unicode.lo -MD -MP -MF .deps/unicode.Tpo -c ./enc/unicode.c -o unicode.o >/dev/null 2>&1
mv -f .deps/unicode.Tpo .deps/unicode.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT ascii.lo -MD -MP -MF .deps/ascii.Tpo -c -o ascii.lo `test -f './enc/ascii.c' || echo './'`./enc/ascii.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT ascii.lo -MD -MP -MF .deps/ascii.Tpo -c ./enc/ascii.c  -fPIC -DPIC -o .libs/ascii.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT ascii.lo -MD -MP -MF .deps/ascii.Tpo -c ./enc/ascii.c -o ascii.o >/dev/null 2>&1
mv -f .deps/ascii.Tpo .deps/ascii.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT utf8.lo -MD -MP -MF .deps/utf8.Tpo -c -o utf8.lo `test -f './enc/utf8.c' || echo './'`./enc/utf8.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT utf8.lo -MD -MP -MF .deps/utf8.Tpo -c ./enc/utf8.c  -fPIC -DPIC -o .libs/utf8.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT utf8.lo -MD -MP -MF .deps/utf8.Tpo -c ./enc/utf8.c -o utf8.o >/dev/null 2>&1
mv -f .deps/utf8.Tpo .deps/utf8.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT utf16_be.lo -MD -MP -MF .deps/utf16_be.Tpo -c -o utf16_be.lo `test -f './enc/utf16_be.c' || echo './'`./enc/utf16_be.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT utf16_be.lo -MD -MP -MF .deps/utf16_be.Tpo -c ./enc/utf16_be.c  -fPIC -DPIC -o .libs/utf16_be.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT utf16_be.lo -MD -MP -MF .deps/utf16_be.Tpo -c ./enc/utf16_be.c -o utf16_be.o >/dev/null 2>&1
mv -f .deps/utf16_be.Tpo .deps/utf16_be.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT utf16_le.lo -MD -MP -MF .deps/utf16_le.Tpo -c -o utf16_le.lo `test -f './enc/utf16_le.c' || echo './'`./enc/utf16_le.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT utf16_le.lo -MD -MP -MF .deps/utf16_le.Tpo -c ./enc/utf16_le.c  -fPIC -DPIC -o .libs/utf16_le.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT utf16_le.lo -MD -MP -MF .deps/utf16_le.Tpo -c ./enc/utf16_le.c -o utf16_le.o >/dev/null 2>&1
mv -f .deps/utf16_le.Tpo .deps/utf16_le.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT utf32_be.lo -MD -MP -MF .deps/utf32_be.Tpo -c -o utf32_be.lo `test -f './enc/utf32_be.c' || echo './'`./enc/utf32_be.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT utf32_be.lo -MD -MP -MF .deps/utf32_be.Tpo -c ./enc/utf32_be.c  -fPIC -DPIC -o .libs/utf32_be.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT utf32_be.lo -MD -MP -MF .deps/utf32_be.Tpo -c ./enc/utf32_be.c -o utf32_be.o >/dev/null 2>&1
mv -f .deps/utf32_be.Tpo .deps/utf32_be.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT utf32_le.lo -MD -MP -MF .deps/utf32_le.Tpo -c -o utf32_le.lo `test -f './enc/utf32_le.c' || echo './'`./enc/utf32_le.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT utf32_le.lo -MD -MP -MF .deps/utf32_le.Tpo -c ./enc/utf32_le.c  -fPIC -DPIC -o .libs/utf32_le.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT utf32_le.lo -MD -MP -MF .deps/utf32_le.Tpo -c ./enc/utf32_le.c -o utf32_le.o >/dev/null 2>&1
mv -f .deps/utf32_le.Tpo .deps/utf32_le.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT euc_jp.lo -MD -MP -MF .deps/euc_jp.Tpo -c -o euc_jp.lo `test -f './enc/euc_jp.c' || echo './'`./enc/euc_jp.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT euc_jp.lo -MD -MP -MF .deps/euc_jp.Tpo -c ./enc/euc_jp.c  -fPIC -DPIC -o .libs/euc_jp.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT euc_jp.lo -MD -MP -MF .deps/euc_jp.Tpo -c ./enc/euc_jp.c -o euc_jp.o >/dev/null 2>&1
mv -f .deps/euc_jp.Tpo .deps/euc_jp.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT sjis.lo -MD -MP -MF .deps/sjis.Tpo -c -o sjis.lo `test -f './enc/sjis.c' || echo './'`./enc/sjis.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT sjis.lo -MD -MP -MF .deps/sjis.Tpo -c ./enc/sjis.c  -fPIC -DPIC -o .libs/sjis.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT sjis.lo -MD -MP -MF .deps/sjis.Tpo -c ./enc/sjis.c -o sjis.o >/dev/null 2>&1
mv -f .deps/sjis.Tpo .deps/sjis.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_1.lo -MD -MP -MF .deps/iso8859_1.Tpo -c -o iso8859_1.lo `test -f './enc/iso8859_1.c' || echo './'`./enc/iso8859_1.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_1.lo -MD -MP -MF .deps/iso8859_1.Tpo -c ./enc/iso8859_1.c  -fPIC -DPIC -o .libs/iso8859_1.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_1.lo -MD -MP -MF .deps/iso8859_1.Tpo -c ./enc/iso8859_1.c -o iso8859_1.o >/dev/null 2>&1
mv -f .deps/iso8859_1.Tpo .deps/iso8859_1.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_2.lo -MD -MP -MF .deps/iso8859_2.Tpo -c -o iso8859_2.lo `test -f './enc/iso8859_2.c' || echo './'`./enc/iso8859_2.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_2.lo -MD -MP -MF .deps/iso8859_2.Tpo -c ./enc/iso8859_2.c  -fPIC -DPIC -o .libs/iso8859_2.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_2.lo -MD -MP -MF .deps/iso8859_2.Tpo -c ./enc/iso8859_2.c -o iso8859_2.o >/dev/null 2>&1
mv -f .deps/iso8859_2.Tpo .deps/iso8859_2.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_3.lo -MD -MP -MF .deps/iso8859_3.Tpo -c -o iso8859_3.lo `test -f './enc/iso8859_3.c' || echo './'`./enc/iso8859_3.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_3.lo -MD -MP -MF .deps/iso8859_3.Tpo -c ./enc/iso8859_3.c  -fPIC -DPIC -o .libs/iso8859_3.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_3.lo -MD -MP -MF .deps/iso8859_3.Tpo -c ./enc/iso8859_3.c -o iso8859_3.o >/dev/null 2>&1
mv -f .deps/iso8859_3.Tpo .deps/iso8859_3.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_4.lo -MD -MP -MF .deps/iso8859_4.Tpo -c -o iso8859_4.lo `test -f './enc/iso8859_4.c' || echo './'`./enc/iso8859_4.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_4.lo -MD -MP -MF .deps/iso8859_4.Tpo -c ./enc/iso8859_4.c  -fPIC -DPIC -o .libs/iso8859_4.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_4.lo -MD -MP -MF .deps/iso8859_4.Tpo -c ./enc/iso8859_4.c -o iso8859_4.o >/dev/null 2>&1
mv -f .deps/iso8859_4.Tpo .deps/iso8859_4.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_5.lo -MD -MP -MF .deps/iso8859_5.Tpo -c -o iso8859_5.lo `test -f './enc/iso8859_5.c' || echo './'`./enc/iso8859_5.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_5.lo -MD -MP -MF .deps/iso8859_5.Tpo -c ./enc/iso8859_5.c  -fPIC -DPIC -o .libs/iso8859_5.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_5.lo -MD -MP -MF .deps/iso8859_5.Tpo -c ./enc/iso8859_5.c -o iso8859_5.o >/dev/null 2>&1
mv -f .deps/iso8859_5.Tpo .deps/iso8859_5.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_6.lo -MD -MP -MF .deps/iso8859_6.Tpo -c -o iso8859_6.lo `test -f './enc/iso8859_6.c' || echo './'`./enc/iso8859_6.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_6.lo -MD -MP -MF .deps/iso8859_6.Tpo -c ./enc/iso8859_6.c  -fPIC -DPIC -o .libs/iso8859_6.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_6.lo -MD -MP -MF .deps/iso8859_6.Tpo -c ./enc/iso8859_6.c -o iso8859_6.o >/dev/null 2>&1
mv -f .deps/iso8859_6.Tpo .deps/iso8859_6.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_7.lo -MD -MP -MF .deps/iso8859_7.Tpo -c -o iso8859_7.lo `test -f './enc/iso8859_7.c' || echo './'`./enc/iso8859_7.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_7.lo -MD -MP -MF .deps/iso8859_7.Tpo -c ./enc/iso8859_7.c  -fPIC -DPIC -o .libs/iso8859_7.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_7.lo -MD -MP -MF .deps/iso8859_7.Tpo -c ./enc/iso8859_7.c -o iso8859_7.o >/dev/null 2>&1
mv -f .deps/iso8859_7.Tpo .deps/iso8859_7.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_8.lo -MD -MP -MF .deps/iso8859_8.Tpo -c -o iso8859_8.lo `test -f './enc/iso8859_8.c' || echo './'`./enc/iso8859_8.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_8.lo -MD -MP -MF .deps/iso8859_8.Tpo -c ./enc/iso8859_8.c  -fPIC -DPIC -o .libs/iso8859_8.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_8.lo -MD -MP -MF .deps/iso8859_8.Tpo -c ./enc/iso8859_8.c -o iso8859_8.o >/dev/null 2>&1
mv -f .deps/iso8859_8.Tpo .deps/iso8859_8.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_9.lo -MD -MP -MF .deps/iso8859_9.Tpo -c -o iso8859_9.lo `test -f './enc/iso8859_9.c' || echo './'`./enc/iso8859_9.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_9.lo -MD -MP -MF .deps/iso8859_9.Tpo -c ./enc/iso8859_9.c  -fPIC -DPIC -o .libs/iso8859_9.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_9.lo -MD -MP -MF .deps/iso8859_9.Tpo -c ./enc/iso8859_9.c -o iso8859_9.o >/dev/null 2>&1
mv -f .deps/iso8859_9.Tpo .deps/iso8859_9.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_10.lo -MD -MP -MF .deps/iso8859_10.Tpo -c -o iso8859_10.lo `test -f './enc/iso8859_10.c' || echo './'`./enc/iso8859_10.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_10.lo -MD -MP -MF .deps/iso8859_10.Tpo -c ./enc/iso8859_10.c  -fPIC -DPIC -o .libs/iso8859_10.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_10.lo -MD -MP -MF .deps/iso8859_10.Tpo -c ./enc/iso8859_10.c -o iso8859_10.o >/dev/null 2>&1
mv -f .deps/iso8859_10.Tpo .deps/iso8859_10.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_11.lo -MD -MP -MF .deps/iso8859_11.Tpo -c -o iso8859_11.lo `test -f './enc/iso8859_11.c' || echo './'`./enc/iso8859_11.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_11.lo -MD -MP -MF .deps/iso8859_11.Tpo -c ./enc/iso8859_11.c  -fPIC -DPIC -o .libs/iso8859_11.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_11.lo -MD -MP -MF .deps/iso8859_11.Tpo -c ./enc/iso8859_11.c -o iso8859_11.o >/dev/null 2>&1
mv -f .deps/iso8859_11.Tpo .deps/iso8859_11.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_13.lo -MD -MP -MF .deps/iso8859_13.Tpo -c -o iso8859_13.lo `test -f './enc/iso8859_13.c' || echo './'`./enc/iso8859_13.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_13.lo -MD -MP -MF .deps/iso8859_13.Tpo -c ./enc/iso8859_13.c  -fPIC -DPIC -o .libs/iso8859_13.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_13.lo -MD -MP -MF .deps/iso8859_13.Tpo -c ./enc/iso8859_13.c -o iso8859_13.o >/dev/null 2>&1
mv -f .deps/iso8859_13.Tpo .deps/iso8859_13.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_14.lo -MD -MP -MF .deps/iso8859_14.Tpo -c -o iso8859_14.lo `test -f './enc/iso8859_14.c' || echo './'`./enc/iso8859_14.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_14.lo -MD -MP -MF .deps/iso8859_14.Tpo -c ./enc/iso8859_14.c  -fPIC -DPIC -o .libs/iso8859_14.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_14.lo -MD -MP -MF .deps/iso8859_14.Tpo -c ./enc/iso8859_14.c -o iso8859_14.o >/dev/null 2>&1
mv -f .deps/iso8859_14.Tpo .deps/iso8859_14.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_15.lo -MD -MP -MF .deps/iso8859_15.Tpo -c -o iso8859_15.lo `test -f './enc/iso8859_15.c' || echo './'`./enc/iso8859_15.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_15.lo -MD -MP -MF .deps/iso8859_15.Tpo -c ./enc/iso8859_15.c  -fPIC -DPIC -o .libs/iso8859_15.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_15.lo -MD -MP -MF .deps/iso8859_15.Tpo -c ./enc/iso8859_15.c -o iso8859_15.o >/dev/null 2>&1
mv -f .deps/iso8859_15.Tpo .deps/iso8859_15.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT iso8859_16.lo -MD -MP -MF .deps/iso8859_16.Tpo -c -o iso8859_16.lo `test -f './enc/iso8859_16.c' || echo './'`./enc/iso8859_16.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_16.lo -MD -MP -MF .deps/iso8859_16.Tpo -c ./enc/iso8859_16.c  -fPIC -DPIC -o .libs/iso8859_16.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT iso8859_16.lo -MD -MP -MF .deps/iso8859_16.Tpo -c ./enc/iso8859_16.c -o iso8859_16.o >/dev/null 2>&1
mv -f .deps/iso8859_16.Tpo .deps/iso8859_16.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT euc_tw.lo -MD -MP -MF .deps/euc_tw.Tpo -c -o euc_tw.lo `test -f './enc/euc_tw.c' || echo './'`./enc/euc_tw.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT euc_tw.lo -MD -MP -MF .deps/euc_tw.Tpo -c ./enc/euc_tw.c  -fPIC -DPIC -o .libs/euc_tw.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT euc_tw.lo -MD -MP -MF .deps/euc_tw.Tpo -c ./enc/euc_tw.c -o euc_tw.o >/dev/null 2>&1
mv -f .deps/euc_tw.Tpo .deps/euc_tw.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT euc_kr.lo -MD -MP -MF .deps/euc_kr.Tpo -c -o euc_kr.lo `test -f './enc/euc_kr.c' || echo './'`./enc/euc_kr.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT euc_kr.lo -MD -MP -MF .deps/euc_kr.Tpo -c ./enc/euc_kr.c  -fPIC -DPIC -o .libs/euc_kr.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT euc_kr.lo -MD -MP -MF .deps/euc_kr.Tpo -c ./enc/euc_kr.c -o euc_kr.o >/dev/null 2>&1
mv -f .deps/euc_kr.Tpo .deps/euc_kr.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT big5.lo -MD -MP -MF .deps/big5.Tpo -c -o big5.lo `test -f './enc/big5.c' || echo './'`./enc/big5.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT big5.lo -MD -MP -MF .deps/big5.Tpo -c ./enc/big5.c  -fPIC -DPIC -o .libs/big5.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT big5.lo -MD -MP -MF .deps/big5.Tpo -c ./enc/big5.c -o big5.o >/dev/null 2>&1
mv -f .deps/big5.Tpo .deps/big5.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT gb18030.lo -MD -MP -MF .deps/gb18030.Tpo -c -o gb18030.lo `test -f './enc/gb18030.c' || echo './'`./enc/gb18030.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT gb18030.lo -MD -MP -MF .deps/gb18030.Tpo -c ./enc/gb18030.c  -fPIC -DPIC -o .libs/gb18030.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT gb18030.lo -MD -MP -MF .deps/gb18030.Tpo -c ./enc/gb18030.c -o gb18030.o >/dev/null 2>&1
mv -f .deps/gb18030.Tpo .deps/gb18030.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT koi8_r.lo -MD -MP -MF .deps/koi8_r.Tpo -c -o koi8_r.lo `test -f './enc/koi8_r.c' || echo './'`./enc/koi8_r.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT koi8_r.lo -MD -MP -MF .deps/koi8_r.Tpo -c ./enc/koi8_r.c  -fPIC -DPIC -o .libs/koi8_r.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT koi8_r.lo -MD -MP -MF .deps/koi8_r.Tpo -c ./enc/koi8_r.c -o koi8_r.o >/dev/null 2>&1
mv -f .deps/koi8_r.Tpo .deps/koi8_r.Plo
/bin/sh ./libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include    -fPIC -MT cp1251.lo -MD -MP -MF .deps/cp1251.Tpo -c -o cp1251.lo `test -f './enc/cp1251.c' || echo './'`./enc/cp1251.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT cp1251.lo -MD -MP -MF .deps/cp1251.Tpo -c ./enc/cp1251.c  -fPIC -DPIC -o .libs/cp1251.o
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I. -I/usr/local/include -fPIC -MT cp1251.lo -MD -MP -MF .deps/cp1251.Tpo -c ./enc/cp1251.c -o cp1251.o >/dev/null 2>&1
mv -f .deps/cp1251.Tpo .deps/cp1251.Plo
/bin/sh ./libtool --tag=CC    --mode=link gcc  -fPIC  -version-info 2:0:0  -o libonig.la -rpath /usr/local/lib regerror.lo regparse.lo regext.lo regcomp.lo  regexec.lo reggnu.lo regenc.lo regsyntax.lo regtrav.lo  regversion.lo st.lo regposix.lo regposerr.lo unicode.lo  ascii.lo utf8.lo utf16_be.lo utf16_le.lo utf32_be.lo  utf32_le.lo euc_jp.lo sjis.lo iso8859_1.lo iso8859_2.lo  iso8859_3.lo iso8859_4.lo iso8859_5.lo iso8859_6.lo  iso8859_7.lo iso8859_8.lo iso8859_9.lo iso8859_10.lo  iso8859_11.lo iso8859_13.lo iso8859_14.lo iso8859_15.lo  iso8859_16.lo euc_tw.lo euc_kr.lo big5.lo gb18030.lo koi8_r.lo  cp1251.lo
libtool: link: gcc -shared  -fPIC -DPIC  .libs/regerror.o .libs/regparse.o .libs/regext.o .libs/regcomp.o .libs/regexec.o .libs/reggnu.o .libs/regenc.o .libs/regsyntax.o .libs/regtrav.o .libs/regversion.o .libs/st.o .libs/regposix.o .libs/regposerr.o .libs/unicode.o .libs/ascii.o .libs/utf8.o .libs/utf16_be.o .libs/utf16_le.o .libs/utf32_be.o .libs/utf32_le.o .libs/euc_jp.o .libs/sjis.o .libs/iso8859_1.o .libs/iso8859_2.o .libs/iso8859_3.o .libs/iso8859_4.o .libs/iso8859_5.o .libs/iso8859_6.o .libs/iso8859_7.o .libs/iso8859_8.o .libs/iso8859_9.o .libs/iso8859_10.o .libs/iso8859_11.o .libs/iso8859_13.o .libs/iso8859_14.o .libs/iso8859_15.o .libs/iso8859_16.o .libs/euc_tw.o .libs/euc_kr.o .libs/big5.o .libs/gb18030.o .libs/koi8_r.o .libs/cp1251.o      -Wl,-soname -Wl,libonig.so.2 -o .libs/libonig.so.2
libtool: link: (cd ".libs" && rm -f "libonig.so" && ln -s "libonig.so.2" "libonig.so")
libtool: link: (cd ".libs" && rm -f "libonig.so" && ln -s "libonig.so.2" "libonig.so")
libtool: link: ar cru .libs/libonig.a  regerror.o regparse.o regext.o regcomp.o regexec.o reggnu.o regenc.o regsyntax.o regtrav.o regversion.o st.o regposix.o regposerr.o unicode.o ascii.o utf8.o utf16_be.o utf16_le.o utf32_be.o utf32_le.o euc_jp.o sjis.o iso8859_1.o iso8859_2.o iso8859_3.o iso8859_4.o iso8859_5.o iso8859_6.o iso8859_7.o iso8859_8.o iso8859_9.o iso8859_10.o iso8859_11.o iso8859_13.o iso8859_14.o iso8859_15.o iso8859_16.o euc_tw.o euc_kr.o big5.o gb18030.o koi8_r.o cp1251.o
libtool: link: ranlib .libs/libonig.a
libtool: link: ( cd ".libs" && rm -f "libonig.la" && ln -s "../libonig.la" "libonig.la" )
sed                                           -e 's,[@]datadir[@],/usr/local/share,g'                  -e 's,[@]datarootdir[@],/usr/local/share,g'          -e 's,[@]PACKAGE_VERSION[@],5.9.6,g'  -e 's,[@]prefix[@],/usr/local,g'                    -e 's,[@]exec_prefix[@],/usr/local,g'          -e 's,[@]libdir[@],/usr/local/lib,g'                    -e 's,[@]includedir[@],/usr/local/include,g' <  > oniguruma.pc
/bin/sh: Syntax error: redirection unexpected (expecting word)
*** Error code 2

Stop.
make[2]: stopped in /usr/home/jamie/onig-5.9.6
*** Error code 1

Stop.
make[1]: stopped in /usr/home/jamie/onig-5.9.6
*** Error code 1

Stop.
make: stopped in /usr/home/jamie/onig-5.9.6

documentation on backreference nesting level

Hi,
first, thank you very much @kkos for this wonderful library! ❤️

As you may know, the PHP language uses Oniguruma (currently v5.9.6) as the regexp engine for its mbstring extension.
The PHP documentation on Oniguruma is unfortunately close to non-existent, so I'm currently trying to contribute a small reference chapter about its syntax and distinctive features.

However, I'm having a hard time understanding the "backreference with nesting level" feature.
What I currently understand is that they allow referencing the result of a subexpression up or down the subexpression call stack. Is that right?

For example considering this simplified version of example 2 from the docs:

(?<element>
    < (?<name> [a-z]+ ) >
    (?> [^<]+ | \g<element> )*
    </ \k<name+0> >
)

It's pretty clear to me that we ask the engine to refer to the result of the <name> subexpr at the current nesting level instead of referring to its last captured value.

But in the original version from the docs, you use \k<name+1>.
So you're asking for the result of <name> one level deeper than the current nesting level.
I don't understand why this works and why \k<name+0> doesn't.

Would you mind enlightening me on the subject?
That would help me greatly in documenting the feature!

Thanks again!

Segfault when matching long strings

Matching a string of length 160 causes a segfault in stack_double.

Minimal case to reproduce:

#include <string.h>
#include <oniguruma.h>

int main(int argc, char **argv) {
        const char *regstr = ".*";
        const char *inputstr =
                "aaaaaaaaaaaaaaaaaaaa"
                "aaaaaaaaaaaaaaaaaaaa"
                "aaaaaaaaaaaaaaaaaaaa"
                "aaaaaaaaaaaaaaaaaaaa"
                "aaaaaaaaaaaaaaaaaaaa"
                "aaaaaaaaaaaaaaaaaaaa"
                "aaaaaaaaaaaaaaaaaaaa"
                "aaaaaaaaaaaaaaaaaaaa"
                ;

        regex_t *reg;
        const UChar* start = (const UChar*)inputstr;
        const UChar* end = start + strlen(inputstr);
        int onigret = onig_new(&reg, (const UChar*)regstr, (const UChar*)regstr + strlen(regstr), ONIG_OPTION_CAPTURE_GROUP, ONIG_ENCODING_UTF8, ONIG_SYNTAX_PERL_NG, NULL);
        OnigRegion *region = onig_region_new();
        onig_search(reg, start, end, start, end, region, ONIG_OPTION_NONE);
        return 0;
}

The beginning of the line symbol '^' doesn't work in look-behind

This regex (?<=(^|[[:space:]]))xxx fails: invalid pattern in look-behind(-122).
This regex (?<=^)xxx doesn't fail, but '^' only matches the beginning of input, not a newline inside of the input (ONIG_OPTION_MULTILINE is used).

For some reason, ^xxx also only matches the beginning of input, not newlines (ONIG_OPTION_MULTILINE is used).

Multiple memory leaks with 6.1.1 (and 5.9.6)

Onig has several leaks when an invalid pattern is passed and the parser returns an error.

Here are 3 patterns we use in our tests:

"[ab+"
"[ab]+"
"(\d:"

Each one causes a different parse error, respectively:

-103
-104
-117

And each one results in leaks as reported by GCC 5.4 and ASAN/LSAN (on Ubuntu 16.04 with all patches applied).

Here are 2 examples. I traced the first example by breaking on the only instance of ONIGERR_PREMATURE_END_OF_CHAR_CLASS in regparse.c and stepping out from there.

The root problem seems to be in parse_exp() where some return paths cause the 'qn' node to leak. I do not know the exact paths as I was stepping out from the error point.

In the 1st case below, the allocation of 'qn' occurs in the 'repeat' goto on line 5165 (frame 3). Then in the TK_CC_OPEN case handler in parse_exp() when parse_char_class() fails, parse_exp() immediately returns at that point leaking a live 'qn' allocation.

These leaks should be fixed as they could cause issues with a long running app that allows users to enter random patterns of their own (through UI or through a config file).

For our tests too, these leaks cause test harness failures with ASAN leaks active, so we have to disable ASAN leaks for our test harness.

Direct leak of 56 byte(s) in 1 object(s) allocated from:
#0 0x7ffff6f02602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
#1 0x85c6ab in node_new /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:1072
#2 0x85cbb5 in node_new_quantifier /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:1252
#3 0x865c73 in parse_exp /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:5165
#4 0x865eb6 in parse_branch /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:5222
#5 0x866048 in parse_subexp /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:5259
#6 0x866204 in parse_regexp /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:5304
#7 0x866324 in onig_parse_make_tree /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:5335
#8 0x870d60 in onig_compile /home/src/core/modules/regex/vendor/onig/onig/src/regcomp.c:5279
#9 0x871448 in onig_new /home/src/core/modules/regex/vendor/onig/onig/src/regcomp.c:5518
...

Direct leak of 56 byte(s) in 1 object(s) allocated from:
#0 0x7ffff6f02602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
#1 0x85c6ab in node_new /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:1072
#2 0x85cc67 in node_new_enclose /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:1278
#3 0x85cd05 in node_new_enclose_memory /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:1301
#4 0x86491a in parse_enclose /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:4639
#5 0x8651ca in parse_exp /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:4899
#6 0x865eb6 in parse_branch /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:5222
#7 0x866048 in parse_subexp /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:5259
#8 0x866204 in parse_regexp /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:5304
#9 0x866324 in onig_parse_make_tree /home/src/core/modules/regex/vendor/onig/onig/src/regparse.c:5335
#10 0x870d60 in onig_compile /home/src/core/modules/regex/vendor/onig/onig/src/regcomp.c:5279
#11 0x871448 in onig_new /home/src/core/modules/regex/vendor/onig/onig/src/regcomp.c:5518
...
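The leak pattern described in this report (a node is allocated, a later step fails, and the function returns immediately) is commonly avoided in C by sending every failure through one cleanup label. The following is only a minimal sketch of that idiom; `Node`, `node_new`, `node_free`, `parse_child`, `parse_quantified`, and `live_nodes` are hypothetical names for illustration and are not Oniguruma's actual code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parse-tree node; not Oniguruma's Node type. */
typedef struct Node {
    struct Node *child;
} Node;

/* Live-allocation counter so a test harness can detect leaks. */
static int live_nodes = 0;

static Node *node_new(void)
{
    Node *n = calloc(1, sizeof *n);
    if (n != NULL) live_nodes++;
    return n;
}

static void node_free(Node *n)
{
    if (n == NULL) return;
    node_free(n->child);
    free(n);
    live_nodes--;
}

/* Stand-in for a sub-parser that may fail after its caller has already
   allocated a node. Returns 0 on success, a negative code on error. */
static int parse_child(int should_fail, Node **out)
{
    assert(out != NULL);
    *out = NULL;
    if (should_fail) return -103;   /* illustrative error code */
    *out = node_new();
    return (*out != NULL) ? 0 : -5; /* illustrative out-of-memory code */
}

/* Allocates a quantifier-like node first, then parses its child.
   Every failure goes through `fail`, so the first node is never leaked. */
static int parse_quantified(int should_fail, Node **out)
{
    int r;
    Node *qn;
    Node *child = NULL;

    assert(out != NULL);
    *out = NULL;
    qn = node_new();
    if (qn == NULL) return -5;

    r = parse_child(should_fail, &child);
    if (r != 0) goto fail;

    qn->child = child;
    *out = qn;
    return 0;

fail:
    node_free(qn);   /* releases qn on every error path */
    return r;
}
```

With this shape, a failing call such as `parse_quantified(1, &n)` returns the error code and leaves `live_nodes` at zero, which is the property a leak checker would verify.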

Thread-safety: initialization and tear-down should be automatic and thread-safe

For initialization (onig_init()) use pthread_once() (POSIX) or InitOnceExecuteOnce() (WIN32).
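As a rough illustration of the `pthread_once()` suggestion, here is a minimal sketch of a once-initializer. The names `tables_ready`, `init_calls`, `do_init`, and `ensure_initialized` are hypothetical and stand in for whatever global state the library builds; this is not the library's code.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical global state standing in for lazily built tables. */
static int tables_ready = 0;
static int init_calls = 0;

static pthread_once_t init_once = PTHREAD_ONCE_INIT;

/* Runs exactly once per process, however many threads race to it. */
static void do_init(void)
{
    init_calls++;
    tables_ready = 1;
}

/* Hypothetical guard that public entry points would call first, so an
   explicit initialization call becomes optional. Returns 0 on success. */
static int ensure_initialized(void)
{
    int r = pthread_once(&init_once, do_init);
    assert(r != 0 || tables_ready == 1);
    return r;
}
```

Calling `ensure_initialized()` any number of times, from any number of threads, runs `do_init()` a single time.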

For finalization (onig_end()) either do nothing (it's OK to leak some global state like Unicode tables) or use .fini section (ELF) or C++ destructors, or DllMain() (WIN32). This includes lazy initialization of the cclass table (which could have its own once-initializer), and the EUC/JIS Hiragana and Katakana property lists initialization (which could have its own once-initializer).

You can then leave onig_init() and onig_end() as empty stubs. For static-link archives it would help to have a --disable-thread option to not require -lpthread and to instead use a dumb once-initializer.

For USE_PARSE_TREE_NODE_RECYCLE use pthread_key_create() and pthread_setspecific() (POSIX) or thread-local storage and DllMain() (WIN32). Or just delete the USE_PARSE_TREE_NODE_RECYCLE code?
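The POSIX half of that suggestion could look roughly like the sketch below: one key created once, and a per-thread cache created lazily and released by the key's destructor at thread exit. `NodeCache`, `thread_cache`, and `cache_put` are hypothetical names; the real recycle list holds parse-tree nodes rather than a counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread cache standing in for a node free list. */
typedef struct {
    int recycled;   /* number of nodes currently held for reuse */
} NodeCache;

static pthread_key_t cache_key;
static pthread_once_t cache_key_once = PTHREAD_ONCE_INIT;
static int key_status = -1;   /* result of pthread_key_create() */

/* Called at thread exit for each thread that stored a non-NULL value. */
static void cache_destroy(void *p)
{
    free(p);
}

static void make_key(void)
{
    key_status = pthread_key_create(&cache_key, cache_destroy);
}

/* Returns the calling thread's cache, creating it on first use.
   Returns NULL if the key or the cache could not be created. */
static NodeCache *thread_cache(void)
{
    NodeCache *c;

    if (pthread_once(&cache_key_once, make_key) != 0 || key_status != 0)
        return NULL;

    c = pthread_getspecific(cache_key);
    if (c == NULL) {
        c = calloc(1, sizeof *c);
        if (c != NULL && pthread_setspecific(cache_key, c) != 0) {
            free(c);
            c = NULL;
        }
    }
    return c;
}

/* Records that one node was handed back to the calling thread's cache.
   Returns the new count, or -1 if no cache is available. */
static int cache_put(void)
{
    NodeCache *c = thread_cache();
    if (c == NULL) return -1;
    assert(c->recycled >= 0);
    return ++c->recycled;
}
```

Because each thread only ever touches its own `NodeCache`, no lock is needed around the recycle list itself.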

I think you can then drop all the THREAD_ATOMIC_START and THREAD_ATOMIC_END business and just declare that the remainder of thread safety is the application's job.

(I don't understand ONIG_STATE_DEC() though. Can you explain why it exists? reg->state doesn't look like a reference count...)

Out of bounds read in mbc_to_code()

For certain inputs to the regular expression parser via onig_new() an out of bounds read access will happen. This can be seen by compiling oniguruma with address sanitizer (-fsanitize=address). See code example below.

I found this bug while fuzzing PHP with american fuzzy lop, yet it seems the bug is not in PHP itself, but in its bundled oniguruma copy. Tested both with the git code and version 5.9.5.

#include <oniguruma.h>
#include <string.h>
int main()
{
    regex_t *reg;
    unsigned char *inp = (unsigned char *)"0000\xfb";

    onig_new(&reg, inp, inp + strlen((char *)inp), ONIG_OPTION_DEFAULT,
         ONIG_ENCODING_UTF8, ONIG_SYNTAX_DEFAULT, 0);
}

The Address Sanitizer stack trace:

==11030==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000044ad86 at pc 0x4433f3 bp 0x7fff0e435ba0 sp 0x7fff0e435b98
READ of size 1 at 0x00000044ad86 thread T0
    #0 0x4433f2 in mbc_to_code /mnt/ram/oniguruma/src/utf8.c:105
    #1 0x41063c in fetch_token /mnt/ram/oniguruma/src/regparse.c:3039
    #2 0x41d161 in parse_exp /mnt/ram/oniguruma/src/regparse.c:4905
    #3 0x41ebd7 in parse_branch /mnt/ram/oniguruma/src/regparse.c:5195
    #4 0x41efc6 in parse_subexp /mnt/ram/oniguruma/src/regparse.c:5232
    #5 0x41f375 in parse_regexp /mnt/ram/oniguruma/src/regparse.c:5277
    #6 0x41f7c1 in onig_parse_make_tree /mnt/ram/oniguruma/src/regparse.c:5304
    #7 0x43c593 in onig_compile /mnt/ram/oniguruma/src/regcomp.c:5263
    #8 0x43d54a in onig_new /mnt/ram/oniguruma/src/regcomp.c:5500
    #9 0x40113f in main /tmp/oniguruma-heap-oob-mbc_to_code.c:8
    #10 0x7f7379eea78f in __libc_start_main (/lib64/libc.so.6+0x2078f)
    #11 0x400f08 in _start (/tmp/a.out+0x400f08)

0x00000044ad86 is located 0 bytes to the right of global variable '*.LC1' from 'oniguruma-heap-oob-mbc_to_code.c' (0x44ad80) of size 6
SUMMARY: AddressSanitizer: global-buffer-overflow /mnt/ram/oniguruma/src/utf8.c:105 mbc_to_code
Shadow bytes around the buggy address:
  0x000080081560: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080081570: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080081580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080081590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800815a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0000800815b0:[06]f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x0000800815c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800815d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800815e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800815f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x000080081600: 00 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Contiguous container OOB:fc
  ASan internal:           fe
==11030==ABORTING
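The over-read above happens because the lead byte 0xfb announces a multi-byte character while only one byte of pattern remains. A bounds check of the following kind would catch that before decoding. This is a sketch only: it assumes the historical one-to-six-byte UTF-8 layout, and `utf8_claimed_len` and `utf8_char_fits` are hypothetical helpers, not functions from the library.

```c
#include <assert.h>
#include <stddef.h>

/* Length implied by a lead byte under the historical 1..6 byte UTF-8
   layout (0xF8-0xFB claim 5 bytes, 0xFC-0xFD claim 6). Continuation
   bytes and 0xFE/0xFF are treated as length 1 here. */
static int utf8_claimed_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if (lead < 0xC0) return 1;   /* stray continuation byte */
    if (lead < 0xE0) return 2;
    if (lead < 0xF0) return 3;
    if (lead < 0xF8) return 4;
    if (lead < 0xFC) return 5;
    if (lead < 0xFE) return 6;
    return 1;
}

/* Returns 1 if the character starting at p lies entirely inside [p, end),
   0 if it would run past end (or if p is already at end). */
static int utf8_char_fits(const unsigned char *p, const unsigned char *end)
{
    assert(p != NULL && end != NULL && p <= end);
    if (p == end) return 0;
    return (size_t)(end - p) >= (size_t)utf8_claimed_len(*p);
}
```

For the five-byte pattern in the report, `utf8_char_fits` accepts each `'0'` but rejects the final 0xfb, since it claims 5 bytes with only 1 available.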

out of bounds heap read in add_bytes / regcomp.c

Passing a sequence of 19 bytes followed by 0xfd causes an out of bounds heap read. Tested against latest develop branch, found with libfuzzer+asan.

Test code:

#include <oniguruma.h>
int main()
{
    regex_t *reg;
    unsigned char inp[20] = {
'0','0','0','0','0','0','0','0',
'0','0','0','0','0','0','0','0',
'0','0','0',0xfd };

    onig_new(&reg, inp, inp + 20, ONIG_OPTION_DEFAULT,
         ONIG_ENCODING_UTF8, ONIG_SYNTAX_DEFAULT, 0);
}

Asan error:

==18957==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60600000eff8 at pc 0x0000004ab6d5 bp 0x7ffd83e081e0 sp 0x7ffd83e07990
READ of size 6 at 0x60600000eff8 thread T0
    #0 0x4ab6d4 in __asan_memcpy (/mnt/ram/oniguruma/a.out+0x4ab6d4)
    #1 0x582be1 in add_bytes /mnt/ram/oniguruma/src/regcomp.c:284:3
    #2 0x58263c in add_compile_string /mnt/ram/oniguruma/src/regcomp.c:452:3
    #3 0x5757ad in compile_string_node /mnt/ram/oniguruma/src/regcomp.c:541:10
    #4 0x54b3ca in compile_tree /mnt/ram/oniguruma/src/regcomp.c:1628:11
    #5 0x53f779 in onig_compile /mnt/ram/oniguruma/src/regcomp.c:5369:7
    #6 0x54e9c2 in onig_new /mnt/ram/oniguruma/src/regcomp.c:5518:7
    #7 0x4f21b9 in main (/mnt/ram/oniguruma/a.out+0x4f21b9)
    #8 0x7f346c7d378f in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r2/work/glibc-2.23/csu/../csu/libc-start.c:289
    #9 0x419708 in _start (/mnt/ram/oniguruma/a.out+0x419708)

0x60600000eff8 is located 0 bytes to the right of 56-byte region [0x60600000efc0,0x60600000eff8)
allocated by thread T0 here:
    #0 0x4c1628 in __interceptor_malloc (/mnt/ram/oniguruma/a.out+0x4c1628)
    #1 0x4f6afa in node_new /mnt/ram/oniguruma/src/regparse.c:1088:18
    #2 0x4f82ee in node_new_str /mnt/ram/oniguruma/src/regparse.c:1416:16
    #3 0x511e42 in parse_exp /mnt/ram/oniguruma/src/regparse.c:4927:13
    #4 0x5106fb in parse_branch /mnt/ram/oniguruma/src/regparse.c:5221:7
    #5 0x5072bd in parse_subexp /mnt/ram/oniguruma/src/regparse.c:5258:7
    #6 0x4faebf in parse_regexp /mnt/ram/oniguruma/src/regparse.c:5303:7
    #7 0x4fa704 in onig_parse_make_tree /mnt/ram/oniguruma/src/regparse.c:5339:7
    #8 0x53e4ef in onig_compile /mnt/ram/oniguruma/src/regcomp.c:5279:7
    #9 0x54e9c2 in onig_new /mnt/ram/oniguruma/src/regcomp.c:5518:7
    #10 0x4f21b9 in main (/mnt/ram/oniguruma/a.out+0x4f21b9)
    #11 0x7f346c7d378f in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.23-r2/work/glibc-2.23/csu/../csu/libc-start.c:289

SUMMARY: AddressSanitizer: heap-buffer-overflow (/mnt/ram/oniguruma/a.out+0x4ab6d4) in __asan_memcpy
Shadow bytes around the buggy address:
  0x0c0c7fff9da0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9db0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9dc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9dd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9de0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c0c7fff9df0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00[fa]
  0x0c0c7fff9e00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0c7fff9e40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==18957==ABORTING
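Until a fix is available, a caller could screen patterns for a truncated trailing character before handing them to the compiler. The sketch below walks a buffer and reports the first character whose announced length runs past the end. It counts leading 1 bits in the lead byte and assumes the historical six-byte UTF-8 layout; `lead_len` and `first_truncated` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Sequence length announced by a lead byte: the number of leading 1 bits,
   with ASCII, stray continuation bytes, and 0xFE/0xFF treated as length 1. */
static size_t lead_len(unsigned char b)
{
    size_t ones = 0;
    while (ones < 8 && (b & (0x80u >> ones)) != 0) ones++;
    if (ones < 2 || ones > 6) return 1;
    return ones;
}

/* Returns the offset of the first character whose announced length runs
   past the end of the buffer, or len if every character fits. */
static size_t first_truncated(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    assert(buf != NULL || len == 0);
    while (i < len) {
        size_t n = lead_len(buf[i]);
        if (n > len - i) return i;   /* only len - i bytes remain */
        i += n;
    }
    return len;
}
```

For the 20-byte input in the test code, the byte at offset 19 announces 6 bytes while 1 remains, so `first_truncated` returns 19 and the caller can reject the pattern.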

Questions

Hi @kkos, I want to create a wiki page for this, just to make sure I get it right. :)

  • Outside character classes, characters that must be escaped to be used literally are:

    ^$.?*+()[\|

  • In character classes, characters that must be escaped to be used literally are:

    ^-]\ (please correct me if I get it wrong)

    ^ and - can get away with this by "clever placement."

What really confuses me is [ and ] in character classes (though I think they both should be escaped for clarity):

  • In Ruby 2.3, [ must be escaped in classes and unescaped ] raises warnings but is allowed.
  • But in some other implementations (Atom) that use Oniguruma, [ can be used unescaped in classes but ] can't. 😂
  • In POSIX, ] can be used literally by placing it at the start of a class. In Oniguruma, it can't, right?
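For the first bullet (characters outside character classes), a helper that escapes a literal string might look like the sketch below. The metacharacter set is copied verbatim from the list in this question, so it is only as complete as that list; `META_OUTSIDE_CLASS` and `escape_literal` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Metacharacters outside character classes, copied from the list in the
   question; whether the list is complete is part of what is being asked. */
static const char META_OUTSIDE_CLASS[] = "^$.?*+()[\\|";

/* Writes src to dst with a backslash before each listed metacharacter.
   Returns the escaped length (excluding the terminating NUL), or -1 if
   dst is too small. */
static long escape_literal(const char *src, char *dst, size_t dst_size)
{
    size_t o = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        int meta = strchr(META_OUTSIDE_CLASS, *src) != NULL;
        /* Need room for this character (1 or 2 bytes) plus the final NUL. */
        if (o + (meta ? 2u : 1u) >= dst_size) return -1;
        if (meta) dst[o++] = '\\';
        dst[o++] = *src;
    }
    if (o >= dst_size) return -1;
    dst[o] = '\0';
    return (long)o;
}
```

For example, escaping `a.b` yields `a\.b`. Inside a character class a different, smaller set applies, so the same helper should not be reused there.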

Make error; SyntaxError: Unexpected token {

While trying to make Moonbase (see https://github.com/motif/moonbase), I stumbled upon a syntax error in Oniguruma.

Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.2 LTS
Release:	16.04
Codename:	xenial

nodejs --version
v4.2.6

gulp --version
[22:14:20] CLI version 1.3.0

But I got the following error: SyntaxError: Unexpected token {

   __  ___               __
  /  |/  /__  ___  ___  / /  ___ ____ ___
 / /|_/ / _ \/ _ \/ _ \/ _ \/ _ `(_-</ -_)
/_/  /_/\___/\___/_//_/_.__/\_,_/___/\__/

[22:08:15] Running watch for /home/tom/landing
/home/tom/landing/node_modules/oniguruma/src/oniguruma.js:1
(function (exports, require, module, __filename, __dirname) { const {OnigScanner, OnigString} = require('../build/Release/onig_scanner.node')
                                                                    ^

SyntaxError: Unexpected token {
    at exports.runInThisContext (vm.js:53:16)
    at Module._compile (module.js:374:25)
    at Object.Module._extensions..js (module.js:417:10)
    at Module.load (/home/tom/landing/node_modules/coffee-script/lib/coffee-script/register.js:45:36)
    at Function.Module._load (module.js:301:12)
    at Module.require (module.js:354:17)
    at require (internal/module.js:12:17)
    at Object.<anonymous> (/home/tom/landing/node_modules/first-mate/lib/grammar.js:10:16)
    at Object.<anonymous> (/home/tom/landing/node_modules/first-mate/lib/grammar.js:380:4)
    at Module._compile (module.js:410:26)
    at Object.Module._extensions..js (module.js:417:10)
    at Module.load (/home/tom/landing/node_modules/coffee-script/lib/coffee-script/register.js:45:36)
    at Function.Module._load (module.js:301:12)
    at Module.require (module.js:354:17)
    at require (internal/module.js:12:17)
    at Object.<anonymous> (/home/tom/landing/node_modules/first-mate/lib/grammar-registry.js:12:13)
    at Object.<anonymous> (/home/tom/landing/node_modules/first-mate/lib/grammar-registry.js:268:4)
    at Module._compile (module.js:410:26)
    at Object.Module._extensions..js (module.js:417:10)
    at Module.load (/home/tom/landing/node_modules/coffee-script/lib/coffee-script/register.js:45:36)
    at Function.Module._load (module.js:301:12)
    at Module.require (module.js:354:17)
    at require (internal/module.js:12:17)
    at Object.<anonymous> (/home/tom/landing/node_modules/first-mate/lib/first-mate.js:4:22)
    at Object.<anonymous> (/home/tom/landing/node_modules/first-mate/lib/first-mate.js:8:4)
    at Module._compile (module.js:410:26)
    at Object.Module._extensions..js (module.js:417:10)
    at Module.load (/home/tom/landing/node_modules/coffee-script/lib/coffee-script/register.js:45:36)
    at Function.Module._load (module.js:301:12)
    at Module.require (module.js:354:17)
    at require (internal/module.js:12:17)
    at Object.<anonymous> (/home/tom/landing/node_modules/highlights/lib/highlights.js:14:21)
    at Object.<anonymous> (/home/tom/landing/node_modules/highlights/lib/highlights.js:420:4)
    at Module._compile (module.js:410:26)
    at Object.Module._extensions..js (module.js:417:10)
    at Module.load (/home/tom/landing/node_modules/coffee-script/lib/coffee-script/register.js:45:36)
    at Function.Module._load (module.js:301:12)
    at Module.require (module.js:354:17)
    at require (internal/module.js:12:17)
    at Object.<anonymous> (/home/tom/landing/node_modules/moonbase/gulpfile.coffee:43:14)
    at Object.<anonymous> (/home/tom/landing/node_modules/moonbase/gulpfile.coffee:1:1)
    at Module._compile (module.js:410:26)
    at Object.loadFile (/home/tom/landing/node_modules/coffee-script/lib/coffee-script/register.js:16:19)
    at Module.load (/home/tom/landing/node_modules/coffee-script/lib/coffee-script/register.js:45:36)
    at Function.Module._load (module.js:301:12)
    at Module.require (module.js:354:17)
    at require (internal/module.js:12:17)
    at Object.<anonymous> (/home/tom/landing/node_modules/moonbase/run.coffee:46:12)
    at Object.<anonymous> (/home/tom/landing/node_modules/moonbase/run.coffee:3:1)
    at Module._compile (module.js:410:26)
    at Object.loadFile (/home/tom/landing/node_modules/coffee-script/lib/coffee-script/register.js:16:19)
    at Module.load (/home/tom/landing/node_modules/coffee-script/lib/coffee-script/register.js:45:36)
    at Function.Module._load (module.js:301:12)
    at Module.require (module.js:354:17)
    at require (internal/module.js:12:17)
    at Object.<anonymous> (/home/tom/landing/node_modules/moonbase/run.js:4:11)
    at Module._compile (module.js:410:26)
    at Object.Module._extensions..js (module.js:417:10)
    at Module.load (module.js:344:32)
    at Function.Module._load (module.js:301:12)
    at Function.Module.runMain (module.js:442:10)
    at startup (node.js:136:18)
    at node.js:966:3

Makefile:20: recipe for target 'watch' failed
make: *** [watch] Error 1

Do you think you can help me out? Would love to experiment with your project, thanks in advance!

Matching on streams

Hi,

This is a cross-post of k-takata/Onigmo#83; I'm not sure where it would make the most sense to implement this feature, or whether it would be feasible at all.

How hard would it be to extend this library to support matching on arbitrary streams, instead of strings? I'm looking for an equivalent of TRE's tre_reguexec.

The basic idea is that the caller can pass the equivalent of an iterator over something that may not be a string. This would make it possible to match over gap buffers, ropes, piece tables, and other string implementations that do not rely on a contiguous char array.
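To make the request concrete, here is a minimal sketch of what a pull-style character source could look like, loosely following the idea behind tre_reguexec. Everything in it (char_source, gap_buffer, stream_find) is hypothetical and is not part of Oniguruma, Onigmo, or TRE. The "matcher" is only a naive literal search standing in for a regex engine; the point is that it reads input solely through the callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pull-style source: the matcher never sees a contiguous buffer. */
typedef struct {
    int  (*next)(void *ctx);               /* next byte, or -1 at end of input */
    void (*rewind)(void *ctx, size_t pos); /* reposition to a logical offset   */
    void *ctx;
} char_source;

/* Toy gap buffer: the text is buf[0..gap_start) followed by buf[gap_end..cap). */
typedef struct {
    const char *buf;
    size_t gap_start, gap_end, cap;
    size_t pos;                            /* logical read position (gap skipped) */
} gap_buffer;

static int gap_next(void *ctx)
{
    gap_buffer *g = ctx;
    size_t gap = g->gap_end - g->gap_start;
    size_t len = g->cap - gap;
    if (g->pos >= len) return -1;
    /* Translate the logical position into a physical index past the gap. */
    size_t phys = g->pos < g->gap_start ? g->pos : g->pos + gap;
    g->pos++;
    return (unsigned char)g->buf[phys];
}

static void gap_rewind(void *ctx, size_t pos)
{
    ((gap_buffer *)ctx)->pos = pos;
}

/* Naive literal search over a source; returns the logical offset or -1. */
static long stream_find(char_source *src, const char *needle)
{
    size_t n = strlen(needle);
    for (size_t start = 0; ; start++) {
        src->rewind(src->ctx, start);
        size_t i = 0;
        for (; i < n; i++) {
            int c = src->next(src->ctx);
            if (c < 0) return -1;          /* input exhausted: no match possible */
            if (c != (unsigned char)needle[i]) break;
        }
        if (i == n) return (long)start;
    }
}
```

A backtracking engine would need the rewind callback (or an equivalent way to re-read earlier input), which is probably the main cost of supporting sources like this.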

Thanks!

Use after free in match_at()

The test code:

#include <stdio.h>
#include "oniguruma.h"
static int
scan_callback(int n, int r, OnigRegion* region, void* arg)
{
    return 0;
}

static int
scan(regex_t* reg, unsigned char* str, unsigned char* end)
{
  int r;
  OnigRegion *region;

  region = onig_region_new();

  r = onig_scan(reg, str, end, region, ONIG_OPTION_NONE, scan_callback, NULL);
  if (r >= 0) {
    fprintf(stdout, "total: %d match\n", r);
  }
  else { /* error */
    char s[ONIG_MAX_ERROR_MESSAGE_LEN];
    onig_error_code_to_str((OnigUChar* )s, r);
    fprintf(stderr, "ERROR: %s\n", s);
    onig_region_free(region, 1); /* release the region on the error path too */
    return -1;
  }

  onig_region_free(region, 1 /* 1:free self, 0:free contents only */);
  return 0;
}

static int
exec(OnigEncoding enc, OnigOptionType options,
     char* apattern, char* astr, int pattern_len,
     unsigned char* end, OnigSyntaxType* syntax)
{
    int r;
    regex_t* reg;
    OnigErrorInfo einfo;
    UChar* pattern = (UChar* )apattern;
    UChar* str     = (UChar* )astr;

    onig_initialize(&enc, 1);

    r = onig_new(&reg, pattern,
                 pattern + pattern_len,
                 options, enc, syntax, &einfo);
    if (r != ONIG_NORMAL) {
        char s[ONIG_MAX_ERROR_MESSAGE_LEN];
        onig_error_code_to_str((OnigUChar* )s, r, &einfo);
        fprintf(stderr, "ERROR: %s\n", s);
        return -1;
    }

    r = scan(reg, str, end);
    onig_free(reg);
    onig_end();
    return r; /* propagate the result of scan() */
}

extern int main(int argc, char* argv[])
{
    int r;
    /* Input bytes; the pattern below is compiled with ONIG_ENCODING_EUC_JP. */
    static unsigned char str[] = { 0xc7, 0xd6, 0xfe, 0xea, 0xe0, 0xe2, 0x00 };
    /* 39 bytes, including an embedded NUL, so the length is passed explicitly. */
    char* pattern = "\x28\x7c\x28\x00\x28\x3f\x3a\x5c\x67\x27\x31\x27\x29\x2a\x7c\x7c\x28\x29\x29\x2a\x7c\x28\x28\x28\x28\x29\x29\x29\x5c\x6b\x27\x31\x2d\x30\x27\x29\x29\x30\x7c";
    r = exec(ONIG_ENCODING_EUC_JP, ONIG_OPTION_NONE, pattern, (char* )str, 39, str + 7, ONIG_SYNTAX_DEFAULT);
    return r;
}

ASan error:

=================================================================
==10708==ERROR: AddressSanitizer: heap-use-after-free on address 0x62700001b9a0 at pc 0x00000044c760 bp 0x7ffcb93f6de0 sp 0x7ffcb93f6dd0
READ of size 8 at 0x62700001b9a0 thread T0
    #0 0x44c75f in match_at /home/xie/Downloads/oniguruma-develop/src/regexec.c:2437
    #1 0x457cb3 in onig_search /home/xie/Downloads/oniguruma-develop/src/regexec.c:3655
    #2 0x458905 in onig_scan /home/xie/Downloads/oniguruma-develop/src/regexec.c:3790
    #3 0x40117d in scan /home/xie/Downloads/oniguruma-develop/test/testu.c:17
    #4 0x4014e3 in exec /home/xie/Downloads/oniguruma-develop/test/testu.c:55
    #5 0x40162a in main /home/xie/Downloads/oniguruma-develop/test/testu.c:67
    #6 0x7f6af4b4082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #7 0x400fa8 in _start (/home/xie/Downloads/oniguruma-develop/test/testcu+0x400fa8)

0x62700001b9a0 is located 12448 bytes inside of 12912-byte region [0x627000018900,0x62700001bb70)
freed by thread T0 here:
    #0 0x7f6af4f81961 in realloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98961)
    #1 0x4412a6 in stack_double /home/xie/Downloads/oniguruma-develop/src/regexec.c:494
    #2 0x44958d in match_at /home/xie/Downloads/oniguruma-develop/src/regexec.c:2148
    #3 0x457cb3 in onig_search /home/xie/Downloads/oniguruma-develop/src/regexec.c:3655
    #4 0x458905 in onig_scan /home/xie/Downloads/oniguruma-develop/src/regexec.c:3790
    #5 0x40117d in scan /home/xie/Downloads/oniguruma-develop/test/testu.c:17
    #6 0x4014e3 in exec /home/xie/Downloads/oniguruma-develop/test/testu.c:55
    #7 0x40162a in main /home/xie/Downloads/oniguruma-develop/test/testu.c:67
    #8 0x7f6af4b4082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

previously allocated by thread T0 here:
    #0 0x7f6af4f81602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x4410fa in stack_double /home/xie/Downloads/oniguruma-develop/src/regexec.c:480
    #2 0x44c8d2 in match_at /home/xie/Downloads/oniguruma-develop/src/regexec.c:2450
    #3 0x457cb3 in onig_search /home/xie/Downloads/oniguruma-develop/src/regexec.c:3655
    #4 0x458905 in onig_scan /home/xie/Downloads/oniguruma-develop/src/regexec.c:3790
    #5 0x40117d in scan /home/xie/Downloads/oniguruma-develop/test/testu.c:17
    #6 0x4014e3 in exec /home/xie/Downloads/oniguruma-develop/test/testu.c:55
    #7 0x40162a in main /home/xie/Downloads/oniguruma-develop/test/testu.c:67
    #8 0x7f6af4b4082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

SUMMARY: AddressSanitizer: heap-use-after-free /home/xie/Downloads/oniguruma-develop/src/regexec.c:2437 match_at
Shadow bytes around the buggy address:
  0x0c4e7fffb6e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c4e7fffb6f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c4e7fffb700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c4e7fffb710: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c4e7fffb720: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c4e7fffb730: fd fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd
  0x0c4e7fffb740: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c4e7fffb750: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c4e7fffb760: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa fa
  0x0c4e7fffb770: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4e7fffb780: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
==10708==ABORTING
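
Reading the report: the block was freed by realloc inside stack_double, and the faulting read happens later in match_at. That is the usual signature of a pointer into a growable buffer being kept across a reallocation. The sketch below only illustrates that general class of bug and one common remedy (remembering an index and re-deriving the address after any growth). The grow_stack type and its functions are hypothetical, not Oniguruma's code, and this is not a claim about the actual root cause or fix.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long  *items;
    size_t len, cap;
} grow_stack;

/* Push a value, doubling the capacity when full. realloc may move the
   block, so any long* taken from s->items before this call can dangle. */
static int gs_push(grow_stack *s, long v)
{
    if (s->len == s->cap) {
        size_t ncap = s->cap ? s->cap * 2 : 4;
        long *p = realloc(s->items, ncap * sizeof(long));
        if (p == NULL) return -1;
        s->items = p;
        s->cap = ncap;
    }
    s->items[s->len++] = v;
    return 0;
}

/* Safe pattern: keep a position as an index ("mark") and read through
   s->items only after every push that might have reallocated.
   Returns the sum of entries from mark onward, or -1 if a push fails. */
static long gs_sum_from(grow_stack *s, size_t mark, int extra)
{
    for (int i = 0; i < extra; i++)
        if (gs_push(s, i) != 0) return -1;   /* may reallocate */
    long sum = 0;
    for (size_t k = mark; k < s->len; k++)
        sum += s->items[k];                   /* index, not a saved pointer */
    return sum;
}
```

Had gs_sum_from saved `long *p = &s->items[mark]` before the pushes and dereferenced it afterwards, it would read freed memory whenever the buffer moved, which is the kind of access ASan flags above.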
