weggli's Introduction

weggli

Introduction

weggli is a fast and robust semantic search tool for C and C++ codebases. It is designed to help security researchers identify interesting functionality in large codebases.

weggli performs pattern matching on Abstract Syntax Trees based on user-provided queries. Its query language resembles C and C++ code, making it easy to turn interesting code patterns into queries.

weggli is inspired by great tools like Semgrep, Coccinelle, joern and CodeQL, but makes some different design decisions:

  • C++ support: weggli has first class support for modern C++ constructs, such as lambda expressions, range-based for loops and constexprs.

  • Minimal setup: weggli should work out of the box against most software you will encounter. weggli does not require the ability to build the software and can work with incomplete sources or missing dependencies.

  • Interactive: weggli is designed for interactive usage and fast query performance. Most of the time, a weggli query will be faster than a grep search. The goal is to enable an interactive workflow where quick switching between code review and query creation/improvement is possible.

  • Greedy: weggli's pattern matching is designed to find as many (useful) matches as possible for a specific query. While this increases the risk of false positives it simplifies query creation. For example, the query $x = 10; will match both assignment expressions (foo = 10;) and declarations (int bar = 10;).
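
To make the greedy behaviour concrete, here is a small C sketch containing both statement forms that the text above says the query $x = 10; matches. The function and variable names are made up for illustration; only the two commented statements matter.

```c
/* Illustrative target code; all names here are hypothetical. */
int greedy_example(void) {
    int foo = 0;
    foo = 10;      /* assignment expression: expected to match `$x = 10;` with $x bound to foo */
    int bar = 10;  /* declaration with initializer: expected to match with $x bound to bar */
    return foo + bar;
}
```

A stricter tool would need one query per form; here a single query is expected to cover both, at the cost of more results to review.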

Usage

Use -h for short descriptions and --help for more details.

 Homepage: https://github.com/weggli-rs/weggli

 USAGE: weggli [OPTIONS] <PATTERN> <PATH>

 ARGS:
     <PATTERN>
            A weggli search pattern. weggli's query language closely resembles
             C and C++ with a small number of extra features.

             For example, the pattern '{_ $buf[_]; memcpy($buf,_,_);}' will
             find all calls to memcpy that directly write into a stack buffer.

             Besides normal C and C++ constructs, weggli's query language
             supports the following features:

             _        Wildcard. Will match on any AST node.

             $var     Variables. Can be used to write queries that are independent
                      of identifiers. Variables match on identifiers, types,
                      field names or namespaces. The --unique option
                      optionally enforces that $x != $y != $z. The --regex option can
                      enforce that the variable has to match (or not match) a
                      regular expression.

             _(..)    Subexpressions. The _(..) wildcard matches on arbitrary
                      sub expressions. This can be helpful if you are looking for some
                      operation involving a variable, but don't know more about it.
                      For example, _(test) will match on expressions like test+10,
                      buf[test->size] or f(g(&test));

             not:     Negative sub queries. Only show results that do not match the
                      following sub query. For example, '{not: $v==NULL; not: $v!=NULL; *$v;}'
                      would find pointer dereferences that are not preceded by a NULL check.

             strict:  Enable stricter matching. This turns off statement unwrapping
                      and greedy function name matching. For example, 'strict: func();'
                      will not match on 'if (func() == 1)..' or 'a->func()' anymore.

             weggli automatically unwraps expression statements in the query source
             to search for the inner expression instead. This means that the query `{func($x);}`
             will match on `func(a);`, but also on `if (func(a)) {..}` or  `return func(a)`.
             Matching on `func(a)` will also match on `func(a,b,c)` or `func(z,a)`.
             Similarly, `void func($t $param)` will also match function definitions
             with multiple parameters.

             Additional patterns can be specified using the --pattern (-p) option. This makes
             it possible to search across functions or type definitions.

    <PATH>
            Input directory or file to search. By default, weggli will search inside
             .c and .h files for the default C mode or .cc, .cpp, .cxx, .h and .hpp files when
             executing in C++ mode (using the --cpp option).
             Alternative file endings can be specified using the --extensions (-e) option.

             When combining weggli with other tools or preprocessing steps,
             files can also be specified via STDIN by setting the directory to '-'
             and piping a list of filenames.


 OPTIONS:
     -A, --after <after>
            Lines to print after a match. Default = 5.

    -B, --before <before>
            Lines to print before a match. Default = 5.

    -C, --color
            Force enable color output.

    -X, --cpp
            Enable C++ mode.

        --exclude <exclude>...
            Exclude files that match the given regex.

    -e, --extensions <extensions>...
            File extensions to include in the search.

    -f, --force
            Force a search even if the queries contain syntax errors.

    -h, --help
            Prints help information.

        --include <include>...
            Only search files that match the given regex.

    -l, --limit
            Only show the first match in each function.

    -p, --pattern <p>...
            Specify additional search patterns.

    -R, --regex <regex>...
            Filter variable matches based on a regular expression.
             This feature uses the Rust regex crate, so most Perl-style
             regular expression features are supported.
             (see https://docs.rs/regex/1.5.4/regex/#syntax)

             Examples:

             Find calls to functions starting with the string 'mem':
             weggli -R 'func=^mem' '$func(_);'

             Find memcpy calls where the last argument is NOT named 'size':
             weggli -R 's!=^size$' 'memcpy(_,_,$s);'

    -u, --unique
            Enforce uniqueness of variable matches.
             By default, two variables such as $a and $b can match on identical values.
             For example, the query '$x=malloc($a); memcpy($x, _, $b);' would
             match on both

             void *buf = malloc(size);
             memcpy(buf, src, size);

             and

             void *buf = malloc(some_constant);
             memcpy(buf, src, size);

             Using the unique flag would filter out the first match as $a==$b.

    -v, --verbose
            Sets the level of verbosity.

    -V, --version
            Prints version information.

Examples

Calls to memcpy that write into a stack-buffer:

weggli '{
    _ $buf[_];
    memcpy($buf,_,_);
}' ./target/src
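
As a rough illustration of what this query is looking for, the C function below contains a local array declaration followed by a memcpy into it. The function name and the 16-byte size are assumptions made for this sketch. Because the query is purely syntactic, it should report this function even though a length check is present; a match is a starting point for review, not proof of a bug.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical target function: copies at most 16 bytes into a stack buffer. */
unsigned char first_byte_of_copy(const unsigned char *src, size_t len) {
    unsigned char buf[16];              /* corresponds to `_ $buf[_];` */
    if (len == 0) {
        return 0;                       /* nothing to copy, avoid reading buf uninitialized */
    }
    if (len > sizeof(buf)) {
        len = sizeof(buf);              /* the bounds check a reviewer would look for */
    }
    memcpy(buf, src, len);              /* corresponds to `memcpy($buf,_,_);` */
    return buf[0];
}
```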

Calls to foo that don't check the return value:

weggli '{
   strict: foo(_);
}' ./target/src

Potentially vulnerable snprintf() users:

weggli '{
    $ret = snprintf($b,_,_);
    $b[$ret] = _;
}' ./target/src
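
The pattern is interesting because snprintf returns the length the complete output would have had, which can be larger than the buffer when the output is truncated. Using that value as an index into the same buffer can then write out of bounds. The sketch below shows the return value and one possible way to clamp it; both function names are hypothetical helpers, not part of any library.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats `s` into a deliberately small buffer and returns what snprintf reports. */
int reported_length(const char *s) {
    char b[8];
    /* snprintf returns the length the full output would have had,
       not the number of bytes actually stored in b. */
    return snprintf(b, sizeof(b), "%s", s);
}

/* Clamps snprintf's return value to a valid index into a buffer of `cap` bytes.
   Assumes cap > 0. */
size_t clamp_index(int ret, size_t cap) {
    if (ret < 0) {
        return 0;                 /* snprintf reported an error */
    }
    if ((size_t)ret >= cap) {
        return cap - 1;           /* output was truncated */
    }
    return (size_t)ret;
}
```

With an 8-byte buffer and the 11-character input "hello world", reported_length returns 11, so b[ret] would lie past the end of b.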

Potentially uninitialized pointers:

weggli '{
    _* $p;
    NOT: $p = _;
    $func(&$p);
}' ./target/src

Potentially insecure WeakPtr usage:

weggli --cpp '{
    $x = _.GetWeakPtr();
    DCHECK($x);
    $x->_;
}' ./target/src

Debug only iterator validation:

weggli -X 'DCHECK(_!=_.end());' ./target/src

Functions that perform writes into a stack-buffer based on a function argument:

weggli '_ $fn(_ $limit) {
    _ $buf[_];
    for (_; $i<$limit; _) {
        $buf[$i]=_;
    }
}' ./target/src

Functions with the string decode in their name:

weggli -R func=decode '_ $func(_) {_;}' ./target/src

Encoding/Conversion functions:

weggli '_ $func($t *$input, $t2 *$output) {
    for (_($i);_;_) {
        $input[$i]=_($output);
    }
}' ./target/src

Install

$ cargo install weggli

Build Instruction

# optional: install rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh 

git clone https://github.com/googleprojectzero/weggli.git
cd weggli; cargo build --release
./target/release/weggli

Implementation details

Weggli is built on top of the tree-sitter parsing library and its C and C++ grammars. Search queries are first parsed using an extended version of the corresponding grammar, and the resulting AST is transformed into a set of tree-sitter queries in builder.rs. The actual query matching is implemented in query.rs, which is a relatively small wrapper around tree-sitter's query engine to add weggli specific features.

Contributing

See CONTRIBUTING.md for details.

License

Apache 2.0; see LICENSE for details.

Disclaimer

This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.

weggli's People

Contributors

bluec0re, carstein, disconnect3d, fabianfreyer, felixwilhelm, totph, woodruffw, zetatwo

weggli's Issues

Matching constructor causes a crash

I have this code

int getX() { return 0; }

class Foo {
	

	Foo(int x, int y);

	int getX();
	int getY(){return m_y;};

	int m_x;
	int m_y;
};

int Foo::getX() const { return m_x; }

Foo::Foo(int x, int y)
	:m_x(x)
	,m_y(y)
	{}

but when I do this query

Foo::Foo(_){_;}

I get the following crash

thread '<unnamed>' panicked at 'begin <= end (166 <= 165) when slicing `
int getX() { return 0; }

class Foo {


        Foo(int x, int y);

        int getX();
        int getY(){return m_y;};

        int m_x;
        int m_y;
};

int Foo::getX() const { return m_x; }

Foo::Foo(int x, int y)
        :m_x(x)
        ,m_y(y)
        {}
`', /home/brad/.cargo/registry/src/github.com-1ecc6299db9ec823/weggli-0.2.3/src/result.rs:79:20
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

'not a valid query variable' when NOT is added

I am using weggli 0.2.2.

Running weggli -R'f=alloc' -R'r=free' '{ $f($v); $r($v); }' ./ works.
Running weggli -R'f=alloc' -R'r=free' '{ $f($v); NOT: $r($v); }' ./ fails with a response saying that 'r' is not a valid query variable.

sizeof subexpression not matching

In the circumstance that a query containing a subexpression is surrounded by two adjacent expressions, it appears that any sizeof subexpression matches get dropped. For example, for the following code:

int test_sizeof() {
    int b;
    a = sizeof(a) + b;
}
int test_deref() {
    int b;
    void* a = *a + b;
}
int test_call(void* c) {
    int b;
    void* a = test_call(a) + b;
}

The following behavior is observed:

$ weggli '{ _(a) + _; }' test.c
./test.c:1
int test_sizeof() {
    int b;
    a = sizeof(a) + b;
}
./test.c:5
int test_deref() {
    int b;
    void* a = *a + b;
}
./test.c:9
int test_call(void* c) {
    int b;
    void* a = test_call(a) + b;
}
$ weggli '{ _ = _(a); }' test.c
./test.c:1
int test_sizeof() {
    int b;
    a = sizeof(a) + b;
}
./test.c:5
int test_deref() {
    int b;
    void* a = *a + b;
}
./test.c:9
int test_call(void* c) {
    int b;
    void* a = test_call(a) + b;
}
$ weggli '{ _ = _(a) + _; }' test.c
./test.c:5
int test_deref() {
    int b;
    void* a = *a + b;
}
./test.c:9
int test_call(void* c) {
    int b;
    void* a = test_call(a) + b;
}

It's this final query that appears to return incorrect results. I would have expected the sizeof expression to be included in the query results, but it is not. It is only included if there are fewer than two adjacent expressions in the query itself.

Add flag to warn about parse errors / potentially missed matches

While tree-sitter's error recovery is pretty good for most targets, macro-heavy code can lead to parsing errors and missed results. We should add an optional flag to surface statistics about the number of parse errors.

Checking if any query identifiers occur inside the error-range is probably a good heuristic to warn about potentially missed matches.

Normalize const in method params

Would it be possible for a query such as

void foo(const int & $p)

to also match the following?

void foo(int const & $p)

The AST sees these as different but semantically they are the same. If programmers use different styles across the code base then matching can become fiddly as you need a different version for each possible combination.
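
The claim that both spellings are semantically the same can be checked at the type level. The issue is about C++ references; as an analogous sketch in C (an assumption made to keep the example in one language), the function below uses a C11 _Generic selection on pointer types, where the same east-const versus west-const choice exists. The function name is hypothetical.

```c
/* Returns 1 when `const int *` and `int const *` name the same type.
   _Generic selects the association whose type is compatible with the
   type of the controlling expression (requires C11 or later). */
int const_spellings_agree(void) {
    return _Generic((const int *)0, int const *: 1, default: 0);
}
```

Since the compiler treats the two spellings as one type, any difference a query sees comes only from the order of tokens in the syntax tree.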

Can't search for a for statement that uses a class member variable as its threshold

I recently tried weggli on Hex-Rays C output and found some useful cases.

But I encountered a problem today, and I think I need some help.

The problem is that I tried to find functions like the one below:

struct StructName *__fastcall ClassName::MethodName(ClassName *this, int a2)
{
  __int64 i; // r8
  struct StructName *result; // rax

  for ( i = 0i64; i < this->dword_80; ++i )
  {
    result = *(this->qword_88 + 8 * i);
    if ( a2 == *(result + 112) )
      return result;
  }
..
}

I could find the above function with the query below:

./target/release/weggli --cpp 'for(_; _ < this->dword_80; _) { _; }' ~/hexray_output.cpp

However, the above query uses a fixed member variable name, so I couldn't expand it to find other variants.

I tried the following queries, but none of them worked as I wanted.

Queries with no output

./target/release/weggli --cpp '_ $func(_* $thisptr) { for(_; _ < $thisptr->_; _) { _; } }' ~/hexray_output.cpp
./target/release/weggli --cpp '_ $func(_* $thisptr) { for(_; _ < $thisptr->dword_80; _) { _; } }' ~/hexray_output.cpp

./target/release/weggli -R 'thisptr=this|a1' --cpp '_ $func(_) { for(_; _ < $thisptr->dword_80; _) { _; } }' ~/hexray_output.cpp

./target/release/weggli --cpp '_ $func(_* $thisptr) { for(_; _ < _($thisptr)->_; _) { _; } }' ~/hexray_output.cpp

Queries with output

./target/release/weggli --cpp '_ $func(_* $thisptr) { for(_; _ < _($thisptr); _) { _; } }' ~/hexray_output.cpp
  • Output example:
__int64 __fastcall Func1(..., struct Struct1 *a2, ...)
{
  int v4; // er14
  int v7; // eax
  int v8; // ebx
  int v9; // ebx

// ...
    if ( v8 >= 0 )
    {
      if ( *(*(this + 932) + 112i64) )
      {
        for ( i = 0; i < *(a2 + 10); ++i ) // <- here
        {
         // ...

__int64 __fastcall Func2(
        struct Struct2 *a1,
        ....)
{
  __int64 v4; // rdi
...

  v4 = *(a1 + 5);
  v6 = a3;
  v7 = a2;
  updated = 0;
  for ( i = 0; i < *(*(a1 + 3) + 6i64); ++i ) // <- here
  {
    ...
}

This only found comparisons of the form i < (a1 + ...), with no output like i < a1->dword_80.

./target/release/weggli -R 'thisptr=this|a1' --cpp '_ $func(_) { for(_; _ < _($thisptr); _) { _; } }' ~/hexray_output.cpp

This one found i < a1->dword_80 and similar member variable references. But as you can see, it relies on a regex match on the variable name instead of using the this argument of the class method. I want to find the data flow from the this argument to the for loop.

If you have any free time, I would really appreciate it if you could let me know what I am doing wrong.


FYI, my host OS is macOS Big Sur.

> sw_vers
ProductName:	macOS
ProductVersion:	11.4
BuildVersion:	20F71

Can't find the problem

The file content is as follows:

namespace content {
class CONTENT_EXPORT StoragePartitionImpl
    : public StoragePartition,
      public blink::mojom::DomStorage,
      public network::mojom::NetworkContextClient,
      public network::mojom::URLLoaderNetworkServiceObserver {
 public:
 private:  
};
}  // namespace content

The result is as follows:

./weggli 'class _{}' C://Users//Administrator//Desktop//test
shows nothing


# ./weggli 'class CONTENT_EXPORT $a:_{};' C://Users//Administrator//Desktop//test
Error! Query parsing failed: class CONTENT_EXPORT [MISSING ; ]  $a:_{};
# ./weggli 'class CONTENT_EXPORT $a:_{}' C://Users//Administrator//Desktop//test
Error! Query parsing failed: class CONTENT_EXPORT [MISSING ; ]  $a:_{}

# ./weggli 'class $a' C://Users//Administrator//Desktop//test

image

Is there something wrong?
The content is taken from the Chromium source code.

Inner blocks

Hi @felixwilhelm,

One of my colleagues discovered the following issue. When you uncomment the inner function blocks, you will observe an abnormally increasing number of matches:

  • 1 block = 1 match
  • 2 blocks = 657 matches
  • 3 blocks = 3404 matches
  • 4 blocks = 10446 matches
% cat test.pat
{
  _* $p = _;
  free($p);
  _;
  free($p);
}
% cat test.c
void double_free_multiple_blocks()
{
    {
	char *s = malloc(10);
	free(s);
	for (int i = 0; i < 10; i++) {
	    // do stuff
	}
	free(s);
    }
    {
	char *s = malloc(10);
	free(s);
	for (int i = 0; i < 10; i++) {
	    // do stuff
	}
	free(s);
    }
    //{
    //	char *s = malloc(10);
    //	free(s);
    //	for (int i = 0; i < 10; i++) {
    //	    // do stuff
    //	}
    //	free(s);
    //}
    //{
    //	char *s = malloc(10);
    //	free(s);
    //	for (int i = 0; i < 10; i++) {
    //	    // do stuff
    //	}
    //	free(s);
    //}
}
% weggli "$(<test.pat)" test.c | grep test.c | wc -l
   657

Grammar source

In implementation details it says that

Search queries are first parsed using an extended version of the corresponding grammar

However, third_party/grammars/{c,cpp}/* only contains the tree-sitter-generated C and C++ code. Can the modified tree-sitter js grammar file be added to the repository?

compile error

= note: ld.lld: error: undefined symbol: std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::_M_mutate(unsigned long long, unsigned long long, wchar_t const*, unsigned long long)
>>> referenced by libtree-sitter-cpp-scanner.a(scanner.o):(.text$tree_sitter_cpp_external_scanner_scan)
x86_64-w64-mingw32-clang: error: linker command failed with exit code 1 (use -v to see invocation)

error: could not compile weggli due to previous error

Query returns the same result multiple times

I am running the following query:

weggli -X '{
  if (_ && _($atomic++)) { _; }
}' ~/fuchsia/zircon

And I am seeing the same result repeated multiple times.

You can try this by yourself running it against Fuchsia's source tree: https://cs.opensource.google/fuchsia/fuchsia

Attached screenshot of how it looks:

image

Is this expected behavior? It feels like maybe the last _ is matching different nodes in the AST?

Tree sitter query generation failed: Searching for multiple comparators fails

I got an error where tree sitter query generation failed with weggli 0.2.4, and the cli kindly informed me this was a bug.

Let me explain what I wanted to accomplish, what my query was, and the output.

What I Wanted to Find, For Extra Context

I want to find anywhere an enum value whose name ends with '_COUNT' is compared against some variable in order to find places an attacker can supply a negative enum value and dodge a bounds check.

Here is some example vulnerable code:

enum OptionType {
    Option_A,
    Option_B,
    Option_COUNT
};

bool OPTIONS[Option_COUNT];
void set_option(enum OptionType option_type_attacker_controlled, bool set_to) {
    // If option_type_attacker_controlled is negative this check will pass leading to an oob write
    if (option_type_attacker_controlled >= Option_COUNT) { abort(); } 
    OPTIONS[option_type_attacker_controlled] = set_to;
} 
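
For comparison, a variant of the function with both a lower and an upper bound check might look like the sketch below. It takes a plain int, because whether an enum-typed parameter can hold a negative value depends on the compiler's choice of underlying type. The name set_option_checked and the bool return value are assumptions made for this example.

```c
#include <stdbool.h>

enum OptionType {
    Option_A,
    Option_B,
    Option_COUNT
};

static bool OPTIONS[Option_COUNT];

/* Hypothetical hardened variant: rejects negative and too-large indexes.
   Returns true when the option was stored, false when the index was rejected. */
bool set_option_checked(int option_type, bool set_to) {
    if (option_type < 0 || option_type >= Option_COUNT) {
        return false;
    }
    OPTIONS[option_type] = set_to;
    return true;
}
```

Code that only has the upper-bound comparison against a *_COUNT value is what the query below is meant to surface.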

The Buggy Query

Here is a minimal reproduction of a weggli query I came up with:

% weggli --cpp -u -R '$counted=\w*_COUNT' -C '{($var >= $counted); OR: ($var > $counted); }' ./
Tree sitter query generation failed: Structure
                 (binary_expression left: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @1 operator: "<=" right: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @0)]) )((labeled_statement label:(statement_identifier) (parenthesized_expression [(binary_expression left: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @2 operator: ">" right: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @3)
                                                                                                                                                                                                                                                                                                ^
sexpr: ((parenthesized_expression [(binary_expression left: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @0 operator: ">=" right: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @1)
                (binary_expression left: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @1 operator: "<=" right: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @0)]) )((labeled_statement label:(statement_identifier) (parenthesized_expression [(binary_expression left: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @2 operator: ">" right: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @3)
                (binary_expression left: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @3 operator: "<" right: [(identifier) (field_expression) (field_identifier) (qualified_identifier) (this)] @2)])) )
This is a bug! Can't recover :/

The Actual Query

Here is the full query I wanted:
weggli --cpp -u -R '$counted=\w*_COUNT' -C '{($var >= $counted); OR: ($var > $counted); OR: ($var < $counted); OR: ($var <= $counted); }' ./

Add the ability to restrict matching to possible linear followups

It would be great to be able to tell weggli that all the predicates have to be following each other in a linear way.

For example:

if (a)
  A;
else
  B;
C
if (b)
  D;
else
  E;

Here, A can't follow B, but it can be followed by C and D or E.

This should make queries like free($a); free($a); more interesting.

Mismatch of generic variable name using "NOT" statement

Hello,

While trying to write a pattern for matching dangling pointer bugs in a random codebase, I ran into an issue.
Explicitly writing the variable name ==> does result in a match.
Using a generic variable name ($data) ==> does not result in a match.

i.e.:
Needle: {free($data); not: $data = NULL;} - doesn't match ==> expected behavior: MATCH
Needle: {free(data); not: data = NULL;} - does match ==> expected behavior: MATCH

Test file to reproduce the issue (only needle changed):

#[test]
fn not_dangling_correct() {
    let needle = "{free(data); not: data = NULL;}";
    let source = r"
    int32_t random_func_name2(random_struct **handle_ptr)
{
    random_struct_data *data = NULL;
    random_struct *handle = NULL;

    if ((handle_ptr != NULL) && (*handle_ptr != NULL)) {
        handle = *handle_ptr;
        data = (random_struct_data*)handle->_data;

        if (data != NULL) {
            if (data->name != NULL) {
               //TODO
            }
            free(data); //bug is here
        }
        if (handle != NULL) {
            free(handle);
        }
        *handle_ptr = NULL;
    }

    return 0;
}";

    let matches = parse_and_match(needle, source);

    assert_eq!(matches, 1);
}



#[test]
fn not_dangling_wrong() {
    let needle = "{free($data); not: $data = NULL;}";
    let source = r"
    int32_t random_func_name2(random_struct **handle_ptr)
{
    random_struct_data *data = NULL;
    random_struct *handle = NULL;

    if ((handle_ptr != NULL) && (*handle_ptr != NULL)) {
        handle = *handle_ptr;
        data = (random_struct_data*)handle->_data;

        if (data != NULL) {
            if (data->name != NULL) {
               //TODO
            }
            free(data);
        }
        if (handle != NULL) {
            free(handle);
        }
        *handle_ptr = NULL;
    }

    return 0;
}";

    let matches = parse_and_match(needle, source);

    assert_eq!(matches, 1);
}

ANSI encoded copyright symbol in comment breaks search

I have some files where there is a copyright symbol in the comments at the top of the file. It seems weggli can't handle these and just stops the search for the file entirely.

Minimal example:

// ©

void MyBuggyFunction( void* data )
{
	char buf[10];
	memcpy( buf, data, 20 );
}

Result:
image

Without the copyright symbol, the search works as expected:
image

I should also note that this symbol only breaks the search when it is represented in the ANSI encoding (i.e., as the single byte 0xA9). When copying the code above, it will probably be represented in UTF-8, which does work correctly.
To make reproducing this easier, I've attached the file with the ANSI encoded copyright symbol.

test.zip
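
A possible explanation, offered as an assumption rather than something the report confirms: in Windows-1252 the copyright sign is the single byte 0xA9, while in UTF-8 it is the two bytes 0xC2 0xA9. A lone 0xA9 is a UTF-8 continuation byte and cannot start a sequence, so a file containing it is not valid UTF-8. The helper names below are hypothetical and only illustrate the byte-level difference.

```c
/* Returns 1 if `b` is a UTF-8 continuation byte (bit pattern 10xxxxxx). */
int is_utf8_continuation(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

/* Returns 1 if `b` may begin a UTF-8 sequence: an ASCII byte or a
   multi-byte lead byte in the range 0xC2..0xF4. */
int can_start_utf8_sequence(unsigned char b) {
    return b < 0x80 || (b >= 0xC2 && b <= 0xF4);
}
```

Whether weggli skips the file for exactly this reason is not stated in the report.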

What is the vulnerability in the "snprintf" example?

I am confused about this query in the README.md document:

    $ret = snprintf($b,_,_);
    $b[$ret] = _;

What vulnerable code does this query represent?

Feature request: Make ; optional for last statement

Would be nice if weggli could automatically add ; if missing/the query is invalid:

Current behavior

$ weggli 'strcat()' .
Error! Query parsing failed: strcat()

Wanted behavior

$ weggli 'strcat()' .
# same output as if strcat(); was used

Python bindings fails with "undefined symbol"

I seem to have some problems building the Python bindings on my Ubuntu 22.04 LTS setup. Running the Python test yields the output below. A similar error occurs if I run python3 setup.py install --user followed by python3 -c 'import weggli'. It worked when another person tried it, so the failure does not always occur. Something may be broken in my environment, but I am posting it here for future reference and to track my debugging efforts.

$ python3 setup.py test          
running test
WARNING: Testing via this command is deprecated and will be removed in a future version. Users looking for a generic test entry point independent of test runner are encouraged to use tox.
running egg_info
writing src/weggli.egg-info/PKG-INFO
writing dependency_links to src/weggli.egg-info/dependency_links.txt
writing top-level names to src/weggli.egg-info/top_level.txt
reading manifest file 'src/weggli.egg-info/SOURCES.txt'
adding license file 'LICENSE'
writing manifest file 'src/weggli.egg-info/SOURCES.txt'
running build_ext
running build_rust
cargo rustc --lib --message-format=json-render-diagnostics --manifest-path Cargo.toml -v --features python pyo3/extension-module -- --crate-type cdylib
       Fresh autocfg v1.0.1
       Fresh unicode-xid v0.2.2
       Fresh cfg-if v1.0.0
   Compiling syn v1.0.85
   Compiling proc-macro-hack v0.5.19
     Running `rustc --crate-name build_script_build --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/syn-1.0.85/build.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C embed-bitcode=no -C debuginfo=2 --cfg 'feature="clone-impls"' --cfg 'feature="default"' --cfg 'feature="derive"' --cfg 'feature="extra-traits"' --cfg 'feature="full"' --cfg 'feature="parsing"' --cfg 'feature="printing"' --cfg 'feature="proc-macro"' --cfg 'feature="quote"' --cfg 'feature="visit"' -C metadata=5737202938caee8b -C extra-filename=-5737202938caee8b --out-dir /home/zetatwo/Projects/weggli/target/debug/build/syn-5737202938caee8b -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --cap-lints allow`
     Running `rustc --crate-name build_script_build --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/proc-macro-hack-0.5.19/build.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C embed-bitcode=no -C debuginfo=2 -C metadata=f7ff84a8fe3d13fb -C extra-filename=-f7ff84a8fe3d13fb --out-dir /home/zetatwo/Projects/weggli/target/debug/build/proc-macro-hack-f7ff84a8fe3d13fb -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --cap-lints allow`
       Fresh scopeguard v1.1.0
       Fresh lazy_static v1.4.0
       Fresh cc v1.0.72
   Compiling parking_lot_core v0.8.5
     Running `rustc --crate-name build_script_build --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/parking_lot_core-0.8.5/build.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C embed-bitcode=no -C debuginfo=2 -C metadata=f85214fee3e3ca60 -C extra-filename=-f85214fee3e3ca60 --out-dir /home/zetatwo/Projects/weggli/target/debug/build/parking_lot_core-f85214fee3e3ca60 -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --cap-lints allow`
   Compiling smallvec v1.7.0
     Running `rustc --crate-name smallvec --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.7.0/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 -C metadata=9c17253beca91beb -C extra-filename=-9c17253beca91beb --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --cap-lints allow`
   Compiling unindent v0.1.7
     Running `rustc --crate-name unindent --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/unindent-0.1.7/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 -C metadata=95e9640e124853e7 -C extra-filename=-95e9640e124853e7 --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --cap-lints allow`
   Compiling inventory v0.1.11
     Running `rustc --crate-name build_script_build --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/inventory-0.1.11/build.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C embed-bitcode=no -C debuginfo=2 -C metadata=4c250073ccd131b2 -C extra-filename=-4c250073ccd131b2 --out-dir /home/zetatwo/Projects/weggli/target/debug/build/inventory-4c250073ccd131b2 -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --cap-lints allow`
   Compiling pyo3 v0.13.2
     Running `rustc --crate-name build_script_build --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/build.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C embed-bitcode=no -C debuginfo=2 --cfg 'feature="auto-initialize"' --cfg 'feature="ctor"' --cfg 'feature="default"' --cfg 'feature="extension-module"' --cfg 'feature="indoc"' --cfg 'feature="inventory"' --cfg 'feature="macros"' --cfg 'feature="paste"' --cfg 'feature="pyo3-macros"' --cfg 'feature="unindent"' -C metadata=cb8ee8d4ec596434 -C extra-filename=-cb8ee8d4ec596434 --out-dir /home/zetatwo/Projects/weggli/target/debug/build/pyo3-cb8ee8d4ec596434 -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --cap-lints allow`
       Fresh unicode-width v0.1.9
       Fresh bitflags v1.3.2
       Fresh regex-syntax v0.6.25
       Fresh either v1.6.1
       Fresh vec_map v0.8.2
       Fresh cfg-if v0.1.10
       Fresh same-file v1.0.6
       Fresh ansi_term v0.12.1
       Fresh void v1.0.2
       Fresh termcolor v1.1.2
       Fresh strsim v0.8.0
       Fresh rustc-hash v1.1.0
   Compiling instant v0.1.12
     Running `rustc --crate-name instant --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/instant-0.1.12/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 -C metadata=dc0183b714626b1c -C extra-filename=-dc0183b714626b1c --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern cfg_if=/home/zetatwo/Projects/weggli/target/debug/deps/libcfg_if-fa0d38a03582caa4.rmeta --cap-lints allow`
     Running `/home/zetatwo/Projects/weggli/target/debug/build/syn-5737202938caee8b/build-script-build`
   Compiling lock_api v0.4.5
     Running `rustc --crate-name lock_api --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/lock_api-0.4.5/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 -C metadata=34b0c67f331fd2ed -C extra-filename=-34b0c67f331fd2ed --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern scopeguard=/home/zetatwo/Projects/weggli/target/debug/deps/libscopeguard-c988ee37a88938c5.rmeta --cap-lints allow`
   Compiling weggli v0.2.4 (/home/zetatwo/Projects/weggli)
     Running `rustc --crate-name build_script_build --edition=2018 build.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C embed-bitcode=no -C debuginfo=2 --cfg 'feature="pyo3"' --cfg 'feature="python"' -C metadata=e9ac922e4610f015 -C extra-filename=-e9ac922e4610f015 --out-dir /home/zetatwo/Projects/weggli/target/debug/build/weggli-e9ac922e4610f015 -C incremental=/home/zetatwo/Projects/weggli/target/debug/incremental -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern cc=/home/zetatwo/Projects/weggli/target/debug/deps/libcc-619d115c76210738.rlib`
     Running `/home/zetatwo/Projects/weggli/target/debug/build/proc-macro-hack-f7ff84a8fe3d13fb/build-script-build`
     Running `/home/zetatwo/Projects/weggli/target/debug/build/parking_lot_core-f85214fee3e3ca60/build-script-build`
     Running `/home/zetatwo/Projects/weggli/target/debug/build/inventory-4c250073ccd131b2/build-script-build`
       Fresh textwrap v0.11.0
       Fresh walkdir v2.3.2
       Fresh libc v0.2.112
       Fresh proc-macro2 v1.0.36
       Fresh crossbeam-utils v0.8.6
       Fresh memchr v2.4.1
     Running `/home/zetatwo/Projects/weggli/target/debug/build/weggli-e9ac922e4610f015/build-script-build`
     Running `rustc --crate-name proc_macro_hack --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/proc-macro-hack-0.5.19/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type proc-macro --emit=dep-info,link -C prefer-dynamic -C embed-bitcode=no -C debuginfo=2 -C metadata=36581dfd343e8acc -C extra-filename=-36581dfd343e8acc --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern proc_macro --cap-lints allow`
       Fresh log v0.4.14
       Fresh memoffset v0.6.5
       Fresh num-traits v0.2.14
       Fresh num_cpus v1.13.1
       Fresh atty v0.2.14
       Fresh time v0.1.44
     Running `rustc --crate-name parking_lot_core --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/parking_lot_core-0.8.5/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 -C metadata=54209c3f614f7cef -C extra-filename=-54209c3f614f7cef --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern cfg_if=/home/zetatwo/Projects/weggli/target/debug/deps/libcfg_if-fa0d38a03582caa4.rmeta --extern instant=/home/zetatwo/Projects/weggli/target/debug/deps/libinstant-dc0183b714626b1c.rmeta --extern libc=/home/zetatwo/Projects/weggli/target/debug/deps/liblibc-07f81d1987f627ca.rmeta --extern smallvec=/home/zetatwo/Projects/weggli/target/debug/deps/libsmallvec-9c17253beca91beb.rmeta --cap-lints allow`
       Fresh nix v0.17.0
       Fresh quote v1.0.14
       Fresh crossbeam-channel v0.5.2
       Fresh aho-corasick v0.7.18
     Running `/home/zetatwo/Projects/weggli/target/debug/build/pyo3-cb8ee8d4ec596434/build-script-build`
   Compiling paste-impl v0.1.18
     Running `rustc --crate-name paste_impl --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/paste-impl-0.1.18/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type proc-macro --emit=dep-info,link -C prefer-dynamic -C embed-bitcode=no -C debuginfo=2 -C metadata=175494bd18f9ac70 -C extra-filename=-175494bd18f9ac70 --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern proc_macro_hack=/home/zetatwo/Projects/weggli/target/debug/deps/libproc_macro_hack-36581dfd343e8acc.so --extern proc_macro --cap-lints allow`
       Fresh crossbeam-epoch v0.9.6
       Fresh num-integer v0.1.44
       Fresh clap v2.34.0
       Fresh colored v2.0.0
   Compiling parking_lot v0.11.2
     Running `rustc --crate-name parking_lot --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/parking_lot-0.11.2/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 --cfg 'feature="default"' -C metadata=0be689697a3bba97 -C extra-filename=-0be689697a3bba97 --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern instant=/home/zetatwo/Projects/weggli/target/debug/deps/libinstant-dc0183b714626b1c.rmeta --extern lock_api=/home/zetatwo/Projects/weggli/target/debug/deps/liblock_api-34b0c67f331fd2ed.rmeta --extern parking_lot_core=/home/zetatwo/Projects/weggli/target/debug/deps/libparking_lot_core-54209c3f614f7cef.rmeta --cap-lints allow`
     Running `rustc --crate-name syn --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/syn-1.0.85/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 --cfg 'feature="clone-impls"' --cfg 'feature="default"' --cfg 'feature="derive"' --cfg 'feature="extra-traits"' --cfg 'feature="full"' --cfg 'feature="parsing"' --cfg 'feature="printing"' --cfg 'feature="proc-macro"' --cfg 'feature="quote"' --cfg 'feature="visit"' -C metadata=13cf818070f8dc37 -C extra-filename=-13cf818070f8dc37 --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern proc_macro2=/home/zetatwo/Projects/weggli/target/debug/deps/libproc_macro2-389aadaa5c25aeea.rmeta --extern quote=/home/zetatwo/Projects/weggli/target/debug/deps/libquote-7e54fa6ad85adb38.rmeta --extern unicode_xid=/home/zetatwo/Projects/weggli/target/debug/deps/libunicode_xid-3f4caffb2e0bd218.rmeta --cap-lints allow --cfg syn_disable_nightly_tests`
       Fresh regex v1.5.4
       Fresh crossbeam-deque v0.8.1
       Fresh chrono v0.4.19
   Compiling paste v0.1.18
     Running `rustc --crate-name paste --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/paste-0.1.18/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 -C metadata=1ff2b4e07c7be25e -C extra-filename=-1ff2b4e07c7be25e --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern paste_impl=/home/zetatwo/Projects/weggli/target/debug/deps/libpaste_impl-175494bd18f9ac70.so --extern proc_macro_hack=/home/zetatwo/Projects/weggli/target/debug/deps/libproc_macro_hack-36581dfd343e8acc.so --cap-lints allow`
       Fresh tree-sitter v0.20.2
       Fresh rayon-core v1.9.1
       Fresh simplelog v0.10.2
       Fresh rayon v1.5.1
   Compiling pyo3-macros-backend v0.13.2
     Running `rustc --crate-name pyo3_macros_backend --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-macros-backend-0.13.2/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 -C metadata=d9e312df010f9610 -C extra-filename=-d9e312df010f9610 --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern proc_macro2=/home/zetatwo/Projects/weggli/target/debug/deps/libproc_macro2-389aadaa5c25aeea.rmeta --extern quote=/home/zetatwo/Projects/weggli/target/debug/deps/libquote-7e54fa6ad85adb38.rmeta --extern syn=/home/zetatwo/Projects/weggli/target/debug/deps/libsyn-13cf818070f8dc37.rmeta --cap-lints allow`
   Compiling indoc-impl v0.3.6
   Compiling inventory-impl v0.1.11
     Running `rustc --crate-name indoc_impl --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/indoc-impl-0.3.6/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type proc-macro --emit=dep-info,link -C prefer-dynamic -C embed-bitcode=no -C debuginfo=2 -C metadata=0826074e9c4ae085 -C extra-filename=-0826074e9c4ae085 --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern proc_macro_hack=/home/zetatwo/Projects/weggli/target/debug/deps/libproc_macro_hack-36581dfd343e8acc.so --extern proc_macro2=/home/zetatwo/Projects/weggli/target/debug/deps/libproc_macro2-389aadaa5c25aeea.rlib --extern quote=/home/zetatwo/Projects/weggli/target/debug/deps/libquote-7e54fa6ad85adb38.rlib --extern syn=/home/zetatwo/Projects/weggli/target/debug/deps/libsyn-13cf818070f8dc37.rlib --extern unindent=/home/zetatwo/Projects/weggli/target/debug/deps/libunindent-95e9640e124853e7.rlib --extern proc_macro --cap-lints allow`
     Running `rustc --crate-name inventory_impl --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/inventory-impl-0.1.11/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type proc-macro --emit=dep-info,link -C prefer-dynamic -C embed-bitcode=no -C debuginfo=2 -C metadata=a59475fd41b92be4 -C extra-filename=-a59475fd41b92be4 --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern proc_macro2=/home/zetatwo/Projects/weggli/target/debug/deps/libproc_macro2-389aadaa5c25aeea.rlib --extern quote=/home/zetatwo/Projects/weggli/target/debug/deps/libquote-7e54fa6ad85adb38.rlib --extern syn=/home/zetatwo/Projects/weggli/target/debug/deps/libsyn-13cf818070f8dc37.rlib --extern proc_macro --cap-lints allow`
   Compiling ctor v0.1.21
     Running `rustc --crate-name ctor --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/ctor-0.1.21/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type proc-macro --emit=dep-info,link -C prefer-dynamic -C embed-bitcode=no -C debuginfo=2 -C metadata=fe9715a5eb5b39de -C extra-filename=-fe9715a5eb5b39de --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern quote=/home/zetatwo/Projects/weggli/target/debug/deps/libquote-7e54fa6ad85adb38.rlib --extern syn=/home/zetatwo/Projects/weggli/target/debug/deps/libsyn-13cf818070f8dc37.rlib --extern proc_macro --cap-lints allow`
   Compiling ghost v0.1.2
     Running `rustc --crate-name ghost --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/ghost-0.1.2/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type proc-macro --emit=dep-info,link -C prefer-dynamic -C embed-bitcode=no -C debuginfo=2 -C metadata=04ae0cc4d2fd97fc -C extra-filename=-04ae0cc4d2fd97fc --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern proc_macro2=/home/zetatwo/Projects/weggli/target/debug/deps/libproc_macro2-389aadaa5c25aeea.rlib --extern quote=/home/zetatwo/Projects/weggli/target/debug/deps/libquote-7e54fa6ad85adb38.rlib --extern syn=/home/zetatwo/Projects/weggli/target/debug/deps/libsyn-13cf818070f8dc37.rlib --extern proc_macro --cap-lints allow`
   Compiling pyo3-macros v0.13.2
     Running `rustc --crate-name pyo3_macros --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-macros-0.13.2/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type proc-macro --emit=dep-info,link -C prefer-dynamic -C embed-bitcode=no -C debuginfo=2 -C metadata=d3adc766305a7f5f -C extra-filename=-d3adc766305a7f5f --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern pyo3_macros_backend=/home/zetatwo/Projects/weggli/target/debug/deps/libpyo3_macros_backend-d9e312df010f9610.rlib --extern quote=/home/zetatwo/Projects/weggli/target/debug/deps/libquote-7e54fa6ad85adb38.rlib --extern syn=/home/zetatwo/Projects/weggli/target/debug/deps/libsyn-13cf818070f8dc37.rlib --extern proc_macro --cap-lints allow`
   Compiling indoc v0.3.6
     Running `rustc --crate-name indoc --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/indoc-0.3.6/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 -C metadata=170156cc1a7b9c76 -C extra-filename=-170156cc1a7b9c76 --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern indoc_impl=/home/zetatwo/Projects/weggli/target/debug/deps/libindoc_impl-0826074e9c4ae085.so --extern proc_macro_hack=/home/zetatwo/Projects/weggli/target/debug/deps/libproc_macro_hack-36581dfd343e8acc.so --cap-lints allow`
     Running `rustc --crate-name inventory --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/inventory-0.1.11/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 -C metadata=966b134369e7408f -C extra-filename=-966b134369e7408f --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern ctor=/home/zetatwo/Projects/weggli/target/debug/deps/libctor-fe9715a5eb5b39de.so --extern ghost=/home/zetatwo/Projects/weggli/target/debug/deps/libghost-04ae0cc4d2fd97fc.so --extern inventory_impl=/home/zetatwo/Projects/weggli/target/debug/deps/libinventory_impl-a59475fd41b92be4.so --cap-lints allow`
     Running `rustc --crate-name pyo3 --edition=2018 /home/zetatwo/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C embed-bitcode=no -C debuginfo=2 --cfg 'feature="auto-initialize"' --cfg 'feature="ctor"' --cfg 'feature="default"' --cfg 'feature="extension-module"' --cfg 'feature="indoc"' --cfg 'feature="inventory"' --cfg 'feature="macros"' --cfg 'feature="paste"' --cfg 'feature="pyo3-macros"' --cfg 'feature="unindent"' -C metadata=b62146b9735b1e6d -C extra-filename=-b62146b9735b1e6d --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern cfg_if=/home/zetatwo/Projects/weggli/target/debug/deps/libcfg_if-fa0d38a03582caa4.rmeta --extern ctor=/home/zetatwo/Projects/weggli/target/debug/deps/libctor-fe9715a5eb5b39de.so --extern indoc=/home/zetatwo/Projects/weggli/target/debug/deps/libindoc-170156cc1a7b9c76.rmeta --extern inventory=/home/zetatwo/Projects/weggli/target/debug/deps/libinventory-966b134369e7408f.rmeta --extern libc=/home/zetatwo/Projects/weggli/target/debug/deps/liblibc-07f81d1987f627ca.rmeta --extern parking_lot=/home/zetatwo/Projects/weggli/target/debug/deps/libparking_lot-0be689697a3bba97.rmeta --extern paste=/home/zetatwo/Projects/weggli/target/debug/deps/libpaste-1ff2b4e07c7be25e.rmeta --extern pyo3_macros=/home/zetatwo/Projects/weggli/target/debug/deps/libpyo3_macros-d3adc766305a7f5f.so --extern unindent=/home/zetatwo/Projects/weggli/target/debug/deps/libunindent-95e9640e124853e7.rmeta --cap-lints allow --cfg Py_SHARED --cfg Py_3_6 --cfg Py_3_7 --cfg Py_3_8 --cfg Py_3_9 --cfg Py_3_10 --cfg 'py_sys_config="WITH_THREAD"'`
     Running `rustc --crate-name weggli --edition=2018 src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type rlib --crate-type dylib --emit=dep-info,link -C embed-bitcode=no -C debuginfo=2 --crate-type cdylib --cfg 'feature="pyo3"' --cfg 'feature="python"' -C metadata=bac47e802d24988f --out-dir /home/zetatwo/Projects/weggli/target/debug/deps -C incremental=/home/zetatwo/Projects/weggli/target/debug/incremental -L dependency=/home/zetatwo/Projects/weggli/target/debug/deps --extern clap=/home/zetatwo/Projects/weggli/target/debug/deps/libclap-b3367e576e67dfe3.rlib --extern colored=/home/zetatwo/Projects/weggli/target/debug/deps/libcolored-28846ee9869253bc.rlib --extern log=/home/zetatwo/Projects/weggli/target/debug/deps/liblog-a402241ae309561f.rlib --extern nix=/home/zetatwo/Projects/weggli/target/debug/deps/libnix-2b057218735fb323.rlib --extern pyo3=/home/zetatwo/Projects/weggli/target/debug/deps/libpyo3-b62146b9735b1e6d.rlib --extern rayon=/home/zetatwo/Projects/weggli/target/debug/deps/librayon-4bd54a8f6a505307.rlib --extern regex=/home/zetatwo/Projects/weggli/target/debug/deps/libregex-4d86d90d817f153a.rlib --extern rustc_hash=/home/zetatwo/Projects/weggli/target/debug/deps/librustc_hash-062a375220cbcbba.rlib --extern simplelog=/home/zetatwo/Projects/weggli/target/debug/deps/libsimplelog-39cf1ecb22e7fbb2.rlib --extern tree_sitter=/home/zetatwo/Projects/weggli/target/debug/deps/libtree_sitter-c3e12a65db9ead5e.rlib --extern walkdir=/home/zetatwo/Projects/weggli/target/debug/deps/libwalkdir-78a2d3854b534f47.rlib -L native=/home/zetatwo/Projects/weggli/target/debug/build/weggli-2cc0bc4d0f1a0cf5/out -L native=/home/zetatwo/Projects/weggli/target/debug/build/weggli-2cc0bc4d0f1a0cf5/out -L native=/home/zetatwo/Projects/weggli/target/debug/build/weggli-2cc0bc4d0f1a0cf5/out -l static=tree-sitter-c -l static=tree-sitter-cpp-scanner -l stdc++ -l static=tree-sitter-cpp-parser -L 
native=/home/zetatwo/Projects/weggli/target/debug/build/tree-sitter-6cd8dc0e956750fc/out`
    Finished dev [unoptimized + debuginfo] target(s) in 19.17s
Copying rust artifact from /home/zetatwo/Projects/weggli/target/debug/libweggli.so to /home/zetatwo/Projects/weggli/src/weggli.cpython-310-x86_64-linux-gnu.so
tests (unittest.loader._FailedTest) ... ERROR

======================================================================
ERROR: tests (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: tests
Traceback (most recent call last):
  File "/usr/lib/python3.10/unittest/loader.py", line 470, in _find_test_path
    package = self._get_module_from_name(name)
  File "/usr/lib/python3.10/unittest/loader.py", line 377, in _get_module_from_name
    __import__(name)
  File "/home/zetatwo/Projects/weggli/tests/__init__.py", line 2, in <module>
    import weggli
ImportError: /home/zetatwo/Projects/weggli/src/weggli.cpython-310-x86_64-linux-gnu.so: undefined symbol: tree_sitter_cpp_external_scanner_create


----------------------------------------------------------------------
Ran 1 test in 0.000s

FAILED (errors=1)
Test failed: <unittest.runner.TextTestResult run=1 errors=1 failures=0>
error: Test failed: <unittest.runner.TextTestResult run=1 errors=1 failures=0>

Using regular expressions to match boolean operators.

I am having some trouble getting my query to work the way I intend to.

I want to match:

if (_ X _($atomic.fetch_sub())) {
}

where X is either && or ||, however, I cannot get the query right.

-l -R 'op=$\|\|^' -X '{
if (_ $op; _($atomic.fetch_sub())) { _; }
}'

Doesn't return anything, and

weggli/target/release/weggli -l -R 'op=$\|\|^' -X '{
if (_ $op _($atomic.fetch_sub())) { _; }
}'

Fails with:

Error! Query parsing failed: {
if (_ $op [MISSING ; ]  _($atomic.fetch_sub())) { _; }
}

I don't know if this is a bug or if I am not writing the query correctly.
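
One detail of the regex itself can be checked independently of weggli: in 'op=$\|\|^' the anchors are in the opposite order from the usual ^...$, so the expression cannot match any operator text. The sketch below uses Python's re module as a stand-in for the Rust regex crate that weggli is built against (the two treat these anchors the same way for this pattern). The names SWAPPED, FIXED and matches are hypothetical, and the example says nothing about whether weggli can bind $op to an operator in the first place.

```python
import re

# The pattern from the query above: "$" (end of text) comes first and
# "^" (start of text) comes last, so no "||" can sit between them.
SWAPPED = r"$\|\|^"

# Hypothetical corrected form: anchors in the usual order, plus an
# alternation so that both boolean operators are accepted.
FIXED = r"^(\|\||&&)$"


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


if __name__ == "__main__":
    for op in ("||", "&&", "|"):
        print(repr(op), matches(SWAPPED, op), matches(FIXED, op))
```

If the query still returns nothing with the anchors in the usual order, the remaining question is whether a $variable can stand for an operator at all, which this sketch does not answer.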

Output does not show sink if used with wildcards

Description

When using variables with wildcards instead of actual function names, the location where a variable is used is not shown.

For example { $b = getenv(_); _(_, $b); } vs { $b = getenv(_); printf(_, $b); }

Expected

void displayHelp()
{
  char *c = getenv("HOME");
  int t;
  endwin();
  printf("%s\n",version);
  printf("Usage: dav [arguments] [FILENAME] [FILENAME] [LNUM] ...\n");
  printf("  where FILENAMEs, if specified, are the names of the files you wish to load.\nThe + or -l arguments must be after the filename.\n");
..
  }
  printf("  Ctrl-C : Quit (won't ask for save)\n");
  printf("  Ctrl-K : Erase to end of line\n");
  printf("  Ctrl-U : Erase whole line\n");
  printf("Personal options:\n");
  printf("  Located in %s/.davrc\n",c);
  printf("  Edit %s/.davrc to customize function key bindings\n",c);
  initscr();
  quit("");
}

Actual

void displayHelp()
{
  char *c = getenv("HOME");
  int t;
  endwin();
  printf("%s\n",version);
  printf("Usage: dav [arguments] [FILENAME] [FILENAME] [LNUM] ...\n");
  printf("  where FILENAMEs, if specified, are the names of the files you wish to load.\nThe + or -l arguments must be after the filename.\n");
..
}

Templated calls inconsistencies when querying

Hi!

First of all, thank you for such a fun tool to use - I've been enjoying it!

I am not 100% sure what is expected and what isn't, but it feels to me like something is going wrong. Here is my a.cc test case:

void foo(const int32_t a1) {
        const auto &a = a::b::c::d::my_call<std::string>(a1);
        auto b = my_call<std::string>(a1);
        auto c = my_call(a1);
        auto d = my_call<string>(a1);
}

And here are the queries with their associated results:

  • $ ./target/debug/weggli --cpp '$f()' ./test.cc matches a::b::c::d::my_call<std::string>(a1); and my_call(a1); but not the other two lines
  • $ ./target/debug/weggli --cpp '$f<$f2>()' ./test.cc matches my_call<string>(a1); but not the other lines which seems odd as well
  • $ ./target/debug/weggli --cpp '$f<std::string>()' ./test.cc matches my_call<std::string>(a1); but not a::b::c::d::my_call<std::string>(a1);

Maybe I am doing something wrong 😅? Let me know if you need anything more from me.

Cheers

Release 0.2.4 (?)

Hey,

It would be nice to make another release of Weggli since the latest release (installed via cargo install weggli) has e.g. this bug:

$ weggli -X 'some_type some_func(_) {_;}' ./
thread '<unnamed>' panicked at 'begin <= end (23 <= 22) when slicing `// some_type some_func
some_type some_func(some arg) {some_body();}
`', /home/dc/.cargo/registry/src/github.com-1ecc6299db9ec823/weggli-0.2.3/src/result.rs:79:20
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

$ cat bar.cpp
// some_type some_func
some_type some_func(some arg) {some_body();}

The latest version, compiled from source, does not have it:

$ /home/dc/tools/weggli/target/release/weggli -X 'some_type some_func(_) {_;}' ./
/home/dc/playground/wegglicrash/./bar.cpp:2
some_type some_func(some arg) {some_body();}

Failure generating query without { }

Hi. This query generates an error with tree sitter: weggli '{ _ $x; for (_;_;_) $x = _; }' .

Tree sitter query generation failed: Structure
                         (init_declarator declarator:(pointer_declarator declarator: [(identifier) (field_expression) (field_identifier) (scoped_identifier)] @2) value: [(cast_expression value: (_)) (_)])]) )
                        ^
sexpr: ((declaration type:(_) declarator:[(identifier) (field_expression) (field_identifier) (scoped_identifier)] @0) )((for_statement "for" @1 initializer:(_) condition:(_) update:(_) [(assignment_expression left: [(identifier) (field_expression) (field_identifier) (scoped_identifier)] @2 right: [(cast_expression value: (_)) (_)])
                        (init_declarator declarator: [(identifier) (field_expression) (field_identifier) (scoped_identifier)] @2 value: [(cast_expression value: (_)) (_)])
                        (init_declarator declarator:(pointer_declarator declarator: [(identifier) (field_expression) (field_identifier) (scoped_identifier)] @2) value: [(cast_expression value: (_)) (_)])]) )
This is a bug! Can't recover :/

With {} it can generate the query. weggli '{ _ $x; for (_;_;_) {$x = _;} }' .

This is not recognized,

void main() {
        int i;
        for ( i;i;i ) i = 0;
}

while this one is recognized

void main() {
        int i;
        for ( i;i;i ) {i = 0;}
}

Is the problem with the way tree-sitter is used here? If the problem is on their side, I'll forward this issue to them.

How do I search for all else {} blocks that contain a return or a throw?

First of all, thank you for sharing such a useful tool!

I am trying to find patterns in the code like this:

  if (condition)
  {
      /* do */
      /* many */
      /* things */
  }
  else
  {
      throw PreconditionNotSatisified{};   // or return -ERROR, 
  }

so they can be refactored to return early and reduce code indentation.

I have tried:

  weggli 'if (_) else { throw }' .

  weggli 'if (_) { _ } else { throw $something; }' .

but nothing seems to work. Help!

Thank you.

Crash when parsing query

I am not able to reproduce this, but I got this with something vaguely like $var = $func($arg1, . I have also noticed issues where $v = $f() doesn't work, $v = $fu() doesn't work, but $v = $fun() does work, and $v = $f($p) works. It's very strange.

My use case is that I am interactively trying to build a weggli query on each keystroke.

Tree sitter query generation failed: Structure
 ([(assignment_expression left: [(identifier) (field_expression) (field_identifier)] @0 right: [(cast_expression value: (call_expression function:[(identifier) (field_expression) (field_identifier)] @1 arguments:(argument_list . [(identifier) (field_expression) (field_identifier)] @2 . (ERROR)))) (call_expression function:[(identifier) (field_expression) (field_identifier)] @1 arguments:(argument_list . [(identifier) (field_expression) (field_identifier)] @2 . (ERROR)))])
                                                                                                                                                                                                                                                                                              ^
sexpr: ([(assignment_expression left: [(identifier) (field_expression) (field_identifier)] @0 right: [(cast_expression value: (call_expression function:[(identifier) (field_expression) (field_identifier)] @1 arguments:(argument_list . [(identifier) (field_expression) (field_identifier)] @2 . (ERROR)))) (call_expression function:[(identifier) (field_expression) (field_identifier)] @1 arguments:(argument_list . [(identifier) (field_expression) (field_identifier)] @2 . (ERROR)))])
                        (init_declarator declarator: [(identifier) (field_expression) (field_identifier)] @0 value: [(cast_expression value: (call_expression function:[(identifier) (field_expression) (field_identifier)] @1 arguments:(argument_list . [(identifier) (field_expression) (field_identifier)] @2 . (ERROR)))) (call_expression function:[(identifier) (field_expression) (field_identifier)] @1 arguments:(argument_list . [(identifier) (field_expression) (field_identifier)] @2 . (ERROR)))]) 
                        (init_declarator declarator:(pointer_declarator declarator: [(identifier) (field_expression) (field_identifier)] @0) value: [(cast_expression value: (call_expression function:[(identifier) (field_expression) (field_identifier)] @1 arguments:(argument_list . [(identifier) (field_expression) (field_identifier)] @2 . (ERROR)))) (call_expression function:[(identifier) (field_expression) (field_identifier)] @1 arguments:(argument_list . [(identifier) (field_expression) (field_identifier)] @2 . (ERROR)))])]@3)
This is a bug! Can't recover :/

Is it possible to only report when the last else {} contains a single statement only (either throw or return) ?

Is it possible to report a match only when the last else {} contains a single statement (either throw or return)?

So, I want to find these:

if (condition)
{
    /* do */
    /* many */
    /* things */
}
else
{
    throw PreconditionNotSatisified{};   // or return -ERROR
}

but not these:

if (condition)
{
    /* do */
    /* many */
    /* things */
}
else
{
    if (someOtherCondition)
    {
        // try something else
    }
    else
    {
         throw PreconditionNotSatisified{};   // or return -ERROR
    }
}

Originally posted by @0x8000-0000 in #17 (comment)

Question - query construction

I'm kicking the tires, so to speak, to see how weggli compares to the tools listed in the README. I skimmed through https://github.com/googleprojectzero/weggli/blob/main/tests/query.rs to see if there was a similar query that I could build off of, but I didn't find one. So I am wondering whether I am running into a known limitation.

Context: I have a synthetic bug that I can't share without first asking for permission. The bug is a stack buffer overwrite. The stack variable is declared early in the call stack but the bug is triggered at least two stack frames down the call stack. The bug is an improper length parameter used in a snprintf call.

Based on the examples that I could dig up, here's what I am using:

# this doesn't return anything
'_ $fn(_, $buf, $limit, _) { snprintf($buf, _, _); }'

# returns too many things, and I really want to have the stack variable context
'_ $fn(_, $buf, $limit, _) { snprintf(_, _, _); }'

# trying with a single caller and a callee (but in reality I need to go more than one layer deep)
'{
_ $fn(_, $buf, $limit, _) { snprintf(_, _, _); }

{
_ $buf[_];
$fn(_. $buf, _);
}
}'

Question:

  1. Does weggli support querying across multiple functions or is the AST pattern matching limited (e.g., to a single source file or a single function)?
  2. Is my caller-callee query on the right track? If so, how do I extend the query to an arbitrary depth of different functions?

foo(_,_,0) mismatch

I'm searching for memset(_,_,0) and it matched this line from Boost:

std::memset((void*)boost::movelib::iterator_to_raw_pointer(r), 0, sizeof(value_type)*n);
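
To make the expected behaviour concrete, here is a toy model of strictly positional argument matching, written in Python. It is only a sketch of the semantics this report appears to expect, not weggli's matching algorithm. The helper names split_args and positional_match are hypothetical, and the splitter is deliberately naive (it does not handle commas inside template argument lists or string literals).

```python
def split_args(call):
    """Split the top-level arguments of a single C/C++ call expression."""
    start = call.index("(") + 1   # first "(" opens the argument list
    end = call.rindex(")")        # last ")" closes it
    args, depth, current = [], 0, []
    for ch in call[start:end]:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            # A comma outside any nested brackets ends one argument.
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    args.append("".join(current).strip())
    return args


def positional_match(pattern, args):
    """Compare argument by argument; "_" accepts anything."""
    return len(pattern) == len(args) and all(
        p == "_" or p == a for p, a in zip(pattern, args)
    )


LINE = ("std::memset((void*)boost::movelib::iterator_to_raw_pointer(r), "
        "0, sizeof(value_type)*n);")

if __name__ == "__main__":
    args = split_args(LINE)
    print(args)
    print(positional_match(["_", "_", "0"], args))  # pattern from the report
    print(positional_match(["_", "0", "_"], args))  # 0 in second position
```

Under this model the literal 0 is the second argument of the Boost call, so a pattern with 0 in the third position would not be expected to match it.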

Empty loop body matches non empty body

Hi. The query while (_) ; (as of 13a332f) matches both while (1) printf(""); and while (1) ;.
The goal here is to find loops with an empty body. Is there another query for it or is this an issue with weggli?

Can't query across files?

test code:

test1.cpp

#include "test2.h"
static INT64 func1(VOID)
{

    lResult = func2(1, func3, (VOID *)NULL);

    return SUCCESS;
}

test2.cpp

VOID *func3(VOID *pStr)
{
    int i = 0;
    (VOID)pthread_detach(pthread_self());

    return (VOID *)0;
} //lint !e715 !e818

My query does not work:

weggli -C --cpp --unique 'func2(_,$fn,_);' -p '_ $fn(_){
_;
pthread_detach(pthread_self());
    }' test

When I put both functions into the same file, the query works. So is my query wrong, or does weggli not support queries across files?

A maxdepth option perhaps

I have a repo with loads of build repos underneath it, and I get hits in those as well. It would be nice to be able to filter them out, either with a maxdepth option or by providing a list of names not to descend into (similar to how git does it with pathspec).

Invalid source/capture range when used with the Python API

Hello,

Thank you for this tool. I'm seeing different behaviour depending on whether I use weggli on the command line or through the Python API.

% cat test.pat
do {
  $buf[_+_]=_;
} while(_);
% cat test.c
void loop (char *b){
	int i = 0;
	do {
		b[i+i] = 0;
		i += 1;
	} while (i < 10);
}
% weggli "$(<test.pat)" test.c
[...]/test.c:1
void loop (char *b){
	int i = 0;
	do {
		b[i+i] = 0;
		i += 1;
	} while (i < 10);
}

This works fine, but with the Python API:

% cat test.py
import weggli
qry=open("test.pat").read()
src=open("test.c").read()
pq=weggli.parse_query(qry)
for m in weggli.matches(pq, src):
    weggli.display(m, src)
% python3 test.py
  for m in weggli.matches(pq, src):
thread '<unnamed>' panicked at 'begin <= end (34 <= 33)' when slicing 'void loop (char *b){'
[...]
, src/result.rs:79:20
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "[...]/test.py", line 6, in <module>
    weggli.display(m, src)
pyo3_runtime.PanicException: begin <= end (34 <= 33) when slicing 'void loop (char *b){'
[...]

A small workaround, though not an acceptable PR :)

diff --git a/src/result.rs b/src/result.rs
index 00997a4..77f0d8f 100644
--- a/src/result.rs
+++ b/src/result.rs
@@ -73,7 +73,10 @@ impl<'a, 'b> QueryResult {

         if self.captures.len() > 1 {
             // Ensure we don't overlap with the range of the next node.
-            header_end = cmp::min(header_end, self.captures[1].range.start - 1);
+            let next = self.captures[1].range.start - 1;
+            if next > self.function.start {
+                header_end = cmp::min(header_end, next);
+            }
         }

         result += &source[self.function.start..header_end];
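
The guard in the workaround above can be restated as a small C sketch. It illustrates the idea only and is not weggli's implementation: the function name and parameters are hypothetical stand-ins for `self.function.start`, `header_end` and `self.captures[1].range.start`, and the zero check is an extra safeguard that does not appear in the diff.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical restatement of the guard from the diff.
 *   function_start     ~ self.function.start
 *   header_end         ~ header_end before clamping
 *   next_capture_start ~ self.captures[1].range.start
 */
size_t clamp_header_end(size_t function_start, size_t header_end,
                        size_t next_capture_start)
{
    assert(function_start <= header_end);

    /* Not in the diff: avoid unsigned underflow when the next capture
     * starts at offset 0. */
    if (next_capture_start == 0) {
        return header_end;
    }

    size_t next = next_capture_start - 1;

    /* Clamp only if the result still lies after the function start, so
     * the slice [function_start, header_end) keeps begin <= end. */
    if (next > function_start && next < header_end) {
        header_end = next;
    }
    return header_end;
}
```

The offsets in the panic message (begin 34, end 33) are consistent with a function range and a second capture that both start at offset 34. In that case the unguarded computation yields 33, while the guarded version leaves header_end unchanged.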

Both C and C++

I would like to have a config file where I can specify command line options.
In particular, I have a mixed code base here, with both C and C++ and half a dozen other languages.
I would like weggli to search in all of them.

By default it will look in C only. If I turn on C++ mode, it will look in C++ only and not in C.
So currently I'm using a ton of -e options at the end, which is pretty unwieldy and inelegant.

An easier solution would be to give me an --all option.

EDIT: I can work around for now by piping a list of filenames into weggli. So treat as low prio.

Finding function declarations

I would like to find function declarations only but it seems that weggli does not support this at the moment.

What I have tried:
$ weggli -R func=myFoo '_ $func(_);' .

If this is indeed not supported, an enhancement would be nice even though this may be a little out of scope for the tool (based on its description).

Incorrect cross-function match identified

Hello - when I run this query:

weggli -X '{
const _ $a = _;
std::move($a);
}' /tmp/x.cc

On this code:

MACRO

namespace {

void func() {
    const auto g = f2();
}

void func2() {
    auto g = f2();
    std::move(g);
}

}

I get this:

/tmp/x.cc:1
MACRO

namespace {

void func() {
    const auto g = f2();
}

void func2() {
    auto g = f2();
    std::move(g);
}

}

It seems weggli is unaware of the function boundary here. Both the macro and namespace are necessary for the bug.

Adding a ; after MACRO fixes it, but that's not something I can change in my codebase.

Thanks!

Unable to match class members

I have the following example

int getX() { return 0; }

class Foo {
	

	Foo(int x, int y)
	:m_x(x)
	,m_y(y)
	{}

	int getX();
	int getY(){return m_y;};

	int m_x;
	int m_y;
};

int Foo::getX() const { return m_x; }

I can match getX with

_ Foo::getX() const {_;}

but I can't create a query to match getY, which is defined inline. Is this possible?

__declspec messes up matching

Query is

./target/release/weggli -f -A1 -B1 -X   'class Foo { Foo(const Foo &$x); }'

This matches

	class Foo
	{
		void save(int& archive);


#if false
		typedef __declspec(deprecated(".")) int value_type;
#endif
	public:
		Foo(const Foo & foo );
	};

but this does not

	class Foo
	{
		void save(int& archive);


	public:
#if false
		typedef __declspec(deprecated(".")) int value_type;
#endif
		Foo(const Foo & foo );
	};

Note that if you remove the declaration of save then it always matches.

:)

Feature Request: optional line numbers in the output

Current

main.c:123
void main(int argc, char **argv) {
   char buf[256];
...
   strcpy(buf, argv[0]);
...
}

Wanted

main.c:123
123: void main(int argc, char **argv) {
124:   char buf[256];
...
140:   strcpy(buf, argv[0]);
...
160: }

Handling of the >= operator is broken in specific cases

In certain cases, weggli seems to parse the >= operator incorrectly. While I haven't debugged the code to confirm, I suspect weggli is mistakenly parsing template parameter statements where they don't exist, thus swallowing the > and treating the = as a stand-alone assignment operator.

Consider the following query:

weggli --cpp '{
    if (_ = _)
    { }
}' $TARGET

This is meant to detect non-declaration assignments in if statements. This query matches the following snippets, as expected:

void test_bad_conditional() {
	if (x = 1234) {

	}
}
void test_bad_conditional() {
	if (x =! false) {
		
	}
}

And, as expected, it does not match this snippet:

void test_bad_conditional() {
	if (auto x = 1234) {

	}
}

Something I did not expect, however, is for it to match these snippets (it does):

bool test_bad_conditional() {
	if (a.index < 0 || b->index >= values.size()) {
		return false;
	}
	return true;
}
bool test_bad_conditional() {
	if (b.x < 3 && c >= 5) {
		return false;
	}
	return true;
}
bool test_bad_conditional() {
	if (head->packets < 1 || head->offset >= offset) {
		return false;
	}
	return true;
}

I've run this query against various large code-bases, and the only false matches contain a < preceding the >, hence my suspicion that weggli is mistakenly parsing a template param. This isn't the only prerequisite, however, as this snippet does not match:

bool test_bad_conditional() {
	if (b < 4 || a >= 3) {
		return false;
	}
	return true;
}

This makes me think that, in addition to the preceding <, the presence of the . or -> (and potentially other) operators on either side is important.

Match the size of an array

Hey, thanks for weggli, it's so awesome ;)

I ran into an issue and wanted to see if you had a solution for it:

Given this pattern:
weggli -u '{char $buf[$len];snprintf($buf, $len2,_);}' test.c

I expect the following lines to be matched:

char buffer[80];
snprintf(buffer, 256, "aaaaaaaa%s", somevar);

By minimizing the pattern, I found that a pattern like $buf[$a] won't match char buffer[80]; but $buf[_] will. Am I doing something wrong?
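
For context on why this pair of lines is worth finding: the size passed to snprintf (256) is larger than the array it writes into (80). The C sketch below shows the safe counterpart, where the caller passes the real capacity of the destination. The helper name `format_greeting` and its format string are hypothetical and are not taken from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats "hello <name>" into out, using the real
 * capacity of the destination as the size argument.
 * Returns snprintf's result, i.e. the length the full string would have
 * had; a value >= out_size means the output was truncated.
 */
int format_greeting(char *out, size_t out_size, const char *name)
{
    assert(out != NULL && out_size > 0);
    return snprintf(out, out_size, "hello %s", name);
}
```

A caller with the buffer from the report would write `format_greeting(buffer, sizeof buffer, somevar);`, which keeps the size argument tied to the declaration instead of to a separate literal.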

Feature request: cross function query and taint tracking

Hi,
This is an amazing tool! I have been able to find dozens of valid bugs in our production code using weggli for code review. It is very effective for simple pattern queries.
There is one major issue: weggli doesn't support cross-function queries or data tainting, which is a very basic requirement for complex pattern searches. For example:
func1(int v)
{
    int b = v;
    func2(b);
}

func2(int v)
{
    memcpy(,,v);
}
In the code above, if func1 is the attack surface, there is no way to track the data flow (i.e., do data tainting) from the value to the final memcpy. Simply querying for memcpy won't help, since it produces too many false positives. I'm not sure whether tree-sitter supports this, since data tracking usually requires building the code. Is there any solution that would partially support this?

Cannot find the switch, case, To<> pattern

Hi:
I'm using weggli to find a code pattern, but it returns nothing. I wonder if you have any hints for solving this problem.
The pattern I want to find is:

switch(A) {
  case AA: {
       B = To<C>(D);
  }
}

An example from Chromium is:

  switch (basic_shape->GetType()) {
    case BasicShape::kBasicShapeCircleType: {
      const BasicShapeCircle* circle = To<BasicShapeCircle>(basic_shape);
      ...
      break;
    }
...
}

The command I use is:

weggli '{
   switch(_) {
	$x=To<_>(_);
   }
}' /path/to/code/

I appreciate any help you can give.
