
Comments (6)

clarkmcc avatar clarkmcc commented on July 30, 2024 1

Do you want to keep both parsers?

I'd like to benchmark both and then decide from there. If we need to support both, we can cross that bridge when we come to it. I like the idea of supporting only one parser if possible, so it might be worth a performance audit of the chumsky parser if its performance is significantly different than LALRPOP's.

from cel-rust.

hardbyte avatar hardbyte commented on July 30, 2024 1

I'll make another PR to remove the DoubleMinus and DoubleNot, and then I'm down to 2 failing tests with the chumsky parser. 🤞🏼

hardbyte avatar hardbyte commented on July 30, 2024

Yeah, that sounds like a good idea. If it isn't too hard to process textproto files from Rust, we could try benchmarking/testing with the test data.

clarkmcc avatar clarkmcc commented on July 30, 2024

Good idea. I think we can use this function to parse those files. I'll need to do some digging to get the full schema of those types, but that shouldn't be difficult to find.

inikolaev avatar inikolaev commented on July 30, 2024

Somewhat related to the testing topic mentioned here: I tried reusing the Gherkin files from here to run some conformance tests, and it works pretty well with the cucumber-rs crate.

I had to make some changes to the expected output in the feature files, but it more or less works.

There are some issues with recognising hex and unsigned integers, so I had to adjust the grammar to be able to parse them correctly:

-    r"-?[0-9]+" => Atom::Int(<>.parse().unwrap()),
-    r"-?0[xX]([0-9a-fA-F]+)" => Atom::Int(i64::from_str_radix(<>, 16).unwrap()),
-    r"-?[0-9]+ [uU]" => Atom::UInt(<>.parse().unwrap()),
-    r"-?0[xX]([0-9a-fA-F]+) [uU]" => Atom::UInt(u64::from_str_radix(<>, 16).unwrap()),
+    r"[-+]?[0-9]+" => Atom::Int(<>.parse().unwrap()),
+    r"([-+])?0[xX]([0-9a-fA-F]+)" => {
+        // We cannot extract capture groups, see https://github.com/lalrpop/lalrpop/issues/575
+        let re = Regex::new(r"([-+])?0[xX]([0-9a-fA-F]+)").unwrap();
+        let captures = re.captures(<>).unwrap();
+
+        let sign = match captures.get(1) {
+            Some(v) => v.as_str(),
+            _ => "",
+        };
+
+        Atom::Int(
+            i64::from_str_radix(
+                format!("{}{}", sign, captures.get(2).unwrap().as_str()).as_str(), 16
+            ).unwrap()
+        )
+    },
+    r"([0-9]+)[uU]" => {
+        // We cannot extract capture groups, see https://github.com/lalrpop/lalrpop/issues/575
+        let re = Regex::new(r"([0-9]+)[uU]").unwrap();
+        let captures = re.captures(<>).unwrap();
+
+        Atom::UInt(captures.get(1).unwrap().as_str().parse().unwrap())
+    },
+    r"0[xX]([0-9a-fA-F]+)[uU]" => {
+        // We cannot extract capture groups, see https://github.com/lalrpop/lalrpop/issues/575
+        let re = Regex::new(r"0[xX]([0-9a-fA-F]+)[uU]").unwrap();
+        let captures = re.captures(<>).unwrap();
+
+        Atom::UInt(u64::from_str_radix(captures.get(1).unwrap().as_str(), 16).unwrap())
+    },
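
For comparison, the same literals could probably be parsed without compiling a regex inside each grammar action. What follows is only a minimal sketch, not code from cel-rust: `IntLiteral` and `parse_int_literal` are hypothetical names standing in for the `Atom::Int` / `Atom::UInt` variants above, and it assumes the lexer hands over the whole token (optional sign, optional `0x` prefix, optional `u` suffix) with no whitespace inside it.

```rust
/// Hypothetical stand-in for the `Atom::Int` / `Atom::UInt` variants in the diff.
#[derive(Debug, PartialEq)]
pub enum IntLiteral {
    Int(i64),
    UInt(u64),
}

/// Parses tokens such as `42`, `-0x55`, `123u` or `0xFFu`.
/// Returns `None` for malformed or out-of-range input instead of panicking.
pub fn parse_int_literal(token: &str) -> Option<IntLiteral> {
    // A trailing `u` or `U` marks an unsigned literal.
    let (body, unsigned) = match token.strip_suffix('u').or_else(|| token.strip_suffix('U')) {
        Some(rest) => (rest, true),
        None => (token, false),
    };

    // Split off an optional leading sign.
    let (sign, rest) = if let Some(r) = body.strip_prefix('-') {
        (Some('-'), r)
    } else if let Some(r) = body.strip_prefix('+') {
        (Some('+'), r)
    } else {
        (None, body)
    };

    // A `0x` / `0X` prefix selects base 16; everything else is base 10.
    let (radix, digits) = match rest.strip_prefix("0x").or_else(|| rest.strip_prefix("0X")) {
        Some(hex) => (16, hex),
        None => (10, rest),
    };

    // Reject empty digit runs and stray characters (including a second sign).
    if digits.is_empty() || !digits.chars().all(|c| c.is_digit(radix)) {
        return None;
    }

    if unsigned {
        // Assumption: like the regexes in the diff, unsigned literals take no sign.
        if sign.is_some() {
            return None;
        }
        u64::from_str_radix(digits, radix).ok().map(IntLiteral::UInt)
    } else {
        // Re-attach the minus sign before parsing so that `i64::MIN` is accepted.
        let text = if sign == Some('-') {
            format!("-{}", digits)
        } else {
            digits.to_string()
        };
        i64::from_str_radix(&text, radix).ok().map(IntLiteral::Int)
    }
}
```

In a grammar action this could replace the `unwrap()` chains, for example by mapping `None` to a parse error, so an out-of-range literal is reported rather than causing a panic.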

There are other issues as well, related to parsing Unicode escapes like \U0001f431, but maybe there's an existing library that can handle those.
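
Failing a suitable library, these escapes could be decoded by hand in a small pass over the literal's contents. A rough sketch, assuming the surrounding quotes are already stripped and covering only string literals with the simple escapes plus `\xHH`, `\uHHHH` and `\UHHHHHHHH` (octal escapes and bytes literals would need separate handling); `unescape` and `read_hex` are hypothetical names, not cel-rust functions:

```rust
use std::str::Chars;

/// Decodes backslash escapes in the body of a string literal.
/// Returns `None` for unknown, truncated, or invalid escapes.
pub fn unescape(input: &str) -> Option<String> {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // The character after the backslash selects the escape kind.
        let decoded = match chars.next()? {
            'a' => '\u{07}',
            'b' => '\u{08}',
            'f' => '\u{0C}',
            'n' => '\n',
            'r' => '\r',
            't' => '\t',
            'v' => '\u{0B}',
            '\\' => '\\',
            '\'' => '\'',
            '"' => '"',
            // Assumption: inside a string (not bytes) literal, `\xHH` names a code point.
            'x' => read_hex(&mut chars, 2)?,
            'u' => read_hex(&mut chars, 4)?,
            'U' => read_hex(&mut chars, 8)?,
            _ => return None,
        };
        out.push(decoded);
    }
    Some(out)
}

/// Reads exactly `len` hex digits and converts them to a `char`.
/// Surrogates and values above U+10FFFF are rejected by `char::from_u32`.
fn read_hex(chars: &mut Chars<'_>, len: usize) -> Option<char> {
    let mut value: u32 = 0;
    for _ in 0..len {
        let digit = chars.next()?.to_digit(16)?;
        value = value * 16 + digit;
    }
    char::from_u32(value)
}
```

With something like this applied to the literal body, `\u270c` would decode to ✌ and `\U0001f431` to 🐱, which is what the feature files expect for the two Unicode scenarios.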

The code for the test itself is pretty simple, but it does not support all features yet:

use cel_interpreter::{Context, Program};
use cucumber::{given, then, when, World};

// `World` is your shared, likely mutable state.
// Cucumber constructs it via `Default::default()` for each scenario.
#[derive(Debug, Default, World)]
pub struct CelWorld {
    expression: String,
}

#[when(expr = "CEL expression \"{word}\" is evaluated")]
fn expression_evaluated_with_double_quotes(world: &mut CelWorld, expression: String) {
    world.expression = expression;
}

#[when(expr = "CEL expression '{word}\' is evaluated")]
fn expression_evaluated_with_single_quotes(world: &mut CelWorld, expression: String) {
    world.expression = expression;
}

#[then(regex = "value is (.*)")]
fn evaluation_result_is(world: &mut CelWorld, expected_result: String) {
    let program = Program::compile(&world.expression).unwrap();
    // No variables are bound here, so the context does not need to be mutable.
    let context = Context::default();
    let result = program.execute(&context);

    // Compare against the `Debug` rendering of the result, e.g. `Ok(Int(0))`.
    assert_eq!(expected_result, format!("{:?}", result));
}

// This runs before everything else, so you can setup things here.
fn main() {
    // You may choose any executor you like (`tokio`, `async-std`, etc.).
    // You may even have an `async` main, it doesn't matter. The point is that
    // Cucumber is composable. :)
    futures::executor::block_on(CelWorld::run("tests/features"));
}

and the output looks like this:

Feature: basic
  Scenario: self_eval_int_zero
   ✔  When CEL expression "0" is evaluated
   ✔  Then value is Ok(Int(0))
  Scenario: self_eval_uint_zero
   ✔  When CEL expression "0u" is evaluated
   ✔  Then value is Ok(UInt(0))
  Scenario: self_eval_float_zero
   ✔  When CEL expression "0.0" is evaluated
   ✔  Then value is Ok(Float(0.0))
  Scenario: self_eval_float_zerowithexp
   ✔  When CEL expression "0e+0" is evaluated
   ✔  Then value is Ok(Float(0.0))
  Scenario: self_eval_string_empty
   ✔  When CEL expression "''" is evaluated
   ✔  Then value is Ok(String(""))
  Scenario: self_eval_string_empty_quotes
   ✔  When CEL expression '""' is evaluated
   ✔  Then value is Ok(String(""))
  Scenario: self_eval_bytes_empty
   ✔  When CEL expression 'b""' is evaluated
   ✔  Then value is Ok(Bytes([]))
  Scenario: self_eval_bool_false
   ✔  When CEL expression "false" is evaluated
   ✔  Then value is Ok(Bool(false))
  Scenario: self_eval_null
   ✔  When CEL expression "null" is evaluated
   ✔  Then value is Ok(Null)
  Scenario: self_eval_empty_list
   ✔  When CEL expression "[]" is evaluated
   ✔  Then value is Ok(List([]))
  Scenario: self_eval_empty_map
   ✔  When CEL expression "{}" is evaluated
   ✔  Then value is Ok(Map(Map { map: {} }))
  Scenario: self_eval_int_nonzero
   ✔  When CEL expression "42" is evaluated
   ✔  Then value is Ok(Int(42))
  Scenario: self_eval_uint_nonzero
   ✔  When CEL expression "123456789u" is evaluated
   ✔  Then value is Ok(UInt(123456789))
  Scenario: self_eval_int_negative_min
   ✔  When CEL expression "-9223372036854775808" is evaluated
   ✔  Then value is Ok(Int(-9223372036854775808))
  Scenario: self_eval_float_negative_exp
   ✔  When CEL expression "-2.3e+1" is evaluated
   ✔  Then value is Ok(Float(-23.0))
  Scenario: self_eval_string_excl
   ✔  When CEL expression '"!"' is evaluated
   ✔  Then value is Ok(String("!"))
  Scenario: self_eval_string_escape
   ✔  When CEL expression "'\''" is evaluated
   ✘  Then value is Ok(String("'"))
      Step failed:
      Defined: tests/features/basic.feature:125:5
      Matched: interpreter/tests/cel.rs:21:1
      Step panicked. Captured output: assertion failed: `(left == right)`
        left: `"Ok(String(\"'\"))"`,
       right: `"Ok(String(\"\\\\'\"))"`
  Scenario: self_eval_bytes_escape
   ✔  When CEL expression "b'ÿ'" is evaluated
   ✔  Then value is Ok(Bytes([195, 191]))
  Scenario: self_eval_bytes_invalid_utf8
   ✔  When CEL expression "b'\000\xff'" is evaluated
   ✘  Then value is BytesType(source=b'\x00\xff')
      Step failed:
      Defined: tests/features/basic.feature:139:5
      Matched: interpreter/tests/cel.rs:21:1
      Step panicked. Captured output: assertion failed: `(left == right)`
        left: `"BytesType(source=b'\\x00\\xff')"`,
       right: `"Ok(Bytes([92, 48, 48, 48, 92, 120, 102, 102]))"`
  Scenario: self_eval_list_singleitem
   ✔  When CEL expression "[-1]" is evaluated
   ✔  Then value is Ok(List([Int(-1)]))
  Scenario: self_eval_map_singleitem
   ✔  When CEL expression '{"k":"v"}' is evaluated
   ✔  Then value is Ok(Map(Map { map: {String("k"): String("v")} }))
  Scenario: self_eval_bool_true
   ✔  When CEL expression "true" is evaluated
   ✔  Then value is Ok(Bool(true))
  Scenario: self_eval_int_hex
   ✔  When CEL expression "0x55555555" is evaluated
   ✔  Then value is Ok(Int(1431655765))
  Scenario: self_eval_int_hex_negative
   ✔  When CEL expression "-0x55555555" is evaluated
   ✔  Then value is Ok(Int(-1431655765))
  Scenario: self_eval_uint_hex
   ✔  When CEL expression "0x55555555u" is evaluated
   ✔  Then value is Ok(UInt(1431655765))
  Scenario: self_eval_unicode_escape_four
   ✔  When CEL expression '"\u270c"' is evaluated
   ✘  Then value is Ok(String("✌"))
      Step failed:
      Defined: tests/features/basic.feature:188:5
      Matched: interpreter/tests/cel.rs:21:1
      Step panicked. Captured output: assertion failed: `(left == right)`
        left: `"Ok(String(\"✌\"))"`,
       right: `"Ok(String(\"\\\\u270c\"))"`
  Scenario: self_eval_unicode_escape_eight
   ✔  When CEL expression '"\U0001f431"' is evaluated
   ✘  Then value is Ok(String("🐱"))
      Step failed:
      Defined: tests/features/basic.feature:195:5
      Matched: interpreter/tests/cel.rs:21:1
      Step panicked. Captured output: assertion failed: `(left == right)`
        left: `"Ok(String(\"🐱\"))"`,
       right: `"Ok(String(\"\\\\U0001f431\"))"`
  Scenario: self_eval_ascii_escape_seq
   ✔  When CEL expression '"\a\b\f\n\r\t\v\"\'\\"' is evaluated
   ✘  Then value is String(\x07\x08\x0c\n\r\t\x0b"\'\\)
      Step failed:
      Defined: tests/features/basic.feature:202:5
      Matched: interpreter/tests/cel.rs:21:1
      Step panicked. Captured output: assertion failed: `(left == right)`
        left: `"String(\\x07\\x08\\x0c\\n\\r\\t\\x0b\"\\'\\\\)"`,
       right: `"Ok(String(\"\\\\a\\\\b\\\\f\\\\n\\\\r\\\\t\\\\v\\\\\\\"\\\\'\\\\\\\\\"))"`
  Scenario: self_eval_bound_lookup
   ?  Given type_env parameter "x" is TypeType(value='INT64')
      Step skipped: tests/features/basic.feature:211:4
  Scenario: self_eval_unbound_lookup
   ✔  When CEL expression "x" is evaluated
   ?  Then eval_error is "undeclared reference to 'x' (in container '')"
      Step skipped: tests/features/basic.feature:225:5
  Scenario: unbound_is_runtime_error
   ?  When CEL expression "x || true" is evaluated
      Step skipped: tests/features/basic.feature:230:5
  Scenario: binop
   ?  When CEL expression "1 + 1" is evaluated
      Step skipped: tests/features/basic.feature:240:5
  Scenario: unbound
   ✔  When CEL expression "f_unknown(17)" is evaluated
   ?  Then eval_error is 'unbound function'
      Step skipped: tests/features/basic.feature:249:5
  Scenario: unbound_is_runtime_error
   ?  When CEL expression "f_unknown(17) || true" is evaluated
      Step skipped: tests/features/basic.feature:254:5
  Scenario: false
   ?  Given type_env parameter "false" is TypeType(value='BOOL')
      Step skipped: tests/features/basic.feature:265:4
  Scenario: true
   ?  Given type_env parameter "true" is TypeType(value='BOOL')
      Step skipped: tests/features/basic.feature:278:4
  Scenario: null
   ?  Given type_env parameter "null" is TypeType(value='BOOL')
      Step skipped: tests/features/basic.feature:291:4
[Summary]
1 feature
37 scenarios (23 passed, 9 skipped, 5 failed)
67 steps (53 passed, 9 skipped, 5 failed)

clarkmcc avatar clarkmcc commented on July 30, 2024

@inikolaev this is fantastic! Definitely makes sense to incorporate into the project. I took a look at the proto-based tests but read an announcement from the maintainer about deprecating the protobuf dependency so I stopped pursuing that. This looks great however.
