Git Product home page Git Product logo

Comments (13)

AugustoFKL avatar AugustoFKL commented on August 24, 2024 2

@luke-biel @frondeus I'm working on this issue, FYI.

from test-case.

luke-biel avatar luke-biel commented on August 24, 2024 2

I'm a bit worried with ergonomics of this. How will this look like in the test output? If we were dealing with custom test harness we could write there anything, but in our scenario we have to rely on rust's mechanisms - and these display module path. Appending something at the end (or beginning) of the name would be a bit less confusing, even if it was that encoded string (though probably a bit long), test_case line number or hashed "something". Basically, out of all these options we have to choose something which will give most information to the user while avoiding name conflicts.

from test-case.

IohannRabeson avatar IohannRabeson commented on August 24, 2024 1

Adding an arbitrary index is what does rstest, you can see that in this example.
But rstest is little bit less ergonomic because you have to annotate parameters with #[case].

from test-case.

frondeus avatar frondeus commented on August 24, 2024

synek317/test-case-derive#1

from test-case.

AugustoFKL avatar AugustoFKL commented on August 24, 2024

Sorry the delay

@frondeus, @luke-biel, after some research, I figured that the main problem is how test names are handled by the escape_test_name method:

pub fn escape_test_name(input: impl AsRef<str>) -> Ident {
    if input.as_ref().is_empty() {
        return Ident::new("_empty", Span::call_site());
    }

    let mut last_under = false;
    let mut ident: String = input
        .as_ref()
        .to_ascii_lowercase()
        .chars()
        .filter_map(|c| match c {
            c if c.is_alphanumeric() => {
                last_under = false;
                Some(c.to_ascii_lowercase())
            }
            _ if !last_under => {
                last_under = true;
                Some('_')
            }
            _ => None,
        })
        .collect();

    if !ident.starts_with(|c: char| c == '_' || c.is_ascii_alphabetic()) {
        ident = format!("_{ident}");
    }

    Ident::new(&ident, Span::call_site())
}

There are two things that can cause problems, IMHO: The "None" of the match inside the filter_map and the if at before returning.

I imagine that the _ is added before the test name to avoid test names starting with invalid characters for test names, is that correct?

To be honest I only found two possible solutions to the general problem of conflicting names:

  1. The library start using additional context information to generate test names.
  2. We add an prefix/postfix to all tests to guarantee the uniqueness of names.

My proposal is to start adding an UUID to the end of all test names. This would be the less disruptive change when it comes to the current codebase. Adding contextual information would take a lot more effort and time, so the UUID would work as a bandaid to the problem in the meantime.

from test-case.

luke-biel avatar luke-biel commented on August 24, 2024

Hmm, I'm not sold on idea of adding an UUID, mainly because it's not stable, it'd change for every compilation and tools would not understand it (cargo would report to rerun a test with --test test_name_my_old_uuid, but next run would require "new_uuid"). We need something stable. May I suggest span or argument hashing (eg.: SHA1)?

from test-case.

AugustoFKL avatar AugustoFKL commented on August 24, 2024

@luke-biel I agree.

Actually, instead of hashing, I thought about using enconding:

fn encode_test_name(name: &str) -> String {
    // Encode the normalized string to match the allowed characters in the test name
    let mut encoded = BASE64_URL_SAFE_NO_PAD.encode(name.as_bytes());

    // Replace not acceptable characters with an identifiable string
    encoded = encoded.replace('-', "__dash__");

    // Add an underscore to the beginning of the string to make it a valid Rust
    // identifier when starting with a digit.
    encoded.insert(0, '_');

    encoded
}

fn decode_test_name(name: &str) -> Result<String> {
    // Remove the underscore from the beginning of the string
    let name = name.trim_start_matches('_');

    // Reverse the replacement of not acceptable characters
    let name = name.replace("__dash__", "-");

    // Decode the test name
    let decoded = BASE64_URL_SAFE_NO_PAD.decode(name.as_bytes())?;

    // Convert the decoded bytes to a string
    let decoded_str = String::from_utf8(decoded)?;

    Ok(decoded_str)
}

This code generates names based on the BASE64 encoding. I was able to verify and it supports the whole unicode specification (from 0x0000 to 0xFFFF), besides surrogates. Additionally, the encoded values are guaranteed to be unique, which means that will be no more problems with 'A' or 'a' or anything like that.

My only concern is that the names being encoded makes the tracing from the test results to actual tests pretty difficult. Therefore, I am looking into having a way to inform the 'name' (the one given by the user, if any, or the arguments that were used).

Does that sound reasonable?

Does that sound acceptable?

from test-case.

olson-sean-k avatar olson-sean-k commented on August 24, 2024

Just wanted to mention that I've also encountered this issue, though with &str arguments:

#[test_case("a*")]
#[test_case("*a")]
fn test(expression: &str) { ... }

It's very easy for this to occur with text, since it isn't too unlikely that it won't constitute a valid Rust identifier. A minimal tag derived from parameters sounds like a great idea, especially if it's stable. Maybe a truncated hash could provide enough disambiguation. From the example above, I think names like this could work well:

tests::test::_f8c61a7__a_expects
tests::test::_0b548fa__a_expects

That isn't too noisy in my opinion and the original format remains intact. Of course, it may be difficult to differentiate tests like this at a glance, but I think it's reasonable to lean on meaningful output when such tests fail to make the distinction sufficiently clear.

Thanks for looking into this!

from test-case.

IohannRabeson avatar IohannRabeson commented on August 24, 2024

Why not just adding an arbitrary index after the name?

tests::test::__a_0
tests::test::__a_1
tests::test::__b_2
tests::test::__b_3

It seems enough to make a distinction. I don't think there is a need for a hash.

from test-case.

luke-biel avatar luke-biel commented on August 24, 2024

Hmm, thinking more about it, encoding as per your suggestion @AugustoFKL would work, one caveat as @olson-sean-k mentioned, we'd have to cover a lot more than just dash. Most likely all non-ascii characters. Maybe use urlencode then? But we'd need to do something with % characters then.
Number appending works fine too, we can just add index in test_case vector to the function name.

To be honest, I'm fine with any solution as long as it's consistent between runs. Be it hashing, truncated hashes, indexes...

from test-case.

frondeus avatar frondeus commented on August 24, 2024

Silly idea - what about line numbers :)?

from test-case.

IohannRabeson avatar IohannRabeson commented on August 24, 2024

Silly idea - what about line numbers :)?

Line numbers are not very stable, any change in the same file above the test can change them.

from test-case.

olson-sean-k avatar olson-sean-k commented on August 24, 2024

Why not just adding an arbitrary index after the name?

Assuming the cases can be stably enumerated in the procedural macro implementation, I think a case index is the best suggestion. By default, a very simple but reliable scheme is probably best. Anything else should probably be opt-in.

Sometimes the identifiers constructed from case parameters work pretty well... but often they aren't very legible and sometimes they break! It doesn't generalize and so doesn't make for a good default, I think.

from test-case.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.