Git Product home page Git Product logo

Comments (7)

jgalliers avatar jgalliers commented on May 18, 2024 1

I didn't realise until just now but this issue is actually another example of the issue that I opened today - #343

The underlying issue here is not actually map differences in a slice - it's the magnitude of the change. You correctly identify here that the quantity of changed fields directly affects the result, and that's because the diff has tipped over the "half-way" mark (more than half of ANY fields on the object being diffed has changed) and this converts the diff from a "field-wise" diff to a "new object" diff.

I'm not sure why this is the case and hopefully we get an answer on #343

from go-cmp.

dsnet avatar dsnet commented on May 18, 2024

Much of the reporter is using a complex set of heuristic to determine how to best format the output. Reports like this and #343 are helpful to adjust the heuristic to provide a better human experience. It will unfortunately never be perfect as much of this subjective.

from go-cmp.

dsnet avatar dsnet commented on May 18, 2024

I traced down heuristic for both this issue and #343 to this code snippet:
https://github.com/google/go-cmp/blob/master/cmp/internal/diff/diff.go#L112-L120

Changing it to something like:

func (r Result) Similar() bool {
	// Use NumSame+1 to offset NumSame so that binary comparisons are similar.
	return r.NumSame+1 >= r.NumDiff/2
}

fixes the issue, but we'll have to think carefully about possible unexpected transitive effects.

from go-cmp.

dsnet avatar dsnet commented on May 18, 2024

Here's an example of a diff that got worse:

got:
    [][]uint8{
-       {0xde, 0xad, 0xbe, 0xef},
        {0xff, 0x6f, 0x6f},
+       "foo",
        "barbaz",
        bytes.Join({
-           "bl",
            "a",
-           "hdieblah",
+           "dded",
        }, ""),
+       "here",
+       {0x68, 0x72, 0x6d, 0x70, 0x68, 0xff},
    }

want:
    [][]uint8{
-       {0xde, 0xad, 0xbe, 0xef},
        {0xff, 0x6f, 0x6f},
+       "foo",
        "barbaz",
+       "added",
+       "here",
-       "blahdieblah",
+       {0x68, 0x72, 0x6d, 0x70, 0x68, 0xff},
    }

from go-cmp.

jgalliers avatar jgalliers commented on May 18, 2024

I traced down heuristic for both this issue and #343 to this code snippet: https://github.com/google/go-cmp/blob/master/cmp/internal/diff/diff.go#L112-L120

Changing it to something like:

func (r Result) Similar() bool {
	// Use NumSame+1 to offset NumSame so that binary comparisons are similar.
	return r.NumSame+1 >= r.NumDiff/2
}

fixes the issue, but we'll have to think carefully about possible unexpected transitive effects.

Could we pass some kind of option that allows us to manage the evaluation? We can already transform, order, etc via these (very helpful!) functions, could we do something with that?

I'm not an expert in the deep internals here but would it potentially be possible to do something with the editscript generation and adding some kind of new EditType? Perhaps through a new cmp.Option() such as SetSameObjectHeuristic(int) in which the Similar() function you linked to either the passed value?

Or a more flexible option like SetSameObjectHeuristic(interface{}, int) which allows a type/value combination to specify a value to assess the "sameness".

I actually don't have a problem at all with the heuristic (it seems like a smart solution to a difficult problem!) - but being able to indicate you want a full property-wise diff would be very valuable in my opinion.

from go-cmp.

mrwonko avatar mrwonko commented on May 18, 2024

I wouldn't mind showing the entire object as changed, as long as the entire object is printed, and not just the first couple of fields (which may be identical). Make it a cmpopt, if you'd rather make it opt-in.

from go-cmp.

dsnet avatar dsnet commented on May 18, 2024

If you look at the implementation, it's a lot of hardcoded constants and heuristics. It is infeasible to expose all of those details through options. It's better to look at case examples of where it operates poorly and tune the heuristics to do better in those cases without undoing how it works on other cases. It takes some finessing, but the reporter gradually do better for more and more situations. Until we've exhausted all the reasonable heuristic tuning, I don't think we should expose any options to alter reporter outputer.

from go-cmp.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.