Comments (15)

MischaPanch commented on August 17, 2024

Yes, seems to be quite involved at this point. I wonder how torch is able to do it:

Torch and numpy can provide "views" of the arrays, so the id is the same.

As I found out just now, python's own list actually cannot do this, so

l = [1, 2, 3]
id(l[:2]) == id(l[:2])
>>> False

Since Batch is more of a python object than an array, it's fine if we don't do better than list :). Let's just implement __eq__ properly and resolve this issue
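
For illustration, a minimal numpy sketch of what a "view" is. It only assumes basic slicing of an `ndarray`; the variable names are made up for this example. Note that each slice is still a separate Python object, and what the views share is the underlying memory:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Keep both slices alive by binding them to names.
first = a[0, :2]
second = a[0, :2]

print(first is second)                  # False: two distinct view objects
print(np.shares_memory(first, second))  # True: both point into a's buffer

# Writing through one view is visible in the base array and the other view.
first[0] = 100
print(a[0, 0], second[0])               # 100 100
```

So a view gives shared data rather than a shared object identity, which is why comparing by content is the more robust check.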

from tianshou.

MischaPanch commented on August 17, 2024

Over the last few months I implemented a lot of helpers that could also help with this issue. I'm going to open a PR tomorrow and assign you two as reviewers.

dantp-ai commented on August 17, 2024

I can look into this. I think the example above has a small typo. It should be id2 = id(b[0]).

MischaPanch commented on August 17, 2024

Yes, you're right about the typo.

Of all the Batch issues, this might be the hardest one. I'm not sure how it can be solved at all, tbh.

dantp-ai commented on August 17, 2024

Interesting:

>>> b = Batch(a=Batch(a=[1, 2, 3]))
>>> id1 = id(b[1])
>>> id2 = id(b[1])
>>> id1 == id2
True
>>> b[1]
Batch(
    a: Batch(
           a: 2,
       ),
)

MischaPanch commented on August 17, 2024

This is even more confusing: for batches with only sub-batches, getitem works as expected, but if a sequence is involved it creates a new object:

b = Batch(a=[1, 2, 3])
b[0] == b[0]
>>> False

MischaPanch commented on August 17, 2024

Note that if there is a solution, it should also work for slices. Right now

b[:2] == b[:2]
>>> False

One idea: we likely can't make it return the same object, but we could add __eq__ to Batch to at least not have the misleading equalities. This would actually be almost trivial to do! You could just compare the sorted wrapped __dict__.
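
The idea can be sketched roughly as follows. This is a toy illustration and not the tianshou implementation: `MiniBatch` is a hypothetical stand-in that stores plain Python values, and it relies on dict equality (which ignores key order) instead of explicitly sorting the `__dict__`:

```python
class MiniBatch:
    """Hypothetical stand-in for Batch; not the tianshou implementation."""

    def __init__(self, **kwargs):
        # Store every keyword argument as an attribute, like a thin dict wrapper.
        self.__dict__.update(kwargs)

    def __getitem__(self, index):
        # Indexing builds a new object on every call, as observed in this thread.
        return MiniBatch(**{key: value[index] for key, value in self.__dict__.items()})

    def __eq__(self, other):
        if not isinstance(other, MiniBatch):
            return NotImplemented
        # Compare by content: same keys and equal values. Nested MiniBatch
        # values recurse into this method through the dict comparison.
        return self.__dict__ == other.__dict__


b = MiniBatch(a=[1, 2, 3], nested=MiniBatch(c=[4, 5, 6]))
print(b[0] is b[0])    # False: still two distinct objects
print(b[0] == b[0])    # True: equal content
print(b[:2] == b[:2])  # True: slices compare equal as well
print(b[0] == b[1])    # False: different content
```

Values that are numpy arrays would need an array-aware comparison such as `np.array_equal`, because `==` on arrays is elementwise and a plain dict comparison cannot reduce a multi-element result to a single boolean.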

dantp-ai commented on August 17, 2024

One idea: we likely can't make it return the same object,

Yes, seems to be quite involved at this point. I wonder how torch is able to do it:

>>> a = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> id1 = id(a[0, :2])
>>> id2 = id(a[0, :2])
>>> id1 == id2
True
>>> a[0, :2]
tensor([1., 2.])

but we could add __eq__ to batch to at least not have the misleading equalities

Yes, this sounds good. I'll try this out. I don't think it would hurt later if we do find a solution for the object equality.

MischaPanch commented on August 17, 2024

Huh, actually, I was slightly wrong, but in a weird way. There seems to be some magic happening when a variable is assigned the id of a list slice. Anyhow, the id of Python list slices is not completely fixed.
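
A likely explanation for the "magic": in CPython, `id()` is only guaranteed to be unique among objects that exist at the same time, so a temporary slice that has already been freed can hand its id to the next temporary. A small sketch with illustrative variable names:

```python
l = [1, 2, 3]

# Keep both slices alive so that they cannot share an id.
first = l[:2]
second = l[:2]
print(first is second)          # False: two separate list objects
print(id(first) == id(second))  # False: both exist at the same time
print(first == second)          # True: same content

# With temporaries the result is an implementation detail: the first slice
# may already be freed when the second is created, so the ids can coincide.
print(id(l[:2]) == id(l[:2]))
```

Comparing ids of temporaries therefore says little about whether indexing returns "the same object"; holding references and using `is`, or comparing by content with `==`, gives a reliable answer.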

MischaPanch commented on August 17, 2024

For reference: the objects returned on getitem still have different ids. This issue was resolved by implementing __eq__ on batch, which permits a meaningful comparison of the returned objects

maxhuettenrauch commented on August 17, 2024

I just had the case where I wanted to compare two batches that contained torch distributions logged during the training process. This comparison fails with a TypeError: iteration over a 0-d tensor. Should __eq__ also work for non array/tensor data?

dantp-ai commented on August 17, 2024

Thx for spotting it! It should indeed work. There are some tests that cover this, but as I was digging into it I noticed that it fails for some other cases, e.g.:

In [41]: b1 = Batch(a={"b": 1})

In [42]: b2 = Batch(a={"b": 1})

In [43]: b1 == b2
Out[43]: True

In [44]: b2 = Batch(a={"c": 2})

In [45]: b1 == b2
Out[45]: False

In [46]: b2 = Batch(b={"c": 2})

In [47]: b1 == b2
Out[47]: False

In [48]: b2 = Batch(b={"b": 1})

In [49]: b1 == b2
Out[49]: False

In [50]: b2 = Batch(a={"b": 10})

In [51]: b1 == b2
...
    682 """
    683 Default compare if `iterable_compare_func` is not provided.
    684 This will compare in sequence order.
    685 """
    686 if t1_from_index is None:
    687     return [((i, i), (x, y)) for i, (x, y) in enumerate(
--> 688         zip_longest(
    689             level.t1, level.t2, fillvalue=ListItemRemovedOrAdded))]
    690 else:
    691     t1_chunk = level.t1[t1_from_index:t1_to_index]

TypeError: iteration over a 0-d array

I will look into it asap. I apologize for the inconvenience.

EDIT:

  • It seems that DeepDiff cannot compare two zero-dimensional arrays containing a scalar value out of the box (cf. seperman/deepdiff#463).

dantp-ai commented on August 17, 2024

@maxhuettenrauch So far, the issue seems to arise when dealing with zero-dimensional arrays.

To remain flexible with respect to DeepDiff, I suggest that we perform an additional processing step in Batch.__eq__ that uses the convenient numpy function numpy.atleast_1d to recursively convert any scalar inputs in the Batch to 1-dimensional arrays (while preserving all other arrays).
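
A rough sketch of that preprocessing step, using a plain nested dict instead of a Batch. The helper `atleast_1d_values` is hypothetical and only illustrates the recursion; it is not part of tianshou or DeepDiff:

```python
import numpy as np


def atleast_1d_values(tree):
    """Hypothetical helper: promote every leaf of a nested dict to at least 1-D."""
    if isinstance(tree, dict):
        return {key: atleast_1d_values(value) for key, value in tree.items()}
    return np.atleast_1d(tree)


scalar = np.array(1)  # zero-dimensional array, shape ()
try:
    iter(scalar)
except TypeError as error:
    print(error)  # iteration over a 0-d array

data = {"a": {"b": np.array(1)}, "c": np.array([1, 2, 3])}
promoted = atleast_1d_values(data)
print(promoted["a"]["b"].shape)  # (1,): the scalar became a 1-D array
print(promoted["c"].shape)       # (3,): existing arrays keep their shape
```

After the promotion every leaf can be iterated, which is the property the sequence-based comparison in the traceback above relies on.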

dantp-ai commented on August 17, 2024

@MischaPanch Should I go ahead with the proposal above? Or does one of your helper methods already cover this edge case?

dantp-ai commented on August 17, 2024

@MischaPanch I experimented today with the new Batch API (#1181), specifically Batch.apply_values_transform, which I can use with np.atleast_1d to transform any 0-dimensional arrays into 1-dimensional ones. DeepDiff should then work fine with this edge case when checking for batch equality.
