Comments (10)
@kurtamohler
Thank you for the responses!!! You're truly incredible!
This does not repro for me on Linux.
@janeyx99 Thank you for your quick response! After testing the code, I observed that it sometimes returns True and sometimes False. To investigate this behaviour further, I executed the following code snippet:
import torch
from tqdm import tqdm

torch.use_deterministic_algorithms(True)
for i in tqdm(range(2, 1000)):
    a = torch.rand(i + 1, i)
    b = torch.rand(i, i)
    # torch.equal requires the two results to be bit-for-bit identical
    if not torch.equal((a @ b)[0:1, :], (a[0:1, :]) @ b):
        print(i)
        break
In my environment, the code prints 16.
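As an aside, a tolerance-based comparison is usually the right tool for results like these. A minimal sketch (my own example, not from the thread), assuming default float32 tensors:

import torch

a = torch.rand(17, 16)
b = torch.rand(16, 16)

# torch.equal demands bit-for-bit equality, so it is sensitive to the
# last-bit differences caused by a different summation order.
print(torch.equal((a @ b)[0:1, :], a[0:1, :] @ b))     # may be False

# torch.allclose compares within rtol/atol and tolerates such rounding noise.
print(torch.allclose((a @ b)[0:1, :], a[0:1, :] @ b))  # expected True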
@michaelshekasta, while the two expressions `(a @ b)[0:1, :]` and `(a[0:1, :]) @ b` may be mathematically identical, they cannot be guaranteed to produce the same result because summations may occur in a different order. The same goes for the two expressions `linear(a)[0, :]` and `linear(a[0, :])`.
`torch.use_deterministic_algorithms` is only meant to provide determinism for multiple calls to the exact same operation given the exact same numerical arguments.
From the documentation here:
That is, algorithms which, given the same input, and when run on the same software and hardware, always produce the same output.
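The underlying effect is visible even with plain Python floats, because floating-point addition is not associative. A minimal illustration (my own, not from the thread):

# float64 addition: grouping the same three numbers differently
# changes the rounding, so the two sums are not bit-identical.
print((0.1 + 0.2) + 0.3)                        # 0.6000000000000001
print(0.1 + (0.2 + 0.3))                        # 0.6
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))   # False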
Hey @kurtamohler, I want to make sure I fully understand you. Can the same thing happen in linear layers as well (instead of using `@`)?
Yes. With `linear(a)` and `linear(a[0, :])`, you are giving two different inputs, so `torch.use_deterministic_algorithms` does not guarantee that they give the same results.
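To make this concrete, here is a hedged sketch comparing the two call patterns (the shapes and seed are my own choices, not from the thread):

import torch

torch.manual_seed(0)
linear = torch.nn.Linear(512, 512)
a = torch.rand(64, 512)

row_of_batch = linear(a)[0]  # batched kernel: one summation order
single_row = linear(a[0])    # vector kernel: possibly another order
print(torch.equal(row_of_batch, single_row))    # may be False
print((row_of_batch - single_row).abs().max())  # any difference is tiny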
@kurtamohler does PyTorch have documentation on how `Linear` works? I mean, why does it not give the same result?
The documentation for `Linear` doesn't explain this--aside from saying that `Linear` performs a vector-matrix multiplication, which implicitly requires summation.
A general fact about floating point numbers is that when two of them are added together, the result has a small amount of error that depends on the difference between the two numbers. So if a set of floating point numbers is summed in two different orders, the errors can accumulate differently and give two slightly different summation results.
Any operator in PyTorch that sums elements of a tensor together may perform the summation in a different order depending on the size of the input. There are multiple possible reasons for this (like performance)--it depends on the implementation of the operator.
With `torch.use_deterministic_algorithms`, we can guarantee that a given operator will perform summations in the same order each time that it is given the same exact input. But we don't (and probably can't) enforce the same order of summation for two different inputs of different sizes.
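A hedged sketch of the order effect on a plain reduction (my own example, not from the thread): summing the same values in reverse order pairs different elements together and can land on a slightly different float32 result.

import torch

torch.manual_seed(0)
x = torch.rand(100_000)

s_forward = x.sum()          # reduce in one element order
s_reverse = x.flip(0).sum()  # same values, opposite order
print(s_forward.item(), s_reverse.item())
print(torch.equal(s_forward, s_reverse))  # may be False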
Happy to help!
Closing this, since different inputs are expected to give different results.