Comments (11)

witwolf commented on July 26, 2024

The random seed is not fixed for the environment, which is one possible reason the train test sometimes fails.
Do any cases other than TrainTest fail?

from alf.

witwolf commented on July 26, 2024

There are other possible reasons as well:

  1. TensorFlow uses multiple threads to run independent operations in parallel.
  2. The execution order of ops is not fixed when they have no dependencies on each other.

Perhaps we should set tf.config.threading.set_inter_op_parallelism_threads(1) for unit tests.
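
A minimal sketch of how that setting could be applied in a test module, assuming TensorFlow 2.x. The helper name `configure_deterministic_tf` and the default seed are hypothetical and not part of alf. The threading calls have to run before TensorFlow executes any op; otherwise TensorFlow raises a RuntimeError.

```python
import random


def configure_deterministic_tf(seed=0):
    """Hypothetical helper: reduce run-to-run variation in TF unit tests.

    Call once at process start, before any TensorFlow op has run.
    """
    # Imported lazily so this module can be loaded without TensorFlow installed.
    import numpy as np
    import tensorflow as tf

    # Run independent ops one at a time instead of on a thread pool.
    tf.config.threading.set_inter_op_parallelism_threads(1)
    # Also keep each individual op single-threaded.
    tf.config.threading.set_intra_op_parallelism_threads(1)

    # Fix the seeds of the random number generators a test may touch.
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)
```

These settings narrow the sources of randomness; they do not by themselves guarantee identical results across runs.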

hnyu commented on July 26, 2024

> The random seed is not fixed for the environment, which is one possible reason the train test sometimes fails.
> Do any cases other than TrainTest fail?

The SAC case also fails sometimes.

hnyu commented on July 26, 2024

> There are other possible reasons as well:
>
>   1. TensorFlow uses multiple threads to run independent operations in parallel.
>   2. The execution order of ops is not fixed when they have no dependencies on each other.
>
> Perhaps we should set tf.config.threading.set_inter_op_parallelism_threads(1) for unit tests.

For unit tests, I think we can avoid the stochasticity introduced by parallelism by setting num_envs=1 and not using asynchronous off-policy training.

witwolf commented on July 26, 2024

It's hard to make training produce deterministic results. I ran some experiments with fixed seeds for TensorFlow and the environments and with inter_op_parallelism_threads set to 1, and the runs can still produce different results.

Personally, I think it's OK if a unit test sometimes fails.
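
One way to run this kind of experiment is a small harness that repeats a run with the same seed and compares the results. The sketch below is illustrative only: `run_training` is a hypothetical stand-in that uses Python's `random` module rather than alf or TensorFlow, so it is reproducible by construction. Swapping in a real training function would show whether that function is.

```python
import random


def run_training(seed, steps=100):
    """Hypothetical stand-in for a training run; returns a final 'loss'."""
    rng = random.Random(seed)  # private generator, no shared global state
    loss = 1.0
    for _ in range(steps):
        # Shrink the loss by a seed-dependent random factor each step.
        loss *= 0.99 + 0.01 * rng.random()
    return loss


def is_reproducible(fn, seed, repeats=3):
    """Return True if fn(seed) gives the same result on every repeat."""
    results = [fn(seed) for _ in range(repeats)]
    return all(r == results[0] for r in results)


print(is_reproducible(run_training, seed=0))
```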

hnyu commented on July 26, 2024

> It's hard to make training produce deterministic results. I ran some experiments with fixed seeds for TensorFlow and the environments and with inter_op_parallelism_threads set to 1, and the runs can still produce different results.
>
> Personally, I think it's OK if a unit test sometimes fails.

OK. I thought we changed unittest to tf.unittest because of the determinism it provides. In that case, maybe the test thresholds should be less strict next time.
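
A sketch of what a less strict test could look like, assuming the test compares a final reward against a threshold. `train_and_evaluate`, the class name, the number of seeds, and the 0.75 threshold are all hypothetical placeholders, not alf code.

```python
import random
import statistics
import unittest


def train_and_evaluate(seed):
    """Hypothetical stand-in for a short training run; returns a final reward."""
    rng = random.Random(seed)
    # Noisy result centered near 0.9 to imitate run-to-run variation.
    return 0.9 + rng.uniform(-0.1, 0.1)


class RelaxedThresholdTrainTest(unittest.TestCase):
    def test_reward_reaches_threshold(self):
        rewards = [train_and_evaluate(seed) for seed in range(5)]
        # Judge the median over several seeds against a relaxed threshold,
        # so a single unlucky run does not fail the whole test.
        self.assertGreater(statistics.median(rewards), 0.75)
```

Checking a median over a few seeds trades some test time for fewer spurious failures; a single run with a wider margin is the cheaper alternative.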

hnyu commented on July 26, 2024

So if everything has a fixed random seed, the only remaining stochasticity comes from CPU scheduling of parallel work, right? What about using eager mode for unit tests?

witwolf commented on July 26, 2024

> So if everything has a fixed random seed, the only remaining stochasticity comes from CPU scheduling of parallel work, right? What about using eager mode for unit tests?

Yes, the only stochasticity comes from CPU scheduling of parallel work, which affects the generation of random numbers. I tried eager mode for the train test with only one thread, but the results were still not deterministic (I still don't know why).
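
To illustrate how execution order alone can change random draws even with a fixed seed, here is a toy example in plain Python (not TensorFlow). Two independent "ops" share one seeded generator; `run_ops` and the op names are made up for illustration.

```python
import random


def run_ops(order, seed=0):
    """Run independent 'ops' that each draw from one shared seeded generator."""
    rng = random.Random(seed)
    samples = {}
    for name in order:
        # Each op takes the next value from the shared stream.
        samples[name] = rng.random()
    return samples


first = run_ops(["a", "b"])
second = run_ops(["b", "a"])

# Same seed and same stream of numbers, but op "a" receives a different draw
# depending on whether it ran first or second.
print(first["a"] == second["a"])  # False
print(sorted(first.values()) == sorted(second.values()))  # True
```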

hnyu commented on July 26, 2024

> > So if everything has a fixed random seed, the only remaining stochasticity comes from CPU scheduling of parallel work, right? What about using eager mode for unit tests?
>
> Yes, the only stochasticity comes from CPU scheduling of parallel work, which affects the generation of random numbers. I tried eager mode for the train test with only one thread, but the results were still not deterministic (I still don't know why).

Hmm, interesting. @emailweixu, do you have any insight into this?

emailweixu commented on July 26, 2024

Using eager mode will make the tests take much longer.
Perhaps the game itself has some internal randomness?

hnyu commented on July 26, 2024

> Using eager mode will make the tests take much longer.
> Perhaps the game itself has some internal randomness?

I think @witwolf tried setting the seeds of environments deterministically. Even so, the results are nondeterministic.
