
Comments (7)

jheek commented on August 15, 2024

reshuffle_each_iteration is False by default? I guess this is a great example of why implicit random seeds are evil.

In the past I have also noticed that creating iterators in tf.data tends to come with significant overhead, so I prefer to create one infinite iterator and loop over each epoch manually with for step in range(steps_per_epoch). This also avoids some other issues, such as partial batches.
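As a rough illustration of that pattern, here is a minimal pure-Python sketch. It is a stand-in for a tf.data pipeline, not the code from the nlp_seq example: infinite_batches, the toy examples list, and the seed value are all hypothetical names chosen for this sketch. The iterator is created once, the seed is passed explicitly, and batches are filled across pass boundaries so every batch is full.

```python
import random


def infinite_batches(examples, batch_size, seed):
    """Yield full batches forever, reshuffling the data on every pass.

    Hypothetical helper: a pure-Python stand-in for an infinite,
    shuffled, batched input pipeline.
    """
    rng = random.Random(seed)  # explicit seed, no hidden global state
    buffer = []
    while True:
        order = list(examples)
        rng.shuffle(order)  # new order on each pass over the data
        for example in order:
            buffer.append(example)
            if len(buffer) == batch_size:
                yield buffer
                buffer = []  # leftovers carry over, so no partial batch


examples = list(range(10))  # toy data, assumed for illustration
batch_size = 4
steps_per_epoch = len(examples) // batch_size

# Create the iterator once, outside the epoch loop.
batches = infinite_batches(examples, batch_size, seed=0)

epoch_batch_sizes = []
for epoch in range(3):
    for step in range(steps_per_epoch):
        batch = next(batches)
        epoch_batch_sizes.append(len(batch))
```

Because the iterator never ends, an "epoch" here is just a fixed number of steps, so it only approximates one full pass over the data.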

from flax.

AlexeyG commented on August 15, 2024

Just checked, it actually does default to True:

reshuffle_each_iteration: (Optional.) A boolean, which if true indicates that the dataset should be pseudorandomly reshuffled each time it is iterated over. (Defaults to True.)

Perhaps the issue is that the nlp_seq example does not shuffle the data?
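To make the quoted semantics concrete, here is a small pure-Python sketch of what "reshuffled each time it is iterated over" means. It does not use tf.data: ShuffledDataset is a hypothetical class written for this illustration, and deriving a per-iteration seed by adding a counter is an assumption of the sketch, not a description of how TensorFlow implements it.

```python
import random


class ShuffledDataset:
    """Hypothetical stand-in for a shuffled dataset (not the tf.data API)."""

    def __init__(self, items, seed, reshuffle_each_iteration=True):
        self._items = list(items)
        self._seed = seed
        self._reshuffle = reshuffle_each_iteration
        self._iteration = 0

    def __iter__(self):
        # With reshuffling, each new iterator gets a different seed;
        # without it, every iterator replays the same order.
        if self._reshuffle:
            seed = self._seed + self._iteration
        else:
            seed = self._seed
        self._iteration += 1
        order = list(self._items)
        random.Random(seed).shuffle(order)
        return iter(order)


reshuffled = ShuffledDataset(range(50), seed=0, reshuffle_each_iteration=True)
fixed = ShuffledDataset(range(50), seed=0, reshuffle_each_iteration=False)

reshuffled_epochs = [list(reshuffled), list(reshuffled)]
fixed_epochs = [list(fixed), list(fixed)]
```

In this sketch the two passes over fixed see the same order, while the two passes over reshuffled see different orders. A dataset with no shuffle step at all would yield the original order on every pass, whatever the flag's default is.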


AlexeyG commented on August 15, 2024

Marking as PR welcome. A successful PR would also include (a link to) new TensorBoard logs verifying the metrics (see README.md in examples/nlp_seq for expected numbers).


gan3sh500 commented on August 15, 2024

It looks like dataset.shuffle was just missing. I've made the change. I'll train on a V100, as in the README, and submit a PR with updated logs and code. I've also checked the other example folders, and none of them are missing the shuffle on the training dataset. Please correct me if I'm wrong.


avital commented on August 15, 2024

Awesome, @gan3sh500! Indeed, if you can reproduce the same results on a V100 and share a link to tensorboard.dev, we'll merge your PR.


gan3sh500 commented on August 15, 2024

Sorry for the slow turnaround. I didn't get the V100, but I now have a 2080Ti locally and have run with the same batch_size and number of iterations. The results are 0.23% lower than the ones shown in the repo. Please see for yourself. TB Dev #321


marcvanzee commented on August 15, 2024

It looks like @bohnetbd felt this wasn't worth adding because of the extra complexity (given that this is an educational example), so I am closing this.
