
Comments (5)

mattsinc commented on July 30, 2024

The uninitialized reads in the valgrind trace come from the w index of the array not being initialized in rnn_bench_rocm.cpp in DeepBench. I'm not sure what the expected value is here -- should w (dim[3]) just be set to 1? For what it's worth, if I set w to 1 for all of the above lines in rnn_bench_rocm.cpp, all of the uninitialized read errors go away, but the assert failure still happens. So I suspect the assert is a second, separate problem in the code. I can push this change to DeepBench pending the answer about what value should be used.
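To make the w question concrete, here is a minimal sketch of the pattern, not the actual DeepBench code. The names `Dims`, `make_dims`, and `element_count` are hypothetical, and defaulting w to 1 is only the assumption raised above, not a confirmed requirement. It shows a four-entry shape whose last entry is always written before anything reads it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 4-D shape {n, c, h, w}; a stand-in, not DeepBench's actual type.
using Dims = std::array<int, 4>;

// Fills all four entries so that no later read touches indeterminate memory.
// Defaulting w to 1 is the assumption raised above, not a confirmed requirement.
Dims make_dims(int n, int c, int h, int w = 1) {
    Dims dims{};  // value-initialization zeroes every element first
    dims[0] = n;
    dims[1] = c;
    dims[2] = h;
    dims[3] = w;
    return dims;
}

// Multiplies all four extents; reading dims[3] is only safe once it has been set.
std::size_t element_count(const Dims& dims) {
    std::size_t count = 1;
    for (int extent : dims) {
        assert(extent > 0 && "every dimension must be initialized and positive");
        count *= static_cast<std::size_t>(extent);
    }
    return count;
}
```

If a shape were instead declared as a plain `int dims[4];` with only the first three entries assigned, any code that reads `dims[3]` would be the kind of access valgrind reports as an uninitialized read.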

from miopen.

dagamayank commented on July 30, 2024

@mattsinc are you using this repo - https://github.com/ROCmSoftwarePlatform/DeepBench?


mattsinc commented on July 30, 2024

@dagamayank, no, I had been using the internal DeepBench-v1 repo (I was not aware of this one). It looks like there were some changes between the repos that may address at least the valgrind issue I highlighted above. The fixes @daniellowell mentioned in Slack for miopen_helper.h appear to be in there as well.

I just cloned this repo and ran it. The failure no longer occurs at the same place with this version, although I'm going to run a few more tests overnight to be sure. A perhaps dumb question: why did you all change only https://github.com/ROCmSoftwarePlatform/DeepBench/blob/master/code/amd/rnn_bench_rocm.cpp#L48-L51, and not https://github.com/ROCmSoftwarePlatform/DeepBench/blob/master/code/amd/rnn_bench_rocm.cpp#L52-L59? It seems like the same change would be needed for all of them, but I don't see the assert trigger for lines 52-59, so perhaps there's something subtle I'm not understanding.

Thanks,
Matt


daniellowell commented on July 30, 2024

From our documentation:

hxDesc: A hidden tensor descriptor that has as its first dimension the number of layers if the direction mode is unidirectional and twice the number of layers if the direction mode is bidirectional. The second dimension of the descriptor must equal the largest first dimension of the xDesc tensor descriptor array. The third dimension equals the hiddenSize. (input)

Because DeepBench uses a single layer, the first element of the tuple passed as the first argument on lines 52-59 is 1.

xDesc: An array of tensor descriptors. These are the input descriptors to each time step. The first dimension of each descriptor is the batch size and may decrease from element n to element n+1 and not increase in size. The second dimension is the same for all descriptors in the array and is the input vector length. (input)

Lines 48-51 are arrays of data tensor descriptors. The first tuple argument is (batch_size, hidden_size). In DeepBench hidden_size = input_vector_size, so we use this pair. The array size is the number of timesteps.
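The shape rules quoted above can be sketched as follows. This is a shape-only illustration under stated assumptions: `RnnShapes`, `make_rnn_shapes`, and `batches_non_increasing` are hypothetical names, the code makes no MIOpen calls, and it is not how rnn_bench_rocm.cpp builds its descriptors. It only encodes the documented relationships: one (batch, input length) pair per time step for xDesc, and (layers times directions, largest batch, hidden size) for hxDesc.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical shape-only container; it makes no MIOpen calls.
struct RnnShapes {
    std::vector<std::array<int, 2>> x;  // per time step: {batch, input vector length}
    std::array<int, 3> hx;              // {layers * directions, largest batch, hidden size}
};

RnnShapes make_rnn_shapes(int time_steps, int batch_size, int input_size,
                          int hidden_size, int num_layers, bool bidirectional) {
    assert(time_steps > 0);
    RnnShapes shapes;
    const std::array<int, 2> step{batch_size, input_size};
    shapes.x.assign(static_cast<std::size_t>(time_steps), step);

    // Second hx dimension: the largest first dimension found in the x array.
    int max_batch = 0;
    for (const auto& d : shapes.x) max_batch = std::max(max_batch, d[0]);

    shapes.hx = {num_layers * (bidirectional ? 2 : 1), max_batch, hidden_size};
    return shapes;
}

// True when the batch size never grows from one time step to the next.
bool batches_non_increasing(const std::vector<std::array<int, 2>>& x) {
    for (std::size_t i = 1; i < x.size(); ++i)
        if (x[i][0] > x[i - 1][0]) return false;
    return true;
}
```

With a single unidirectional layer, the first entry of `hx` is 1, which matches the explanation for lines 52-59. Passing `true` for `bidirectional` doubles it.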

Let's just use https://github.com/ROCmSoftwarePlatform/DeepBench for further testing. We'll deprecate the other repo. I'm closing this issue, but if you are still having problems on this topic I can reopen it.

Thanks for testing our DeepBench. Let me know if you have any other issues.


mattsinc commented on July 30, 2024

@daniellowell, @dagamayank, thanks for the help and info.

