Comments (4)

dkobak avatar dkobak commented on August 26, 2024

To be more precise:

only one core is active during the gradient descent.

We don't really know that. We are looking at htop, and it shows almost all cores at about 15% utilization. It almost looks like they are not being used at all, but who knows, maybe they are?

For comparison, during the Annoy loop, all cores are running at 100%.

from fit-sne.

dkobak avatar dkobak commented on August 26, 2024

Another update: if we set perplexity a lot higher (e.g. perplexity=500), then all cores are working at 30-40%, so it becomes clear that the process is actually multithreaded...

PS. In all these cases the sample size is moderately small (n=11k).

linqiaozhi avatar linqiaozhi commented on August 26, 2024

@dkobak @msayhan With n=11k, the bottleneck in the gradient descent is the FFT, which is not multithreaded (multithreading the FFT does not give much speed-up for the typical number of interpolation points and boxes we use, and it requires users to compile FFTW with flags, which complicates the install). So it's going to be hard to see the speed-up on the non-FFT parts, or even to catch the split second when all cores are working before the threads finish and we are back in the FFT again.
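To put rough numbers on why a serial bottleneck hides the multithreading, here is a minimal sketch of Amdahl's law in Python. The function name and the fractions in the demo are illustrative assumptions, not measurements from fast_tsne; the only point is that when most of each iteration is serial (as with the single-threaded FFT at small n), adding threads barely changes the overall speed-up.

```python
def amdahl_speedup(parallel_fraction, n_threads):
    """Ideal overall speed-up when only `parallel_fraction` of the
    single-threaded runtime can be spread over `n_threads` threads."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_threads)


if __name__ == "__main__":
    # Hypothetical fractions, chosen only to illustrate the trend.
    for fraction in (0.1, 0.5, 0.9):
        print(fraction, round(amdahl_speedup(fraction, 8), 2))
```

Under this idealized model, the parallel fraction grows as the attractive forces start to dominate (larger N or higher perplexity), and the speed-up moves toward the thread count.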

Increasing perplexity was a great way to check it though. But perhaps even simpler is to just try it with a large N. In that case, the bottleneck becomes the attractive forces, and that should parallelize nicely. You still might not catch all cores at 100% (because each iteration is so fast and they have to go to 0% between iterations), but you should definitely see some multicore action happening. I typically just use top, and if the CPU usage for the fast_tsne process exceeds 100% I know that it is multithreading.
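The same check can be scripted instead of watching top. The sketch below is an assumption-laden illustration, not part of FIt-SNE: it runs a command as a child process and divides the CPU time the child consumed by the elapsed wall-clock time. A ratio clearly above 1 corresponds to top showing more than 100%, meaning more than one core was busy on average. The helper name `cpu_wall_ratio` is hypothetical, and the `resource` module it relies on is available on Unix-like systems only.

```python
import resource
import subprocess
import sys
import time


def cpu_wall_ratio(cmd):
    """Run `cmd` to completion and return (child CPU seconds) / (wall seconds).

    Values well above 1.0 suggest that the child used several cores at once.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User + system CPU time accumulated by the child we just waited for.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return cpu / wall


if __name__ == "__main__":
    # Stand-in workload: a single-threaded Python loop.
    # Replace this list with the actual command line you want to inspect.
    demo_cmd = [sys.executable, "-c", "sum(i * i for i in range(5_000_000))"]
    print(f"CPU/wall ratio: {cpu_wall_ratio(demo_cmd):.2f}")
```

Because this averages over the whole run, a short parallel phase inside a long serial one will still produce a ratio near 1, which is consistent with what was observed at n=11k.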

dkobak avatar dkobak commented on August 26, 2024

This makes sense. In fact, I think @msayhan was observing CPU usage below 100% in top; that's exactly what confused us and made us post this issue. But as soon as he increased N (or perplexity), the usage went above 100%.

Thanks for the explanations! I'll close this now.
