Comments (21)

charlesmartin14 commented on July 17, 2024

Yes, I am starting on this now. We just need some pre-trained models to test on. If you can provide one, that would speed things up.

from weightwatcher.

charlesmartin14 commented on July 17, 2024

It would be helpful to have some pretrained LSTMs to test this on. Can you provide some?

arvoelke commented on July 17, 2024

Is there any update on the status of support for RNNCell layers such as LSTM, GRU, etc? This would be quite the breakthrough as inferring the recurrent nonlinear dynamics of a network from the structure of its weight matrices is a very difficult open problem.

We just need some pre-trained models to test on.
It would be helpful to have some pretrained LSTMs to test this on. Can you provide this?

There are quite a few in the collection of Keras examples, e.g.:

They should be fairly fast to run and get a trained model out of. Do you need something much larger w.r.t. the model or the dataset?

charlesmartin14 commented on July 17, 2024

I'll take a look again.
We did try this once, but the internal GRU matrices did not have enough eigenvalues to say anything meaningful. The internal matrices in the cells were very rectangular, whereas this approach works better for matrices that have at least 50 eigenvalues.
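To make the eigenvalue-count limitation concrete, here is a minimal NumPy sketch. It assumes a Keras-style fused GRU kernel of shape (input_dim, 3 * units) and uses hypothetical layer sizes and random weights; none of this comes from the models discussed in the thread. The point is only that a rectangular matrix yields as many eigenvalues as its smaller dimension.

```python
import numpy as np

MIN_EIGENVALUES = 50  # threshold quoted in the comment above


def correlation_eigenvalues(weight: np.ndarray) -> np.ndarray:
    """Eigenvalues of W^T W, computed as the squared singular values of W."""
    singular_values = np.linalg.svd(weight, compute_uv=False)
    return singular_values ** 2


def split_gates(kernel: np.ndarray, n_gates: int) -> list:
    """Split a fused (input_dim, n_gates * units) kernel into per-gate blocks."""
    return np.split(kernel, n_gates, axis=1)


rng = np.random.default_rng(0)
input_dim, units = 32, 128  # hypothetical sizes, chosen for illustration
gru_kernel = rng.normal(size=(input_dim, 3 * units))  # random stand-in weights

# Each gate block is (32, 128), so it has only min(32, 128) = 32 eigenvalues.
gate_counts = [len(correlation_eigenvalues(g)) for g in split_gates(gru_kernel, 3)]
too_small = [count < MIN_EIGENVALUES for count in gate_counts]
print(gate_counts, too_small)
```

With a small input dimension, every gate block falls below the 50-eigenvalue mark no matter how many hidden units the layer has.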

charlesmartin14 commented on July 17, 2024

I have a new idea to apply to this. If there is demand, we can explore it.

dan-jacobson commented on July 17, 2024

I would love love love support for these layer types, and would be willing to contribute to this feature. What's the new idea, and could I be helpful at all in implementing it?

Alamwealthkid commented on July 17, 2024

charlesmartin14 commented on July 17, 2024

The LSTM approach requires different techniques from RMT.

To start this, I need some sample models with a wide range of accuracies.

Any idea where to get these?

dan-jacobson commented on July 17, 2024

Yeah absolutely -- I'll happily train a couple different LSTMs and checkpoint them along the way. I'll try to do two or three different tasks.

I'll do a classic Karpathy-esque char-rnn (using LSTMs) like this https://github.com/JY-Yoon/RNN-Implementation-using-NumPy/blob/master/RNN%20Implementation%20using%20NumPy.ipynb

and then maybe some sort of time series forecasting task. I'll implement them and point you to the repo on GitHub when I have them trained.

Sound good?

dan-jacobson commented on July 17, 2024

I've pushed a repo with a model I trained last night, with checkpoints along the way:

https://github.com/dan-jacobson/example_lstms

charlesmartin14 commented on July 17, 2024

dan-jacobson commented on July 17, 2024

No worries! Hope your arm heals quickly. I'll try those ideas in the next week or so.

charlesmartin14 commented on July 17, 2024

I ran an initial analysis on the LSTMs

The alphas actually increase with increasing epoch

Screen Shot 2022-10-23 at 10 44 13 AM

Screen Shot 2022-10-23 at 10 44 19 AM

Screen Shot 2022-10-23 at 10 44 23 AM
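For readers unfamiliar with the "alpha" being tracked here: it is the exponent of a power law fitted to the tail of a layer's eigenvalue spectrum. The sketch below is not the tool's actual fitting procedure. It uses the standard continuous maximum-likelihood estimator with a fixed, hand-picked `xmin` (an assumption), and it runs on synthetic samples rather than on the checkpoints from this thread.

```python
import numpy as np


def mle_alpha(values, xmin: float) -> float:
    """Maximum-likelihood exponent for p(x) ~ x^(-alpha), x >= xmin.

    Simplification: xmin is supplied by hand instead of being selected by a fit.
    """
    tail = np.asarray(values, dtype=float)
    tail = tail[tail >= xmin]
    return 1.0 + tail.size / np.sum(np.log(tail / xmin))


rng = np.random.default_rng(0)
true_alpha, xmin = 3.0, 1.0

# Inverse-transform sampling from a pure power law with exponent true_alpha.
uniform = rng.random(20000)
samples = xmin * (1.0 - uniform) ** (-1.0 / (true_alpha - 1.0))

estimate = mle_alpha(samples, xmin)
print(f"estimated alpha: {estimate:.2f}")
```

Applying such an estimator to the eigenvalues of each checkpoint's weight matrices, epoch by epoch, is one way to reproduce a trend plot like the ones above.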

charlesmartin14 commented on July 17, 2024

Including epoch 10, the correlation flow is very different

Screen Shot 2022-10-23 at 10 50 40 AM

charlesmartin14 commented on July 17, 2024

Can you upload the LSTM initial state (before epoch 0)?

Also, what stopping criteria did you use?

dan-jacobson commented on July 17, 2024

Ooh not sure I saved the initial random state. I can retrain and save the initial state this time.

Right now I use no stopping criteria. I just let it run for a while, then interrupted training once the samples (I'm printing one out each epoch) started to look reasonably coherent. Happy to introduce one if you'd like.
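One common stopping criterion that could be introduced is patience-based early stopping on a validation loss. The sketch below is purely illustrative: the `EarlyStopping` class, its parameters, and the loss values are hypothetical and are not taken from the example_lstms repo.

```python
class EarlyStopping:
    """Stop when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0  # epochs since the last improvement

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience


# Made-up validation losses, one per epoch.
val_losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.5]

stopper = EarlyStopping(patience=3)
stopped_epoch = None
for epoch, loss in enumerate(val_losses):
    if stopper.should_stop(loss):
        stopped_epoch = epoch
        break

print(stopped_epoch, stopper.best)
```

A fixed rule like this makes checkpoints from different runs comparable, which manual interruption does not.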

9527-ly commented on July 17, 2024

Hi @dan-jacobson. Have you studied the applicability of other metrics, like the spectral norm and the MP soft rank?

charlesmartin14 commented on July 17, 2024

9527-ly commented on July 17, 2024

Yes, the MP soft rank is probably the correct metric for these systems. The current implementation may not work well for LSTMs. I can try to add it when I get back from vacation in a couple of weeks. In the meantime, you can join the Discord channel and ping me there.

On Nov 9, 2022, at 5:40 PM, 9527-ly wrote: Hi @dan-jacobson. Have you studied the applicability of other metrics, like the spectral norm and the MP soft rank?

Thank you. I will try. Have a nice holiday
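For context on the two metrics mentioned, here is a rough NumPy sketch. It takes the spectral norm as the largest singular value, and the MP soft rank as the ratio of the Marchenko-Pastur bulk edge to the largest eigenvalue of the correlation matrix. Estimating the variance from the raw matrix entries is an assumption made here for brevity (a fitted value would normally be used), and the matrices are synthetic, not LSTM weights.

```python
import numpy as np


def spectral_norm(weight: np.ndarray) -> float:
    """Largest singular value of the weight matrix."""
    return float(np.linalg.norm(weight, 2))


def mp_soft_rank(weight: np.ndarray) -> float:
    """Ratio of the MP bulk edge to the largest eigenvalue of W^T W / N."""
    w = weight if weight.shape[0] >= weight.shape[1] else weight.T
    n, m = w.shape  # n >= m
    singular_values = np.linalg.svd(w, compute_uv=False)
    lambda_max = singular_values[0] ** 2 / n
    sigma2 = np.var(w)  # assumption: raw entry variance instead of a fitted one
    lambda_plus = sigma2 * (1.0 + np.sqrt(m / n)) ** 2
    return float(lambda_plus / lambda_max)


rng = np.random.default_rng(0)
n, m = 500, 100
noise = rng.normal(size=(n, m))  # pure random matrix

# Add a strong rank-one component to mimic a learned correlation.
u = rng.normal(size=n)
u /= np.linalg.norm(u)
v = rng.normal(size=m)
v /= np.linalg.norm(v)
spiked = noise + 100.0 * np.outer(u, v)

rank_noise = mp_soft_rank(noise)
rank_spiked = mp_soft_rank(spiked)
print(f"noise: {rank_noise:.2f}, spiked: {rank_spiked:.2f}")
```

A purely random matrix gives a soft rank near 1, while a matrix with a large eigenvalue outside the bulk gives a value well below 1.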

bkocis commented on July 17, 2024

Any news on the state of the implementation for LSTM layers?

charlesmartin14 commented on July 17, 2024
