Comments (21)
Yes, I am starting on this now. We just need some pre-trained models to test on. If you can provide one, that would speed things up.
from weightwatcher.
It would be helpful to have some pretrained LSTMs to test this on. Can you provide this?
Is there any update on the status of support for RNNCell layers such as LSTM, GRU, etc.? This would be quite the breakthrough, as inferring the recurrent nonlinear dynamics of a network from the structure of its weight matrices is a very difficult open problem.
We just need some pre-trained models to test on.
It would be helpful to have some pretrained LSTMs to test this on. Can you provide this?
There are quite a few in the collection of Keras examples, e.g.:
- https://keras.io/examples/nlp/bidirectional_lstm_imdb/
- https://keras.io/examples/vision/conv_lstm/
- https://keras.io/examples/generative/lstm_character_level_text_generation/
They should be fairly fast to run and get a trained model out of. Do you need something much larger w.r.t. the model or the dataset?
I'll take a look again.
We did try this once, but the internal GRU matrices did not have enough eigenvalues to say anything meaningful. The internal matrices in the cells were very rectangular, whereas this approach works better for matrices that have at least 50 eigenvalues.
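To make the eigenvalue-count point concrete, here is a minimal NumPy sketch. It assumes a Keras-style fused GRU kernel of shape (input_dim, 3 * units) and hypothetical layer sizes; the helper names and the threshold constant are illustrative and are not part of weightwatcher.

```python
import numpy as np

MIN_EVALS = 50  # rule of thumb quoted in the thread, not a library constant


def split_gates(kernel, num_gates):
    """Split a fused kernel (input_dim, num_gates * units) into per-gate matrices."""
    return np.split(kernel, num_gates, axis=1)


def eigenvalue_count(W):
    """W^T W has at most min(rows, cols) nonzero eigenvalues."""
    return len(np.linalg.svd(W, compute_uv=False))


rng = np.random.default_rng(0)
input_dim, units = 32, 256                        # hypothetical sizes
fused = rng.normal(size=(input_dim, 3 * units))   # stand-in for a GRU kernel

gates = split_gates(fused, num_gates=3)
counts = [eigenvalue_count(g) for g in gates]
too_small = [c < MIN_EVALS for c in counts]

print(counts)      # one count per gate matrix
print(too_small)   # True where a gate has fewer than MIN_EVALS eigenvalues
```

Each gate matrix here is 32 x 256, so it can contribute at most 32 eigenvalues no matter how wide the layer is, which is why very rectangular cell matrices fall below the threshold.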
I have a new idea to apply to this. If there is demand, we can explore it.
I would love love love support for these layer types, and would be willing to contribute to this feature. What's the new idea, and could I be helpful at all in implementing it?
The LSTM approach requires different techniques from RMT.
To start this, I need some sample models with a wide range of accuracies.
Any idea where to get these?
Yeah absolutely -- I'll happily train a couple different LSTMs and checkpoint them along the way. I'll try to do two or three different tasks.
I'll do a classic Karpathy-esque char-rnn (using LSTMs) like this https://github.com/JY-Yoon/RNN-Implementation-using-NumPy/blob/master/RNN%20Implementation%20using%20NumPy.ipynb
and then maybe some sort of time series forecasting task. I'll implement these and point you to the repo on GitHub when I have them trained.
Sound good?
I've pushed a repo with the models I trained last night, with checkpoints along the way:
https://github.com/dan-jacobson/example_lstms
No worries! Hope your arm heals quickly. I'll try those ideas in the next week or so.
I ran an initial analysis on the LSTMs.
The alphas actually increase with increasing epoch.
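For context on what "alpha" measures, below is a rough sketch of a power-law exponent estimate on the eigenvalue tail of a weight matrix. It uses the standard continuous maximum-likelihood (Hill-type) estimator with a hand-picked `xmin`. This is an assumption made for illustration and is not necessarily how weightwatcher fits alpha, and the function names are hypothetical.

```python
import numpy as np


def esd(W):
    """Eigenvalues of X = W^T W / N, with W oriented so that N >= M."""
    if W.shape[0] < W.shape[1]:
        W = W.T
    n_rows = W.shape[0]
    sv = np.linalg.svd(W, compute_uv=False)
    return sv ** 2 / n_rows


def hill_alpha(values, xmin):
    """Continuous power-law MLE: alpha = 1 + n / sum(log(x / xmin)) for x >= xmin."""
    tail = values[values >= xmin]
    return 1.0 + len(tail) / np.sum(np.log(tail / xmin))


# Sanity check on synthetic data with a known exponent (pdf ~ x^-3).
rng = np.random.default_rng(0)
xmin = 1.0
samples = xmin * (1.0 + rng.pareto(2.0, size=20000))
alpha_hat = hill_alpha(samples, xmin)
print(round(alpha_hat, 2))

# Applied to a weight matrix (random here, standing in for a checkpoint).
W = rng.normal(size=(512, 128))
evals = esd(W)
alpha_W = hill_alpha(evals, xmin=np.median(evals))  # xmin choice is arbitrary
```

Running `hill_alpha(esd(W), xmin)` on the same layer at each saved checkpoint would give a per-epoch alpha series comparable in spirit to the trend described above, although the values depend strongly on how `xmin` is chosen.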
Including epoch 10, the correlation flow is very different.
Can you upload the LSTM initial state (before epoch 0)?
Also, what stopping criteria did you use?
Ooh not sure I saved the initial random state. I can retrain and save the initial state this time.
Right now I use no stopping criteria -- I just let it run for a while, then interrupted training after the samples (I'm printing out one each epoch) started to look reasonably coherent. Happy to introduce one if you'd like.
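One simple way to address both requests (saving the state before epoch 0 and using an explicit stopping rule) is sketched below. It is framework-agnostic: the validation losses are passed in as a list, and the checkpoint labels stand in for whatever save call the training code uses. The class and function names are hypothetical.

```python
class EarlyStopper:
    """Stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def run(val_losses, patience=3):
    """Return the checkpoint labels saved and the epoch at which training stopped."""
    saved = ["init"]  # random initial state, saved before epoch 0
    stopper = EarlyStopper(patience=patience)
    for epoch, loss in enumerate(val_losses):
        saved.append(f"epoch_{epoch}")  # checkpoint after every epoch
        if stopper.should_stop(loss):
            return saved, epoch
    return saved, len(val_losses) - 1


saved, stopped_at = run([1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.5], patience=3)
print(saved[0], stopped_at)
```

Patience-based stopping is only one option; a fixed epoch budget would serve the analysis equally well as long as the rule is recorded alongside the checkpoints.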
Hi @dan-jacobson. Have you studied the applicability of other metrics, like the spectral norm and the MP soft rank?
Yes, the MP soft rank is probably the correct metric for these systems. The current implementation may not work well for LSTMs. I can try to add it when I get back from vacation in a couple of weeks. In the meantime, you can join the Discord channel and ping me there.
On Nov 9, 2022, at 5:40 PM, 9527-ly wrote: Hi @dan-jacobson. Have you studied the applicability of other metrics, like the spectral norm and the MP soft rank?
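For readers unfamiliar with the two metrics mentioned here, the following is a hedged NumPy sketch. It takes the spectral norm as the largest singular value of W, and the MP soft rank as the ratio of the Marchenko-Pastur bulk edge to the largest eigenvalue of X = W^T W / N. The bulk edge uses the elementwise variance of W as the noise variance, which is a simplifying assumption and not necessarily what weightwatcher does internally; the function name is hypothetical.

```python
import numpy as np


def spectral_metrics(W):
    """Spectral norm and an approximate MP soft rank for a 2-D weight matrix."""
    if W.shape[0] < W.shape[1]:
        W = W.T                      # orient so that N >= M
    n_rows, n_cols = W.shape
    q = n_rows / n_cols              # aspect ratio Q >= 1
    sv = np.linalg.svd(W, compute_uv=False)
    lambda_max = (sv ** 2 / n_rows).max()
    sigma2 = W.var()                 # ASSUMPTION: noise variance from raw entries
    lambda_plus = sigma2 * (1.0 + 1.0 / np.sqrt(q)) ** 2   # MP bulk edge
    return {
        "spectral_norm": sv.max(),
        "lambda_max": lambda_max,
        "mp_soft_rank": lambda_plus / lambda_max,
    }


rng = np.random.default_rng(0)
noise = rng.normal(size=(1000, 200))
spiked = noise + 0.5                 # constant shift adds one large eigenvalue

random_metrics = spectral_metrics(noise)
spiked_metrics = spectral_metrics(spiked)
print(random_metrics["mp_soft_rank"], spiked_metrics["mp_soft_rank"])
```

A purely random matrix gives a soft rank near 1, while a matrix with a strong outlier eigenvalue gives a value well below 1. Because the metric needs only the largest eigenvalue and the matrix shape, it does not require a long eigenvalue tail the way a power-law fit does.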
Thank you. I will try. Have a nice holiday.
Any news on the state of the implementation for LSTM layers?