I am trying to use the orthogonal constraint in a hypernetwork. So effectively I have a

Usage of geotorch in Hypernetwork (geotorch issue, closed)

Comments (9)

lezcano commented on August 23, 2024

I do not think I fully understand the problem here. Can you provide a minimal working example of this behaviour?

JonathanSchmidt1 commented on August 23, 2024

Hope this example explains what I am trying to do https://colab.research.google.com/drive/16hiNR8TO1M5uJvw_nTChGjWVtx6OsB3Y?usp=sharing

lezcano commented on August 23, 2024

Could you make the Colab notebook public?

JonathanSchmidt1 commented on August 23, 2024

Sorry, does it work for you now?

lezcano commented on August 23, 2024

It does now, thanks. I'll take a look at it in a bit.

JonathanSchmidt1 commented on August 23, 2024

I thought about the problem a bit more, and it might be best to just add the parameterization of the orthogonal matrix as part of the hypernetwork instead of doing it through geotorch (I added it to the notebook). If you want, you can close the issue. Thank you for your quick help.
PS: Is the Cayley or the matrix exponential representation more efficient?
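
For readers without access to the notebook, the idea of putting the orthogonal parameterization inside the hypernetwork could look roughly like the sketch below. This is an assumption about the approach, not the notebook's actual code: the class name OrthogonalHyperNet, the sizes z_dim and n, and the choice of the matrix exponential of a skew-symmetric matrix are all illustrative.

```python
import torch
import torch.nn as nn


class OrthogonalHyperNet(nn.Module):
    """Hypothetical hypernetwork mapping a conditioning vector z to an
    orthogonal n x n weight matrix (one matrix per batch element)."""

    def __init__(self, z_dim, n):
        super().__init__()
        self.n = n
        # Unconstrained output: n * n numbers per conditioning vector.
        self.net = nn.Linear(z_dim, n * n)

    def forward(self, z):
        A = self.net(z).view(-1, self.n, self.n)
        # The skew-symmetric part of A lies in the Lie algebra so(n).
        S = A - A.transpose(-1, -2)
        # The exponential of a skew-symmetric matrix is orthogonal.
        return torch.matrix_exp(S)


torch.manual_seed(0)
hyper = OrthogonalHyperNet(z_dim=4, n=3)
z = torch.randn(2, 4)   # batch of two conditioning vectors
W = hyper(z)            # shape (2, 3, 3), each slice orthogonal
x = torch.randn(2, 3)
y = torch.einsum("bij,bj->bi", W, x)  # apply each generated weight to its input
```

Because every generated W is orthogonal, y has the same norm as x for each batch element, which is a quick sanity check for this kind of construction.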

lezcano commented on August 23, 2024

Hi, sorry, I have not had time to look into this today.
I just had a look, and it turns out that every time you apply the constraint, the constraint is initialised; that is where the randomness comes from. If you put an orthogonal constraint on a linear layer nn.Linear(3, 3), the default weight of this layer is not orthogonal, and it is not clear which orthogonal matrix it should be initialised to. As such, when you call geotorch.orthogonal(layer, "weight"), it initialises layer.weight to a randomly sampled orthogonal matrix. See:
https://geotorch.readthedocs.io/en/latest/orthogonal/stiefel.html#geotorch.Stiefel.sample
The sampling happens here (see line 22):
https://github.com/Lezcano/geotorch/blob/55a91ca973ddeeb12d193d464af4b23da93d5ab4/geotorch/constraints.py#L18-L23
I hope this helps you understand the inner workings of GeoTorch a bit better! I think I should make this a bit more explicit somewhere.

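If you do not like this behaviour, you can always initialise your layer to whichever orthogonal matrix you like by doing:

import torch.nn as nn
import geotorch

layer = nn.Linear(3, 3)
geotorch.orthogonal(layer, "weight")  # initialises layer.weight to a random orthogonal matrix
X = my_sampling_method(3, 3)  # placeholder: any function returning a 3 x 3 orthogonal matrix
layer.weight = X  # overwrite the random initialisation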
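
The name my_sampling_method above is a placeholder. One possible way to fill it in, sketched here with plain PyTorch and not taken from geotorch, is to take the QR decomposition of a Gaussian matrix and fix the column signs. The function body and the sign correction are assumptions made for illustration.

```python
import torch


def my_sampling_method(n, k):
    """Hypothetical sampler: returns an n x k matrix with orthonormal
    columns (an orthogonal matrix when n == k). Requires n >= k."""
    A = torch.randn(n, k)
    # Reduced QR: Q is n x k with orthonormal columns, R is k x k upper triangular.
    Q, R = torch.linalg.qr(A)
    # Flip column signs so that diag(R) is positive; without this the
    # result depends on the sign convention of the QR routine.
    d = torch.sign(torch.diagonal(R))
    d[d == 0] = 1.0
    return Q * d


torch.manual_seed(0)
X = my_sampling_method(3, 3)
residual = (X.T @ X - torch.eye(3)).abs().max()  # should be close to zero
```

A matrix produced this way can then be assigned with layer.weight = X as in the snippet above. To get the same initial weight on every run, seed the generator (for example with torch.manual_seed) before sampling.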

As for Cayley vs. exponential: the exponential map, as it is implemented in PyTorch, is more efficient than the Cayley map. That being said, I'd suggest you time them for the sizes that you are going to use with timeit in a Jupyter notebook, just to be on the safe side.
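
A minimal timing sketch along these lines, using the standard timeit module and plain PyTorch, is shown below. The two functions are hand-written stand-ins for the two maps, not geotorch's implementations, and the size n = 64 and the repeat count are arbitrary assumptions; substitute the sizes you actually use.

```python
import timeit

import torch


def skew(A):
    # Map an arbitrary square matrix to a skew-symmetric one.
    return A - A.T


def expm_orthogonal(A):
    # Orthogonal matrix via the matrix exponential of a skew-symmetric matrix.
    return torch.matrix_exp(skew(A))


def cayley_orthogonal(A):
    # Orthogonal matrix via the Cayley map (I + S)^{-1} (I - S).
    S = skew(A)
    I = torch.eye(S.shape[0], dtype=S.dtype)
    return torch.linalg.solve(I + S, I - S)


torch.manual_seed(0)
n = 64  # assumed size, replace with your own
A = 0.1 * torch.randn(n, n, dtype=torch.float64)

t_exp = timeit.timeit(lambda: expm_orthogonal(A), number=50)
t_cayley = timeit.timeit(lambda: cayley_orthogonal(A), number=50)
print(f"matrix_exp: {t_exp:.4f} s, Cayley: {t_cayley:.4f} s (50 runs each)")
```

This only times the forward pass on CPU. If you train with these maps, the backward pass matters as well, and on a GPU you would need to synchronise before reading the clock.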

lezcano commented on August 23, 2024

As a separate minor point, I would recommend using geotorch's methods over implementing them yourself, as they have a number of small optimisations and little quirks that you can benefit from at zero cost. It also has the advantage of being more explicit about what you intend to do, but that's another story :)

JonathanSchmidt1 commented on August 23, 2024

Thank you for your help. I thought it was a problem like this with geotorch, but I did not see how to fix it. Now I should be able to bring it back into the code.
