
Comments (9)

lezcano avatar lezcano commented on August 23, 2024

I do not think I fully understand the problem here. Can you provide a minimal working example of this behaviour?

from geotorch.

JonathanSchmidt1 avatar JonathanSchmidt1 commented on August 23, 2024

I hope this example explains what I am trying to do: https://colab.research.google.com/drive/16hiNR8TO1M5uJvw_nTChGjWVtx6OsB3Y?usp=sharing


lezcano avatar lezcano commented on August 23, 2024

Could you make the colab public?


JonathanSchmidt1 avatar JonathanSchmidt1 commented on August 23, 2024

Sorry, does it work for you now?


lezcano avatar lezcano commented on August 23, 2024

It does now, thanks. I'll take a look at it in a bit.


JonathanSchmidt1 avatar JonathanSchmidt1 commented on August 23, 2024

I thought about the problem a bit more, and it might be best to just add the parameterization of the orthogonal matrix as part of the hyperparameter instead of doing it as part of geotorch (I added it to the notebook). If you want, you can close the issue. Thank you for your quick help.
PS: Is the Cayley or the matrix exponential representation more efficient?


lezcano avatar lezcano commented on August 23, 2024

Hi, sorry, I have not had time to look into this today.
I just had a look, and it turns out that every time you put the constraint on the layer, the constraint is initialised, hence the randomness. If you put an orthogonal constraint on a linear layer nn.Linear(3, 3), the default weight of this layer is not orthogonal, and it is not clear which orthogonal matrix it should be initialised to. As such, when you call geotorch.orthogonal(layer, "weight"), it initialises layer.weight to a randomly sampled orthogonal matrix. See:
https://geotorch.readthedocs.io/en/latest/orthogonal/stiefel.html#geotorch.Stiefel.sample
The sampling happens here (see line 22):
https://github.com/Lezcano/geotorch/blob/55a91ca973ddeeb12d193d464af4b23da93d5ab4/geotorch/constraints.py#L18-L23
I hope this helps you understand the inner workings of GeoTorch a bit better! I think I should make this a bit more explicit somewhere.

If you do not like this behaviour, you can always initialise your layer to whichever orthogonal matrix you like by doing

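import geotorch
import torch.nn as nn

layer = nn.Linear(3, 3)
geotorch.orthogonal(layer, "weight")  # initialises layer.weight to a random orthogonal matrix
X = my_sampling_method(3, 3)          # placeholder: any function returning a 3x3 orthogonal matrix
layer.weight = X                      # replace the random initialisation with your own matrix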
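
For illustration, here is one possible way to write the my_sampling_method placeholder used above. This is only a sketch and not part of GeoTorch: the function body, the seed argument, and the QR-based approach are all assumptions made for this example. It draws a Gaussian matrix and orthonormalises it with a QR decomposition, and passing a seed makes the result reproducible.

```python
import torch


def my_sampling_method(n, k, seed=None):
    """Hypothetical helper: return an n x k matrix with orthonormal columns (n >= k)."""
    gen = None
    if seed is not None:
        # A dedicated generator keeps the sample reproducible
        # without touching the global RNG state.
        gen = torch.Generator().manual_seed(seed)
    A = torch.randn(n, k, generator=gen)
    # Reduced QR: Q is n x k with orthonormal columns, R is k x k upper triangular.
    Q, R = torch.linalg.qr(A)
    # Flip column signs so that the diagonal of R is positive; this removes
    # the sign ambiguity of the QR decomposition.
    signs = torch.sign(torch.diagonal(R))
    signs[signs == 0] = 1.0
    return Q * signs  # broadcasting scales column j by signs[j]


X = my_sampling_method(3, 3, seed=0)
# Check orthogonality: X^T X should be close to the identity.
print(torch.allclose(X.T @ X, torch.eye(3), atol=1e-5))
```

Calling the helper twice with the same seed returns the same matrix, so the layer can be initialised to the same orthogonal matrix on every run.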

Regarding Cayley vs. exponential: the exponential map, as it is implemented in PyTorch, is more efficient than the Cayley map. That being said, I'd suggest you time them for the sizes that you are going to use with timeit in a Jupyter notebook, just to be on the safe side.
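
A rough way to run such a timing outside of GeoTorch is sketched below. Note the assumptions: the helper names skew, cayley and expm are made up for this example, the Cayley map is written as (I - A)^{-1}(I + A) for a skew-symmetric A, and only the forward pass on CPU is timed. GeoTorch's own implementations may differ, so treat the numbers as indicative only.

```python
import timeit

import torch


def skew(n, seed=0, dtype=torch.float32):
    """Return a random n x n skew-symmetric matrix (A^T = -A)."""
    gen = torch.Generator().manual_seed(seed)
    M = torch.randn(n, n, generator=gen, dtype=dtype)
    return M - M.T


def cayley(A):
    """Cayley map of a skew-symmetric A: Q = (I - A)^{-1} (I + A)."""
    I = torch.eye(A.shape[0], dtype=A.dtype)
    # Solve (I - A) Q = (I + A) rather than forming the inverse explicitly.
    return torch.linalg.solve(I - A, I + A)


def expm(A):
    """Matrix exponential of a skew-symmetric A, which is orthogonal."""
    return torch.matrix_exp(A)


n = 64          # replace with the matrix size you actually use
repeats = 100
A = skew(n)
for name, fn in [("cayley", cayley), ("exp", expm)]:
    total = timeit.timeit(lambda: fn(A), number=repeats)
    print(f"{name}: {total / repeats * 1e3:.3f} ms per call (n={n})")
```

In a Jupyter notebook, the loop can be replaced by %timeit cayley(A) and %timeit expm(A). If the map is used during training, the backward pass should be timed as well.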


lezcano avatar lezcano commented on August 23, 2024

As a separate minor point, I would recommend that you use geotorch's methods rather than implementing them yourself, as they have a number of small optimisations and little quirks that you can benefit from at zero cost. It also has the advantage of being more explicit about what you intend to do, but that's another story :)


JonathanSchmidt1 avatar JonathanSchmidt1 commented on August 23, 2024

Thank you for your help. I thought it was a problem like this with geotorch, but I did not see how to fix it. Now I should be able to bring it back into the code.

