hyperim's People

Contributors

bcol23

hyperim's Issues

Do the last FC layers get out of the Hyperbolic space?

The network architecture begins with a few hyperbolic layers but ends with Euclidean layers, as seen in:

    self.dense_1 = nn.Linear(feature_num, int(feature_num/2))

The full constructor is:

    def __init__(self, feature_num, word_embed, label_embed, hidden_size=5, if_gru=True, 
                 default_dtype=th.float64, **kwargs):
        super().__init__(**kwargs)
        
        self.hidden_size = hidden_size
        
        self.word_embed = gt.ManifoldParameter(word_embed, manifold=gt.PoincareBall())
        self.label_embed = gt.ManifoldParameter(label_embed, manifold=gt.PoincareBall())
        self.default_dtype = default_dtype
        
        if(if_gru):
            self.rnn = hyperGRU(input_size=word_embed.shape[1], hidden_size=self.hidden_size, 
                                default_dtype=self.default_dtype)
        else:
            self.rnn = hyperRNN(input_size=word_embed.shape[1], hidden_size=self.hidden_size, 
                                default_dtype=self.default_dtype)
        
        self.dense_1 = nn.Linear(feature_num, int(feature_num/2))
        self.dense_2 = nn.Linear(int(feature_num/2), 1)

How can we optimize with Riemannian SGD if not all of the parameters lie in hyperbolic space?
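
One way to look at this question: Riemannian SGD can update each parameter according to the manifold it lives on, so hyperbolic embeddings and Euclidean dense-layer weights can in principle be trained together. The sketch below is a dependency-free illustration only, not the repository's code and not geoopt's implementation. All function names are hypothetical, curvature -1 is assumed, and the hyperbolic step uses the simple "rescale the gradient, step, project back into the ball" update known from the Poincaré embeddings literature.

```python
import math

def poincare_rsgd_step(x, egrad, lr, eps=1e-5):
    """One Riemannian SGD step on the Poincare ball (assumed curvature -1).

    x     -- current point, a list of floats with norm < 1
    egrad -- Euclidean gradient of the loss at x
    """
    sq_norm = sum(v * v for v in x)
    # The Riemannian gradient is the Euclidean gradient scaled by the
    # inverse of the Poincare metric: (1 - ||x||^2)^2 / 4.
    scale = (1.0 - sq_norm) ** 2 / 4.0
    new_x = [v - lr * scale * g for v, g in zip(x, egrad)]
    # Project back inside the open unit ball if the step overshot.
    norm = math.sqrt(sum(v * v for v in new_x))
    max_norm = 1.0 - eps
    if norm > max_norm:
        new_x = [v * max_norm / norm for v in new_x]
    return new_x

def euclidean_sgd_step(w, egrad, lr):
    """Plain SGD step: in flat space the Riemannian gradient is the Euclidean one."""
    return [v - lr * g for v, g in zip(w, egrad)]

def mixed_step(params, grads, lr):
    """Update a dict of name -> (kind, values), dispatching on the manifold kind."""
    updated = {}
    for name, (kind, values) in params.items():
        if kind == "poincare":
            updated[name] = (kind, poincare_rsgd_step(values, grads[name], lr))
        else:
            updated[name] = (kind, euclidean_sgd_step(values, grads[name], lr))
    return updated

# Hypothetical parameters: one embedding on the ball, one dense-layer weight.
params = {
    "label_embed": ("poincare", [0.6, 0.0]),
    "dense_1": ("euclidean", [0.6, 0.0]),
}
grads = {"label_embed": [1.0, 0.0], "dense_1": [1.0, 0.0]}
stepped = mixed_step(params, grads, lr=0.1)
```

With the same starting value and gradient, the hyperbolic parameter moves less than the Euclidean one because the metric factor shrinks the step. Whether the dense layers are applied to features that are still points in hyperbolic space is a separate question that this sketch does not answer.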

Article's score

Hi @bcol23,
Thank you very much for the inspiring work and for publishing your code. I found it very interesting.
I played around with the repo and was a little discouraged: I cannot reproduce the scores you published in the article. Could you please provide a script to reproduce the published results for the RCV1 dataset?
Also, I would like to suggest a few improvements to the project's structure, if you don't mind.

Geoopt update

FYI, the code should be updated to stay in sync with
geoopt/geoopt#77
Write here if you have any questions.

We are in an active development stage, so sorry for the trouble.

Reproduce RCV1 score

Hey @bcol23 ,

Thanks for the paper and the code.
I followed these steps to try to reproduce your numbers:

  1. Pre-train the 300d word embeddings from poincare_glove with the following command:
    ./run_glove.sh --train --coocc_file ../poincare_glove2/GloVe/cooccurrence.bin --vocab_file ../poincare_glove2/GloVe/vocab.txt --epochs 50 --workers 100 --restrict_vocab 200000 --lr 0.01 --bias --dist_func cosh-dist-sq --root .. --chunksize 1000 --poincare 1 --no_eval --mix --num_embs 50 --size 300
    I couldn't understand the initialization part mentioned in Issue #1.

  2. Pre-train the 10-dimensional label embeddings using Poincaré embeddings.

  3. I replaced the word and label embeddings in the HyperIM.py file and ran the model, obtaining the following scores for RCV1:

    Metric     @1      @3      @5
    P          55.369  36.741  28.032
    r          44.631  63.259  71.968
    MicroF1    45.817  45.845  39.947
    nDCG       55.369  44.331  46.696

These numbers are far lower than those reported in the paper.
Could you tell me what else needs to be done to achieve the scores mentioned in your paper?

Thanks,
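
When comparing against published numbers, it can help to confirm that the ranking metrics are computed the same way on both sides. The sketch below shows the binary-relevance definitions of P@k and nDCG@k that are commonly used for multi-label classification. The function names are hypothetical, and the paper's own evaluation script may differ in details such as how scores are averaged over documents.

```python
import math

def top_k_indices(scores, k):
    """Indices of the k highest-scoring labels, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def precision_at_k(scores, true_labels, k):
    """Fraction of the top-k predicted labels that are relevant."""
    hits = sum(1 for i in top_k_indices(scores, k) if i in true_labels)
    return hits / k

def ndcg_at_k(scores, true_labels, k):
    """Discounted gain of the top-k ranking, normalized by the ideal ranking."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, i in enumerate(top_k_indices(scores, k))
        if i in true_labels
    )
    # The ideal ranking places every relevant label (up to k) at the top.
    ideal_hits = min(k, len(true_labels))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Made-up scores for one document with four candidate labels;
# labels 0 and 3 are the relevant ones.
scores = [0.9, 0.1, 0.8, 0.3]
true_labels = {0, 3}
p1 = precision_at_k(scores, true_labels, 1)
p3 = precision_at_k(scores, true_labels, 3)
n3 = ndcg_at_k(scores, true_labels, 3)
```

Corpus-level figures are then usually the mean of these per-document values, reported as percentages.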
