bcol23 / HyperIM
PyTorch implementation of the paper "Hyperbolic Interaction Model For Hierarchical Multi-Label Classification"
License: MIT License
FYI, the code should be updated to stay in sync with
geoopt/geoopt#77
Write here if you have any questions.
We are in an active development stage; sorry for any trouble.
Hello,
thanks for sharing your code.
I'm wondering about the details of generating the Poincare embeddings for the labels.
The PoincareModel can be trained, but I am not sure about the accuracy of the resulting embeddings.
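For context, this is roughly what I tried with gensim's PoincareModel (the relations and hyperparameters below are placeholders of mine, not taken from the paper):

from gensim.models.poincare import PoincareModel

# Parent-child pairs from the label hierarchy (placeholders --
# the real input would be the full RCV1 topic hierarchy).
relations = [
    ('CCAT', 'C15'), ('C15', 'C151'), ('C15', 'C152'),
    ('CCAT', 'C17'), ('C17', 'C171'), ('C17', 'C172'),
]

# 10-dimensional Poincare embeddings; negative=2 only because this
# placeholder hierarchy is tiny -- raise it for a real one.
model = PoincareModel(relations, size=10, negative=2)
model.train(epochs=50)

# One vector per label, to be stacked into the label_embed tensor.
label_vec = model.kv['C151']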
The network architecture begins with a few hyperbolic layers but ends with Euclidean layers, as stated in:
Line 32 in c257d1c
The full architecture was:
# Excerpt from HyperIM.py (th = torch, gt = geoopt, nn = torch.nn;
# hyperGRU/hyperRNN are the repo's hyperbolic recurrent layers):
def __init__(self, feature_num, word_embed, label_embed, hidden_size=5, if_gru=True,
             default_dtype=th.float64, **kwargs):
    super().__init__(**kwargs)
    self.hidden_size = hidden_size
    # Word and label embeddings live on the Poincare ball.
    self.word_embed = gt.ManifoldParameter(word_embed, manifold=gt.PoincareBall())
    self.label_embed = gt.ManifoldParameter(label_embed, manifold=gt.PoincareBall())
    self.default_dtype = default_dtype
    if if_gru:
        self.rnn = hyperGRU(input_size=word_embed.shape[1], hidden_size=self.hidden_size,
                            default_dtype=self.default_dtype)
    else:
        self.rnn = hyperRNN(input_size=word_embed.shape[1], hidden_size=self.hidden_size,
                            default_dtype=self.default_dtype)
    # The classifier head is ordinary Euclidean linear layers.
    self.dense_1 = nn.Linear(feature_num, int(feature_num / 2))
    self.dense_2 = nn.Linear(int(feature_num / 2), 1)
How can we optimize with Riemannian SGD if not all the parameters are in hyperbolic space?
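For what it's worth, my understanding is that geoopt's Riemannian optimizers apply Riemannian updates only to ManifoldParameters and fall back to ordinary Euclidean updates for plain parameters, so a single optimizer can handle the mixed model. A toy sketch (sizes are arbitrary, not the repo's defaults):

import torch as th
import torch.nn as nn
import geoopt as gt

class Mixed(nn.Module):
    def __init__(self):
        super().__init__()
        # Hyperbolic parameter on the Poincare ball.
        self.label_embed = gt.ManifoldParameter(
            th.randn(4, 10) * 1e-3, manifold=gt.PoincareBall())
        # Ordinary Euclidean parameter.
        self.dense = nn.Linear(10, 1)

model = Mixed()
# ManifoldParameters get Riemannian updates; everything else gets plain SGD.
opt = gt.optim.RiemannianSGD(model.parameters(), lr=1e-2)

loss = model.dense(model.label_embed).sum()
loss.backward()
opt.step()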
Hi @bcol23,
Thank you very much for the inspiring work and for publishing your code. I found it very interesting.
I played around with the repo and was a little discouraged: I cannot achieve the scores you published in the article. Could you please provide a script to reproduce the published results for the RCV1 dataset?
Also, I would suggest a few improvements to the project's structure, if you don't mind.
Hey @bcol23 ,
Thanks for the paper and the code.
I took the following steps to reproduce your numbers:
Pre-train the 300d word embeddings with poincare_glove using the following command:
./run_glove.sh --train --coocc_file ../poincare_glove2/GloVe/cooccurrence.bin --vocab_file ../poincare_glove2/GloVe/vocab.txt --epochs 50 --workers 100 --restrict_vocab 200000 --lr 0.01 --bias --dist_func cosh-dist-sq --root .. --chunksize 1000 --poincare 1 --no_eval --mix --num_embs 50 --size 300
I couldn't understand the initialization part mentioned in Issue #1.
Pre-train the 10-dimensional label embeddings with poincare embeddings.
I replaced the word and label embeddings in the HyperIM.py file and ran the model, which gave the following scores for RCV1:
P@1 55.369    P@3 36.741    P@5 28.032
R@1 44.631    R@3 63.259    R@5 71.968
MicroF1@1 45.817    MicroF1@3 45.845    MicroF1@5 39.947
nDCG@1 55.369    nDCG@3 44.331    nDCG@5 46.696
These numbers are far lower than those reported in the paper.
Could you tell us what additional steps are needed to achieve the scores mentioned in your paper?
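For completeness, this is roughly how I loaded the pretrained vectors before swapping them into HyperIM.py (the file names are mine, and I'm assuming the vectors are saved in word2vec text format; adjust the loader for the actual output):

import numpy as np
import torch as th
from gensim.models import KeyedVectors

# Placeholder path -- wherever the pre-training step wrote its vectors.
wv = KeyedVectors.load_word2vec_format('poincare_vectors.txt')

# Placeholder vocabulary file: one word per line, in the dataset's index order.
vocab = [line.split()[0] for line in open('vocab.txt')]

# Out-of-vocabulary words fall back to the origin of the ball.
word_embed = np.zeros((len(vocab), wv.vector_size))
for i, w in enumerate(vocab):
    if w in wv:
        word_embed[i] = wv[w]

word_embed = th.tensor(word_embed, dtype=th.float64)  # HyperIM's default_dtype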
Thanks,
Hi @bcol23,
I've run into some trouble trying to reproduce the scores. I don't know which file is needed for RCV1, as there are so many.
Also, I can't find the label data (y_train and y_test) for either WikiLSHTC or RCV1; WikiLSHTC only has train.txt and test.txt, which might be x_train and x_test.
Besides, I could not find the "Zhihu" dataset, and its link could not be opened.
Hi,
Thank you for your work. We are looking to replicate your numbers on the Wiki-LSHTC dataset. Can you post the hyperparameters needed to replicate your experiments?
x_train.txt should be (instance_num, word_num)
I thought 'word_num' was the number of words that appear in RCV1, but there is another parameter, 'vocab_size', and every document contains different words. How do I build x_train with the same 'word_num' for every instance?
I'm not sure whether x_train.txt should be obtained by counting the frequency of words in each instance, i.e. if the 20,000th word in the vocabulary appears 10 times, then 10 is filled in at the corresponding position.
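To make my guess concrete, this is the construction I am describing (entirely my interpretation, not taken from the repo):

import numpy as np

def to_count_matrix(docs, vocab_size):
    """Bag-of-words reading of x_train: row i is an instance and
    column j holds how many times vocabulary word j occurs in it."""
    x = np.zeros((len(docs), vocab_size), dtype=np.int64)
    for i, doc in enumerate(docs):  # doc is a list of word indices
        for w in doc:
            x[i, w] += 1
    return x

# The 20,000th vocabulary word (index 19999) appearing 10 times:
x = to_count_matrix([[19999] * 10], vocab_size=20000)
assert x[0, 19999] == 10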