bcol23 / HyperIM
PyTorch implementation of the paper "Hyperbolic Interaction Model For Hierarchical Multi-Label Classification"
License: MIT License
FYI, the code should be updated to stay in sync with
geoopt/geoopt#77
Write here if you have any questions.
We are in an active development stage; sorry for any trouble.
Hello,
thanks for sharing your code.
I'm wondering about the details of generating the Poincare embeddings for the labels.
The PoincareModel can be trained, but I am not sure about the accuracy of the resulting embeddings.
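For context, this is roughly what I tried with gensim's PoincareModel (the relations and hyperparameters below are placeholders of mine, not taken from the paper):

from gensim.models.poincare import PoincareModel

# Parent-child pairs from the label hierarchy (placeholders --
# the real input would be the full RCV1 topic hierarchy).
relations = [
    ('CCAT', 'C15'), ('C15', 'C151'), ('C15', 'C152'),
    ('CCAT', 'C17'), ('C17', 'C171'), ('C17', 'C172'),
]

# 10-dimensional Poincare embeddings; negative=2 only because this
# placeholder hierarchy is tiny -- raise it for a real one.
model = PoincareModel(relations, size=10, negative=2)
model.train(epochs=50)

# One vector per label, to be stacked into the label_embed tensor.
label_vec = model.kv['C151']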
The network architecture begins with a few hyperbolic layers but ends with Euclidean layers, as stated in:
Line 32 in c257d1c
The full architecture was:
# Excerpt from HyperIM.py (th = torch, gt = geoopt, nn = torch.nn;
# hyperGRU/hyperRNN are the repo's hyperbolic recurrent layers):
def __init__(self, feature_num, word_embed, label_embed, hidden_size=5, if_gru=True,
             default_dtype=th.float64, **kwargs):
    super().__init__(**kwargs)
    self.hidden_size = hidden_size
    # Word and label embeddings live on the Poincare ball.
    self.word_embed = gt.ManifoldParameter(word_embed, manifold=gt.PoincareBall())
    self.label_embed = gt.ManifoldParameter(label_embed, manifold=gt.PoincareBall())
    self.default_dtype = default_dtype
    if if_gru:
        self.rnn = hyperGRU(input_size=word_embed.shape[1], hidden_size=self.hidden_size,
                            default_dtype=self.default_dtype)
    else:
        self.rnn = hyperRNN(input_size=word_embed.shape[1], hidden_size=self.hidden_size,
                            default_dtype=self.default_dtype)
    # The classifier head is ordinary Euclidean linear layers.
    self.dense_1 = nn.Linear(feature_num, int(feature_num / 2))
    self.dense_2 = nn.Linear(int(feature_num / 2), 1)
How can we optimize with Riemannian SGD if not all the parameters are in hyperbolic space?
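For what it's worth, my understanding is that geoopt's Riemannian optimizers apply Riemannian updates only to ManifoldParameters and fall back to ordinary Euclidean updates for plain parameters, so a single optimizer can handle the mixed model. A toy sketch (sizes are arbitrary, not the repo's defaults):

import torch as th
import torch.nn as nn
import geoopt as gt

class Mixed(nn.Module):
    def __init__(self):
        super().__init__()
        # Hyperbolic parameter on the Poincare ball.
        self.label_embed = gt.ManifoldParameter(
            th.randn(4, 10) * 1e-3, manifold=gt.PoincareBall())
        # Ordinary Euclidean parameter.
        self.dense = nn.Linear(10, 1)

model = Mixed()
# ManifoldParameters get Riemannian updates; everything else gets plain SGD.
opt = gt.optim.RiemannianSGD(model.parameters(), lr=1e-2)

loss = model.dense(model.label_embed).sum()
loss.backward()
opt.step()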
Hi @bcol23,
Thank you very much for the inspiring work and for publishing your code. I found it very interesting.
I played around with the repo and was a little discouraged: I cannot achieve the scores you published in the article. Could you please provide a script to reproduce the published results for the RCV1 dataset?
Also, I would suggest a few improvements to the project's structure, if you don't mind.
Hey @bcol23 ,
Thanks for the paper and the code.
I took the following steps to reproduce your numbers:
Pre-train the 300d word embeddings with poincare_glove using the following command:
./run_glove.sh --train --coocc_file ../poincare_glove2/GloVe/cooccurrence.bin --vocab_file ../poincare_glove2/GloVe/vocab.txt --epochs 50 --workers 100 --restrict_vocab 200000 --lr 0.01 --bias --dist_func cosh-dist-sq --root .. --chunksize 1000 --poincare 1 --no_eval --mix --num_embs 50 --size 300
I couldn't understand the initialization part mentioned in Issue #1.
Pre-train the 10-dimensional label embeddings with poincare embeddings.
I replaced the word and label embeddings in the HyperIM.py file and ran the model, which gave the following scores for RCV1:
P@1 55.369    P@3 36.741    P@5 28.032
R@1 44.631    R@3 63.259    R@5 71.968
MicroF1@1 45.817    MicroF1@3 45.845    MicroF1@5 39.947
nDCG@1 55.369    nDCG@3 44.331    nDCG@5 46.696
These numbers are far lower than those reported in the paper.
Could you tell us what additional steps are needed to achieve the scores mentioned in your paper?
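For completeness, this is roughly how I loaded the pretrained vectors before swapping them into HyperIM.py (the file names are mine, and I'm assuming the vectors are saved in word2vec text format; adjust the loader for the actual output):

import numpy as np
import torch as th
from gensim.models import KeyedVectors

# Placeholder path -- wherever the pre-training step wrote its vectors.
wv = KeyedVectors.load_word2vec_format('poincare_vectors.txt')

# Placeholder vocabulary file: one word per line, in the dataset's index order.
vocab = [line.split()[0] for line in open('vocab.txt')]

# Out-of-vocabulary words fall back to the origin of the ball.
word_embed = np.zeros((len(vocab), wv.vector_size))
for i, w in enumerate(vocab):
    if w in wv:
        word_embed[i] = wv[w]

word_embed = th.tensor(word_embed, dtype=th.float64)  # HyperIM's default_dtype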
Thanks,
Hi @bcol23,
I've run into some trouble trying to reproduce the scores. I don't know which file is needed for RCV1, as there are so many.
Also, I can't find the label data (y_train and y_test) for either WikiLSHTC or RCV1; WikiLSHTC only has train.txt and test.txt, which might be x_train and x_test.
Besides, I could not find the "Zhihu" dataset, and its link could not be opened.
Hi,
Thank you for your work. We are looking to replicate your numbers on the Wiki-LSHTC dataset. Can you post the hyperparameters needed to replicate your experiments?
x_train.txt should be (instance_num, word_num)
I thought 'word_num' was the number of words that appear in RCV1, but there is another parameter, 'vocab_size', and every document contains different words. How do I build x_train with the same 'word_num' for every instance?
I'm not sure whether x_train.txt should be obtained by counting the frequency of words in each instance, i.e. if the 20,000th word in the vocabulary appears 10 times, then 10 is filled in at the corresponding position.
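To make my guess concrete, this is the construction I am describing (entirely my interpretation, not taken from the repo):

import numpy as np

def to_count_matrix(docs, vocab_size):
    """Bag-of-words reading of x_train: row i is an instance and
    column j holds how many times vocabulary word j occurs in it."""
    x = np.zeros((len(docs), vocab_size), dtype=np.int64)
    for i, doc in enumerate(docs):  # doc is a list of word indices
        for w in doc:
            x[i, w] += 1
    return x

# The 20,000th vocabulary word (index 19999) appearing 10 times:
x = to_count_matrix([[19999] * 10], vocab_size=20000)
assert x[0, 19999] == 10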