hazyresearch / HypHC
Hyperbolic Hierarchical Clustering.
The paper says that a top-down greedy approach is used to decode the binary tree from the embedding.
However, from the code it looks like a bottom-up approach based on the angle similarity matrix is used instead — is that correct?
If I want to obtain a hierarchical clustering of the nodes from the final Poincaré embeddings, is single-linkage agglomerative clustering on the angle similarity matrix between the normalized embedding points equivalent to the tree-decoding algorithm implemented in this code?
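For concreteness, the single-linkage variant being asked about can be sketched with SciPy; note this is just the questioner's proposed stand-in (cosine similarity of normalized points as the merge score), not necessarily the decoding HypHC actually implements:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

def single_linkage_from_angles(embeddings):
    # Normalize points to the unit sphere and use cosine similarity as the
    # merge score -- an assumed proxy for the angle-based similarity matrix,
    # not a claim of equivalence to HypHC's decoder.
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = x @ x.T
    dist = np.clip(1.0 - sim, 0.0, None)  # similarity -> dissimilarity, clip float noise
    np.fill_diagonal(dist, 0.0)
    # squareform converts the square matrix to the condensed form linkage expects
    return linkage(squareform(dist, checks=False), method="single")

Z = single_linkage_from_angles(np.random.randn(10, 2))
```

`Z` is a standard SciPy linkage matrix with one merge per row, so it can be cut or plotted with the usual `scipy.cluster.hierarchy` tools.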
Hi!
Before moving to more flexible learning, I'd like to learn hyperbolic embeddings for the leaves of an existing tree (one produced by an agglomerative image-region-merging procedure), such that applying HypHC's tree-decoding algorithm to those embeddings recovers my original agglomerative clustering tree.
Can I do this within the HypHC framework and losses?
I thought of using tree shortest-path distances to build the similarity matrix.
The tricky part is that the similarity matrix obtained this way is extremely sparse, so randomly sampled triplets almost always have zero similarities and the model learns nothing.
Would you have any advice?
Thank you :)
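One possible workaround for the sparsity (an assumption on my part, not something from the HypHC paper): turn the tree's shortest-path distances into a dense similarity matrix via a monotone decay, so every sampled triplet carries a nonzero signal. A minimal sketch with NetworkX:

```python
import networkx as nx
import numpy as np

def dense_tree_similarities(tree, leaves):
    # Dense similarities from tree shortest-path distances: closer leaves get
    # higher similarity via exp(-d). The decay exp(-d) is an arbitrary choice;
    # any strictly decreasing function of the path length would do.
    n = len(leaves)
    dist = dict(nx.all_pairs_shortest_path_length(tree))
    sim = np.zeros((n, n))
    for i, u in enumerate(leaves):
        for j, v in enumerate(leaves):
            sim[i, j] = np.exp(-dist[u][v])
    return sim

# toy example: a depth-2 binary tree; its leaves are the degree-1 nodes
T = nx.balanced_tree(2, 2)
leaves = [v for v in T.nodes if T.degree[v] == 1]
S = dense_tree_similarities(T, leaves)
```

Because `exp(-d)` is strictly decreasing in the path length, sibling leaves always score higher than leaves merged further up the tree, which is the ordering the triplet loss needs.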
Hi,
Could you please share example parameter configurations for the mid- and large-scale datasets — for example, one for Segmentation and one for CIFAR-100?
Thank you!
Hi,
Thanks for the great work and well documented code.
Currently, it seems the hyperbolic embedding lives in a Poincaré disk/ball of curvature -1. In my own application, I would like to work with arbitrary curvature -c, for c > 0.
Can I do this by modifying only utils/poincare.py, or do I also need to make changes in optim and elsewhere?
Thanks in advance for your valuable time.
Best,
Eli
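For reference, the curvature-c generalization of the Poincaré distance via Möbius addition (the standard formulation from the hyperbolic-neural-networks literature, not HypHC's own code) can be sketched in PyTorch as follows — at c = 1 it reduces to the curvature -1 distance the repo uses:

```python
import torch

def mobius_add(x, y, c):
    # Mobius addition on the Poincare ball of curvature -c
    x2 = (x * x).sum(dim=-1, keepdim=True)
    y2 = (y * y).sum(dim=-1, keepdim=True)
    xy = (x * y).sum(dim=-1, keepdim=True)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    denom = 1 + 2 * c * xy + c * c * x2 * y2
    return num / denom.clamp_min(1e-15)

def hyp_distance(x, y, c):
    # d_c(x, y) = (2 / sqrt(c)) * artanh(sqrt(c) * ||(-x) (+)_c y||)
    sqrt_c = c ** 0.5
    norm = mobius_add(-x, y, c).norm(dim=-1)
    norm = norm.clamp(max=(1 - 1e-5) / sqrt_c)  # stay strictly inside the ball
    return (2.0 / sqrt_c) * torch.atanh(sqrt_c * norm)

d = hyp_distance(torch.zeros(1, 2), torch.tensor([[0.5, 0.0]]), c=1.0)
```

If you swap such a distance into utils/poincare.py, the Riemannian optimizer's retraction/exponential map would also need the matching curvature-c scaling, which is why changes in optim are likely needed as well.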
Hi @ines-chami!
At https://github.com/HazyResearch/HypHC/blob/master/model/hyphc.py#L42 :

```python
init_size = 1e-3       # in config.py also "init_size": 1e-3
max_scale = 1. - 1e-3  # in config.py also "max_scale": 1 - 1e-3
self.scale = nn.Parameter(torch.Tensor([init_size]), requires_grad=True)
...
min_scale = 1e-2  # self.init_size
max_scale = self.max_scale
return F.normalize(embeddings, p=2, dim=1) * self.scale.clamp_min(min_scale).clamp_max(max_scale)
```
So self.scale (always initialized to init_size = 1e-3) starts outside the clamp range [min_scale = 1e-2, max_scale = 1 - 1e-3], and since clamp passes zero gradient outside its range, self.scale always receives zero gradient.
Is this expected / by design, or was min_scale = 1e-2 a debug setting that was left in by mistake?
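The zero-gradient claim is easy to check in isolation — a minimal PyTorch snippet showing that clamp blocks the gradient when the parameter sits below the clamp minimum, as with init_size = 1e-3 and min_scale = 1e-2:

```python
import torch

# A parameter initialized below the clamp minimum, mirroring the issue above.
scale = torch.tensor([1e-3], requires_grad=True)

out = scale.clamp_min(1e-2).clamp_max(1.0 - 1e-3)
out.sum().backward()

# clamp_min's gradient is zero wherever the input is below the minimum,
# so the parameter can never move back into the active range.
print(scale.grad)
```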
Hi, all,
After I run the scripts examples/run_zoo.sh and examples/run_glass.sh, the trained models and log files are saved under ./embeddings as follows:
```
root@Lab-PC:/data/code14/HypHC/embeddings# tree .
.
├── glass
│   └── bc24ee553d3178d956bbb7a68d6058f5a48edb408c82a49e613a344d71602abf
│       ├── config.json
│       ├── model_0.pkl
│       └── train_0.log
└── zoo
    └── ca286289368cbe66abdfe32406cffed3ff5e6102252c0fc6502519713db04b83
        ├── config.json
        ├── model_0.pkl
        └── train_0.log
```
How can I visualize the hyperbolic embedding, similar to the demo file HypHC.gif?
Thanks~
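A basic static version of such a visualization can be sketched with matplotlib, assuming you can extract the leaf embeddings from model_0.pkl as an (n, 2) array of points in the unit disk (the loading step itself depends on the checkpoint format, so random points stand in here):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

def plot_poincare(embeddings, path="embedding.png"):
    # Scatter the 2-D embeddings inside the boundary circle of the Poincare disk.
    fig, ax = plt.subplots(figsize=(5, 5))
    ax.add_patch(plt.Circle((0, 0), 1.0, fill=False, color="black"))
    ax.scatter(embeddings[:, 0], embeddings[:, 1], s=10)
    ax.set_xlim(-1.05, 1.05)
    ax.set_ylim(-1.05, 1.05)
    ax.set_aspect("equal")
    ax.axis("off")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)

# stand-in data: random points well inside the disk
plot_poincare(np.random.uniform(-0.5, 0.5, size=(50, 2)))
```

Drawing the decoded tree edges as geodesic arcs between merged points (as in the gif) would go on top of this scatter, but requires the decoded tree as well.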
Thank you for sharing your fantastic work! I am wondering whether you have any plans to develop a sklearn-like interface, for example:

```python
alg = HypHC(*args)
alg.fit(X)
labels = alg.predict(X)
```

Thank you!
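The shape of such a wrapper is easy to sketch; the internals below use SciPy's agglomerative clustering purely as a stand-in for HypHC's training-plus-decoding pipeline (the class name and methods are hypothetical, not part of the repo), just to illustrate the requested fit/predict interface:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

class HierClusterer:
    """Sklearn-style wrapper skeleton. SciPy average-linkage stands in for
    HypHC training + tree decoding; only the interface is the point here."""

    def __init__(self, n_clusters=2):
        self.n_clusters = n_clusters
        self.labels_ = None

    def fit(self, X):
        # In a real wrapper, this is where embeddings would be trained and
        # the tree decoded; here we just build a linkage and cut it.
        Z = linkage(pdist(np.asarray(X)), method="average")
        self.labels_ = fcluster(Z, t=self.n_clusters, criterion="maxclust")
        return self

    def fit_predict(self, X):
        return self.fit(X).labels_

# two well-separated blobs should land in two different clusters
X = np.vstack([np.zeros((5, 2)), 10.0 * np.ones((5, 2))])
labels = HierClusterer(n_clusters=2).fit_predict(X)
```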