Comments (9)
Oh got it, sorry for the misunderstanding.
Try the following command:
CUDA_VISIBLE_DEVICES=3 python main.py --dataset MetaQA --num_iterations 500 --batch_size 256 \
--lr 0.0005 --dr 1.0 --edim 200 --rdim 200 --input_dropout 0.2 \
--hidden_dropout1 0.3 --hidden_dropout2 0.3 --label_smoothing 0.1 \
--valid_steps 10 --model ComplEx \
--loss_type BCE --do_batch_norm 0 --l3_reg 0.001
(Not yet converged, just showing midway result)
from embedkgqa.
The train_embedding code is for training on large KGs such as Freebase. For MetaQA, we recommend using code such as https://github.com/ibalazevic/TuckER to get the embeddings and saving them in dictionary format as E.npy and R.npy (this is how we did it). Also, since the TuckER code uses batch normalization, you need to save the batch normalization parameters as well.
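A minimal sketch of that export, assuming a trained TuckER-style model whose embeddings live in model.E / model.R and whose batch norm layers are model.bn0 / model.bn1 (as in the reference implementation); the bn.npy name and dictionary layout here are illustrative, not necessarily the repo's exact format:

import numpy as np
# model = ...  # a trained TuckER model (assumed to exist)

np.save('E.npy', model.E.weight.detach().cpu().numpy())  # entity embeddings
np.save('R.npy', model.R.weight.detach().cpu().numpy())  # relation embeddings

bn_params = {}
for name, layer in [('bn0', model.bn0), ('bn1', model.bn1)]:
    bn_params[name] = {
        'weight': layer.weight.detach().cpu().numpy(),     # learned scale
        'bias': layer.bias.detach().cpu().numpy(),         # learned shift
        'running_mean': layer.running_mean.cpu().numpy(),  # saved statistics
        'running_var': layer.running_var.cpu().numpy(),
    }
np.save('bn.npy', bn_params)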
from embedkgqa.
Why is the trained model from train_embedding so weak when I set do_batch_norm to 0?
Best valid: [0.05677566687091254, 4651.813471502591, 0.10436713545521836, 0.06439674315321983, 0.03195164075993091]
(the five numbers are MRR, MR, Hits@10, Hits@3, Hits@1)
It works well if I set do_batch_norm to 1.
from embedkgqa.
This is something that took us a while to figure out as well. Apparently https://github.com/ibalazevic/TuckER uses batch normalization in their implementation, and batch normalization carries extra state: the learned weight and bias parameters plus the running mean and running variance statistics (https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html). If batch normalization was used while training the embeddings, we must use it in our model as well, because it changes the scoring function. I believe that with an embedding-training implementation that does not use batch norm, we can get away with not using it.
(You may have seen in the code that we load some bn parameters as well, along with E.npy.)
Here, however, it is necessary if we want to keep the same scoring function that was used while training the embeddings (the TuckER code).
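To make this concrete, here is a minimal sketch (not the repo's exact code; the stats dictionary layout is illustrative) showing that batch norm applied with saved statistics at inference time is just a fixed affine transform on the embeddings, so leaving it out changes every score:

import torch
import torch.nn.functional as F

def apply_saved_bn(x, stats):
    # frozen batch norm: y = (x - running_mean) / sqrt(running_var + eps) * weight + bias
    return F.batch_norm(x, stats['running_mean'], stats['running_var'],
                        weight=stats['weight'], bias=stats['bias'],
                        training=False)

x = torch.randn(4, 200)  # a batch of 200-dim embeddings
stats = {'running_mean': torch.zeros(200),
         'running_var': 4.0 * torch.ones(200),
         'weight': 0.5 * torch.ones(200),
         'bias': torch.zeros(200)}
y = apply_saved_bn(x, stats)  # here y == 0.25 * x (up to eps), clearly not x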
from embedkgqa.
You could also try setting do_batch_norm to 1 and then freezing the batch norm parameters. That way no real batch normalization is performed (the parameters are fixed and do not depend on the new data coming in), but the code should still work fairly well.
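A minimal sketch of that freezing idea in PyTorch (generic; it assumes the model's batch norm layers are ordinary torch.nn.BatchNorm1d modules):

import torch

def freeze_batch_norm(model):
    for m in model.modules():
        if isinstance(m, torch.nn.BatchNorm1d):
            m.eval()  # use saved running statistics instead of batch statistics
            for p in m.parameters():
                p.requires_grad = False  # keep weight and bias at their loaded values

One caveat: a later model.train() call flips the layers back into training mode, so the freeze has to be re-applied after it.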
from embedkgqa.
If I set do_batch_norm to 0, the KGE module (the train_embedding code) should be the same as the original ComplEx, right?
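For reference, the original ComplEx scoring function (Trouillon et al., 2016) that this should reduce to is

\phi(h, r, t) = \operatorname{Re}\left( \sum_{k=1}^{d} e_{hk} \, w_{rk} \, \bar{e}_{tk} \right)

where e_h, e_t \in \mathbb{C}^d are the entity embeddings, w_r \in \mathbb{C}^d is the relation embedding, and \bar{e}_{tk} is the complex conjugate.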
This result is from the KGE module rather than QA. I trained for 500 epochs with do_batch_norm=1 and got this:
Best valid: [0.05677566687091254, 4651.813471502591, 0.10436713545521836, 0.06439674315321983, 0.03195164075993091]
from embedkgqa.
Thanks, it works well.
from embedkgqa.
Hi! I got quite different results when running the 'SimplE' model on the 'MetaQA' and 'MetaQA_half' datasets respectively.
With 'MetaQA':
Best valid: [0.8958811624412727, 288.4449790278806, 0.9332593140883296, 0.918085368862571, 0.8696027633851469]
Best Test: [0.8883614038125973, 337.9813764183522, 0.9311790823877651, 0.9124321657622102, 0.8595214602861372]
Dataset: MetaQA
Model: SimplE
With 'MetaQA_half':
Best valid: [0.09818482275952602, 9011.535875, 0.1655, 0.107125, 0.06425]
Best Test: [0.0959381995657658, 8872.50025, 0.159125, 0.10625, 0.062125]
Dataset: MetaQA_half
Model: SimplE
The hyperparameters are the same in both runs. Have you ever come across this problem?
from embedkgqa.
Hello, may I ask why the MRR is only about 0.5 after I train with the hyperparameters you provided? The mean rank is also quite high, and the Hits@10 (top-10 hit rate) is not very high either:
+--------------------+--------------------+
| Metric | Result |
+--------------------+--------------------+
| Hits@10 | 0.7760236803157375 |
| Hits@3 | 0.6213616181549088 |
| Hits@1 | 0.4037987173162309 |
| MeanRank | 95.99296990626542 |
| MeanReciprocalRank | 0.5335250755543975 |
+--------------------+--------------------+
+------------+-----------------------------------------------------------------------------------------------------+
| ARTIFACT | VALUE |
+------------+-----------------------------------------------------------------------------------------------------+
| Best valid | [0.537612634856118, 91.54453491241055, 0.7783123612139157, 0.6292869479397977, 0.4066123858869973] |
| Best test | [0.5332891002897144, 95.93339911198817, 0.7763936852491367, 0.620004933399112, 0.40281203749383326] |
| Dataset | MetaQA |
| Model | ComplEx |
+------------+-----------------------------------------------------------------------------------------------------+
Training-time: 8.57
+-----------------+--------+
| Parameter | Value |
+-----------------+--------+
| Learning rate | 0.0005 |
| Decay | 1.0 |
| Dim | 200 |
| Input drop | 0.2 |
| Hidden drop 2 | 0.3 |
| Label Smoothing | 0.1 |
| Batch size | 256 |
| Loss type | BCE |
| L3 reg | 0.001 |
+-----------------+--------+
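For reference, these metrics follow the standard definitions; a minimal sketch of how they are computed from the (filtered) ranks of the true answers, using toy numbers rather than this run's actual ranks:

import numpy as np

ranks = np.array([1, 3, 12, 250, 2])   # toy example: rank of the correct entity per query
mrr = np.mean(1.0 / ranks)             # mean reciprocal rank
mean_rank = ranks.mean()               # mean rank
hits_at_10 = np.mean(ranks <= 10)      # fraction of queries with the answer in the top 10

A large mean rank alongside a decent MRR typically means a handful of queries rank very badly while most rank near the top, since mean rank is dominated by outliers and MRR by the best ranks.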
from embedkgqa.