Some question about the Equation 2 (kge-dura, closed)

miralab-ustc commented on July 20, 2024

Some question about the Equation 2

from kge-dura.

Comments (15)

zhanqiuzhang commented on July 20, 2024

Hi @Wentao-Xu ,

We use the formulation Re(\overline{h}Rt^\top) for notational convenience throughout the paper. In fact, ComplEx can be implemented with either formulation. The key to ComplEx is its antisymmetric score function, not where the conjugation is placed. Since both the real and the imaginary parts of the embeddings are learnable parameters, implementations of the two formulations achieve the same performance.
<u, v> is the dot product in complex spaces (see Wikipedia). We unify the formulations of tensor-factorization-based KGC models using a dot product between two complex vectors, while the authors of ComplEx use a component-wise multilinear dot product. The two formulations are the same when the relational matrices are diagonal and complex.
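
As a rough numerical illustration of the point about conjugation placement, the sketch below compares the two score forms on made-up embeddings. The names `score_paper`, `score_complex`, and `conj_all`, and all values, are assumptions introduced here and are not taken from the paper or the repository.

```python
# Illustrative values only: hypothetical 2-dimensional complex embeddings.
h = [1.0 + 2.0j, -0.5 + 0.3j]
r = [0.7 - 1.1j, 2.0 + 0.4j]   # diagonal entries of the relation matrix R
t = [-1.5 + 0.2j, 0.9 - 0.8j]


def score_paper(h, r, t):
    """Re(conj(h) R t^T) with diagonal R: the conjugate sits on the head."""
    return sum((hk.conjugate() * rk * tk).real for hk, rk, tk in zip(h, r, t))


def score_complex(h, r, t):
    """Re(<h, r, conj(t)>): the conjugate sits on the tail."""
    return sum((hk * rk * tk.conjugate()).real for hk, rk, tk in zip(h, r, t))


def conj_all(v):
    """Negate the imaginary part of every component."""
    return [x.conjugate() for x in v]


# With identical parameters the two forms generally give different numbers ...
same_params_gap = score_paper(h, r, t) - score_complex(h, r, t)
# ... but they agree once every entity embedding is conjugated.
reparam_gap = score_paper(h, r, t) - score_complex(conj_all(h), r, conj_all(t))
# Swapping head and tail changes the score, so the form is not symmetric.
swap_gap = score_paper(h, r, t) - score_paper(t, r, h)

print(same_params_gap, reparam_gap, swap_gap)
```

Because the imaginary parts are free parameters, a model trained with one form can represent anything the other can, which is the sense in which the two are said to perform the same.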

Thanks

Wentao-Xu commented on July 20, 2024

Thanks for your response,

I have read your code, and it uses the scoring function Re(<h, r, \overline{t}>), but Equation 2, Re(\overline{h}Rt^\top), does not correspond to your code, although the two share the same performance.
(Re(h) + Im(h) i) (Re(r) + Im(r) i) (Re(t) - Im(t) i) and (Re(h) - Im(h) i) (Re(r) + Im(r) i) (Re(t) + Im(t) i) are different.
Perhaps Re(hR\overline{t}^\top) would be more accurate?
Thanks for pointing out that <u, v> represents the inner product of two complex vectors. But I still do not understand what h\overline{r} means in Re(<h\overline{r}, t>). Do you mean that h\overline{r} is the dot product between h and \overline{r}, or that h\overline{r} = (Re(h) + Im(h) i) (Re(r) + Im(r) i)?

Looking forward to your response again.

zhanqiuzhang commented on July 20, 2024

Hi,

You can think of it as parameterizing the negative imaginary parts of the entity embeddings. The score function is then the same as the one in our code.
$h\overline{R}$ is the product of a complex vector and a complex matrix, and the result is a complex vector. When $R$ is diagonal, it is equivalent to the element-wise product of $h$ and $\overline{r}$, where $r$ is the vector consisting of the diagonal elements of $R$.
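
The diagonal case can be checked numerically. The following is a minimal sketch in plain Python, assuming a row vector `h` and a diagonal matrix built from a vector `r`; the helper `vec_mat` and all values are hypothetical and only serve the illustration.

```python
def vec_mat(v, m):
    """Row vector times matrix: (v m)_j = sum_i v_i * m[i][j]."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


# Hypothetical 3-dimensional complex embeddings.
h = [1.0 + 2.0j, -0.5 + 0.3j, 0.25 - 1.0j]
r = [0.7 - 1.1j, 2.0 + 0.4j, -0.3 + 0.6j]   # diagonal entries of R

# Build conj(R) explicitly as a diagonal complex matrix.
n = len(r)
R_bar = [[r[i].conjugate() if i == j else 0j for j in range(n)] for i in range(n)]

via_matrix = vec_mat(h, R_bar)                                  # h conj(R)
via_elementwise = [hk * rk.conjugate() for hk, rk in zip(h, r)]  # h * conj(r)

max_gap = max(abs(a - b) for a, b in zip(via_matrix, via_elementwise))
print(max_gap)
```

Storing only the diagonal as a vector is what makes a vector-valued relation in code compatible with a matrix-valued relation in the notation.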

Thanks

Wentao-Xu commented on July 20, 2024

Thanks for your detailed response. I think I understand what you mean: \overline{h} is [h_0, -h_1 i], R is [[r_0, 0],[0, r_1 i]], and t is [t_0, t_1 i].
And do you mean <h\overline{R}, t> = <(h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i, t_0 + t_1 i>?

To be honest, this is really hard to follow. Maybe the paper should include more introduction (e.g., more details in Section 2, Preliminaries)?
Your code does not correspond directly to your paper (although they do the same thing). In your paper, the representation R_j of the relation r_j is a matrix, but in your code the representation of relation r_j is a vector.

zhanqiuzhang commented on July 20, 2024

Yes, <h\overline{R}, t> = <(h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i, t_0 + t_1 i>.
We unify the formulations of RESCAL/CP/ComplEx with relational matrices. We did not include the basic definitions of matrix operations due to the NeurIPS space limit. Nonetheless, thanks for your suggestions. We will consider adding more introduction in the next version of our paper.

Wentao-Xu commented on July 20, 2024

Yes, I think h\overline{R} is not an ordinary multiplication if h is a complex vector and \overline{R} is a diagonal matrix.

But I still do not understand why h\overline{R} = [h_0, h_1 i] [[r_0, 0],[0, -r_1 i]] = (h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i?
Could you provide more details about the multiplication you defined?

zhanqiuzhang commented on July 20, 2024

The matrix R is not just diagonal, but also complex. Each diagonal element of R is a complex number.
The operations in our paper are ordinary matrix multiplications in the complex space. Thus, if you let the embedding of h be h_0+h_1 i, then the embedding dimension is 1 in the complex space. Correspondingly, R should be a 1x1 matrix [[r_0+r_1 i]], instead of [[r_0, 0],[0, -r_1 i]].

Wentao-Xu commented on July 20, 2024

ok, thanks for your reply.
If h is h_0+h_1 i and R is [[r_0+r_1 i]], then the conjugate matrix \overline{R} is [[r_0 - r_1 i]], so:
h \overline{R} = [h_0+h_1 i] [[r_0 - r_1 i]] = h_0r_0 + h_1r_1 + (h_1r_0 - h_0r_1) i, but this is not the (h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i we want.
Did I understand something wrong again?
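
The two expansions in question can be checked with Python's built-in complex numbers. This is only a sanity check on arbitrary illustrative values for h_0, h_1, r_0, r_1; the variable names are assumptions made for this sketch.

```python
# Arbitrary illustrative real and imaginary parts.
h0, h1 = 1.5, -2.0
r0, r1 = 0.5, 3.0

h = complex(h0, h1)   # h = h_0 + h_1 i
r = complex(r0, r1)   # the single entry of the 1x1 matrix R

# h conj(R): multiplication by the conjugated relation.
with_conj = h * r.conjugate()
with_conj_expanded = complex(h0 * r0 + h1 * r1, h1 * r0 - h0 * r1)

# h R: multiplication without the conjugate.
without_conj = h * r
without_conj_expanded = complex(h0 * r0 - h1 * r1, h0 * r1 + h1 * r0)

print(with_conj, with_conj_expanded)
print(without_conj, without_conj_expanded)
```

Both expansions are correct for their respective products; they differ only in whether R is conjugated, which is where the sign of the imaginary part enters.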

zhanqiuzhang commented on July 20, 2024

Yes, h \overline{R} = [h_0+h_1 i] [[r_0 - r_1 i]] = h_0r_0 + h_1r_1 + (h_1r_0 - h_0r_1) i. It leads to an equivalent formulation of ComplEx, as we discussed before.

In our paper, the dot product between two complex vectors u and v is <u, v>=\overline{u} v^\top (see Equation 2). Thus, when taking the dot product between h\overline{R} and t, h\overline{R} is conjugated and actually works as h_0r_0 + h_1r_1 + (-h_1r_0 + h_0r_1) i.

As I mentioned, you can think of it as parameterizing the negative imaginary parts (-h_1) of the entity embeddings. In this way, to implement h_0r_0 + h_1r_1 + (-h_1r_0 + h_0r_1) i, the code will be (h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i.
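
The steps above can be written out for a single complex dimension. This is a sketch assuming h = h_0 + h_1 i, R = [[r_0 + r_1 i]], and t = t_0 + t_1 i; the primed symbols h_1' and t_1' denote the stored parameters and are introduced here only for illustration.

```latex
\begin{aligned}
\langle h\overline{R},\, t\rangle
  &= \overline{h\overline{R}}\; t
   = \bigl[(h_0 r_0 + h_1 r_1) + (h_0 r_1 - h_1 r_0)\,i\bigr]\,(t_0 + t_1 i), \\
\operatorname{Re}\langle h\overline{R},\, t\rangle
  &= (h_0 r_0 + h_1 r_1)\,t_0 - (h_0 r_1 - h_1 r_0)\,t_1 \\
  &= (h_0 r_0 - h_1' r_1)\,t_0 + (h_0 r_1 + h_1' r_0)\,t_1',
  \qquad h_1' = -h_1,\quad t_1' = -t_1 .
\end{aligned}
```

The last line has the shape Re(<h', r, \overline{t'}>) with h' = h_0 + h_1' i and t' = t_0 + t_1' i, so the same substitution is applied to the head and the tail.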

Wentao-Xu commented on July 20, 2024

OK, I understand completely now.
The notation h_1 in your paper does not correspond to the vector h_1 in your code, but to the vector (-h_1) in your code.
The other question is: if you parameterize the negative imaginary parts (-h_1) of the entity embeddings, and the tail entity t shares the same embedding as the head entity h, do you also parameterize the negative imaginary parts (-t_1) of the entity embeddings?

Wentao-Xu commented on July 20, 2024

That is, suppose [e_0, e_1] is the 4000-dimensional complex vector stored as the embedding of h or t (h and t are the same entity, e.g., lion, but in different positions). The actual embedding of the head entity h is [e_0, -e_1], and the tail entity's embedding t should then also be [e_0, -e_1], since h and t are the same entity.
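
The shared-embedding question can be made concrete with a toy lookup table. The sketch below assumes a real-valued score of the form discussed in this thread and reads each stored pair [e_0, e_1] as the complex vector e_0 - e_1 i regardless of position. The entity names, values, and function names are hypothetical and are not the repository's actual code.

```python
# Hypothetical stored parameters: entity -> (e_0, e_1), two real vectors.
stored = {
    "lion": ([0.4, -1.2], [0.9, 0.3]),
    "cat": ([1.1, 0.2], [-0.5, 0.7]),
}
relation = ([0.6, -0.8], [1.3, 0.25])   # (r_0, r_1)


def code_style_score(lhs, rel, rhs):
    """Real-valued form of Re(<h, r, conj(t)>) on the stored parameters."""
    total = 0.0
    for h0, h1, r0, r1, t0, t1 in zip(lhs[0], lhs[1], rel[0], rel[1], rhs[0], rhs[1]):
        total += (h0 * r0 - h1 * r1) * t0 + (h0 * r1 + h1 * r0) * t1
    return total


def paper_entity(e):
    """Read a stored pair [e_0, e_1] as the complex vector e_0 - e_1 i."""
    return [complex(a, -b) for a, b in zip(e[0], e[1])]


def paper_style_score(h, r, t):
    """Re(conj(h) R t^T) with diagonal R."""
    return sum((hk.conjugate() * rk * tk).real for hk, rk, tk in zip(h, r, t))


R = [complex(a, b) for a, b in zip(relation[0], relation[1])]

gaps = []
for head, tail in [("lion", "cat"), ("cat", "lion"), ("lion", "lion")]:
    code = code_style_score(stored[head], relation, stored[tail])
    paper = paper_style_score(paper_entity(stored[head]), R, paper_entity(stored[tail]))
    gaps.append(abs(code - paper))

print(gaps)
```

The same `paper_entity` mapping is used for the head and the tail, so an entity keeps one consistent embedding in both positions.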

zhanqiuzhang commented on July 20, 2024

Yes. That's why there is no conjugation on t, unlike in the original ComplEx paper.

Wentao-Xu commented on July 20, 2024

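But why do you apply this transformation? Why not make the notation in your paper correspond to the code?
This transformation makes the paper harder to understand, and I would not have understood it without such a detailed explanation.
Haha, you really had me going in circles. The paper never mentions that the imaginary-part parameters are all negated, so I spent a long time deriving on paper and still could not reproduce the result of the formula in your paper.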

zhanqiuzhang commented on July 20, 2024

As I also mentioned, we use the formulation Re(\overline{h}Rt^\top) for notational convenience throughout the paper : ). Moreover, the notation in our paper is self-consistent and equivalent to the implementation in our code.

Haha, if implementation details like this were written in the paper, even more people would probably fail to follow it.

Wentao-Xu commented on July 20, 2024

All right, but my first impression of this paper was to wonder why Equation 2 differs from the scoring function of ComplEx in ICML 2016 or ICML 2018, and the reason is that you parameterize the negative imaginary parts (-h_1) or (-t_1) of the entity embeddings.

In short, I think more clarification is definitely necessary.
