Comments (15)

zhanqiuzhang commented on July 20, 2024

Hi @Wentao-Xu ,

  1. We use the formulation Re(\overline{h}Rt^\top) for notational convenience throughout the paper. In fact, ComplEx can be implemented with either formulation. The key to ComplEx is the antisymmetric score function, not where the conjugation is placed. Since both the real and the imaginary parts of the embeddings are learnable parameters, implementations of the two formulations achieve the same performance.

  2. It is the definition of the dot product in complex spaces (see Wikipedia). We unify the formulations of tensor-factorization-based KGC models using a dot product between two complex vectors, while the authors of ComplEx use a component-wise multilinear dot product. The two formulations are the same when the relational matrices are diagonal and complex.
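The equivalence in point 2 can be checked numerically. The following is a minimal sketch in plain Python, not code from the repository; the helper names (`diag`, `score_matrix_form`, `score_multilinear`) and the embedding values are made up for illustration.

```python
# Sketch: with a diagonal complex relation matrix R = diag(r), the matrix form
# Re(conj(h) R t^T) equals the component-wise form Re(sum_i conj(h_i) r_i t_i).
# All names and numbers here are illustrative assumptions.

def diag(r):
    """Build a dense diagonal matrix (list of rows) from the vector r."""
    n = len(r)
    return [[r[i] if i == j else 0j for j in range(n)] for i in range(n)]

def score_matrix_form(h, R, t):
    """Re(conj(h) R t^T), with h and t as row vectors and R a full matrix."""
    n = len(h)
    # Row vector conj(h) times matrix R.
    hR = [sum(h[i].conjugate() * R[i][j] for i in range(n)) for j in range(n)]
    # Times the column vector t^T (plain transpose, no conjugation).
    return sum(hR[j] * t[j] for j in range(n)).real

def score_multilinear(h, r, t):
    """Re(sum_i conj(h_i) * r_i * t_i), the component-wise form."""
    return sum(hi.conjugate() * ri * ti for hi, ri, ti in zip(h, r, t)).real

h = [1 + 2j, -0.5 + 1j, 3 - 1j]
r = [0.5 - 1j, 2 + 0.5j, -1 + 1j]
t = [2 - 1j, 1 + 1j, 0.5 + 2j]

a = score_matrix_form(h, diag(r), t)
b = score_multilinear(h, r, t)
print(a, b)  # the two values agree up to floating-point error
```

The off-diagonal zeros contribute nothing to the matrix product, which is why a vector of diagonal entries is enough to represent the relation in this case.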

Thanks

from kge-dura.

Wentao-Xu avatar Wentao-Xu commented on July 20, 2024

Thanks for your response,

  1. I have read your code, and it uses the scoring function Re(<h, r, \overline{t}>), but Equation 2, Re(\overline{h}Rt^\top), does not correspond to your code, although the two share the same performance.
    (Re(h) + Im(h) i) (Re(r) + Im(r) i) (Re(t) - Im(t) i) and (Re(h) - Im(h) i) (Re(r) + Im(r) i) (Re(t) + Im(t) i) are different.
    Perhaps Re(hR\overline{t}^\top) is more accurate?

  2. Thanks for pointing out that <u, v> represents the inner product of two complex vectors. But I still do not understand what h\overline{r} means in Re(<h\overline{r}, t>). Do you mean that h\overline{r} is the dot product between h and \overline{r}, or that h\overline{r} = (Re(h) + Im(h) i) (Re(r) + Im(r) i)?

Looking forward to your response again.

zhanqiuzhang commented on July 20, 2024

Hi,

  1. You can think of it as parameterizing the negative imaginary parts of the entity embeddings. The score function is then the same as the one in our code.

  2. $h\overline{R}$ is the product of a complex vector and a complex matrix, and the result is a complex vector. When $R$ is diagonal, it is equivalent to the element-wise product of $h$ and $\overline{r}$, where $r$ is the vector consisting of the diagonal elements of $R$.
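Point 2 can be illustrated with a short sketch in plain Python. This is not the repository's code; `vec_mat` and the sample values are hypothetical, chosen only to show that the ordinary vector-matrix product with a diagonal conjugated matrix reduces to an element-wise product.

```python
# Sketch: for diagonal R = diag(r), the product h * conj(R) equals the
# element-wise product of h and conj(r). Names and values are illustrative.

def vec_mat(v, M):
    """Row vector v times matrix M (ordinary complex matrix multiplication)."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

h = [1 + 2j, -0.5 + 1j]
r = [0.5 - 1j, 2 + 0.5j]

# conj(R) for the diagonal matrix R = diag(r).
R_conj = [[r[0].conjugate(), 0j],
          [0j, r[1].conjugate()]]

via_matrix = vec_mat(h, R_conj)
via_elementwise = [hi * ri.conjugate() for hi, ri in zip(h, r)]

print(via_matrix)
print(via_elementwise)  # same entries as via_matrix, up to floating-point error
```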

Thanks

Wentao-Xu commented on July 20, 2024

Thanks for your detailed response; I think I understand what you mean. \overline{h} is [h_0, -h_1 i], R is [[r_0, 0],[0, r_1 i]], and t is [t_0, t_1 i].
And do you mean <h\overline{R}, t> = <(h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i, t_0 + t_1 i>?

To be honest, this is really hard to understand. Maybe the paper should include more introduction (e.g., more details in Section 2, Preliminaries)?
Your code does not correspond directly to your paper (although they do the same thing). In your paper, the representation R_j of the relation r_j is a matrix, but in your code the representation of relation r_j is a vector.

zhanqiuzhang commented on July 20, 2024
  1. Yes, <h\overline{R}, t> = <(h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i, t_0 + t_1 i>.
  2. We unify the formulations of RESCAL/CP/ComplEx with relational matrices. We did not include the basic definitions of matrix operations due to the NeurIPS space limit. Nonetheless, thanks for your suggestions. We will consider adding more introduction in the next version of our paper.

Wentao-Xu commented on July 20, 2024

Yes, I think h\overline{R} is not an ordinary multiplication if h is a complex vector and \overline{R} is a diagonal matrix.

But I still do not understand why h\overline{R} = [h_0, h_1 i] [[r_0, 0],[0, -r_1 i]] = (h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i.
Could you provide more details about the multiplication you defined?

zhanqiuzhang commented on July 20, 2024
  1. The matrix R is not just diagonal, but also complex. Each diagonal element of R is a complex number.
  2. The operations in our paper are ordinary matrix multiplications in the complex space. Thus, if you let the embedding of h be h_0+h_1 i, then the embedding dimension is 1 in the complex space. Correspondingly, R should be a 1x1 matrix [[r_0+r_1 i]], instead of [[r_0, 0],[0, -r_1 i]].

Wentao-Xu commented on July 20, 2024

ok, thanks for your reply.
h is h_0+h_1 i, and R is [[r_0+r_1 i]], the conjugate matrix \overline{R} is [[r_0 - r_1 i]], so:
h \overline{R} = [h_0+h_1 i] [[r_0- r_1 i]] = h_0r_0 + h_1r_1 + (h_1r_0 - h_0r_1) i, but this is not the (h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i we want.
Did I understand something wrong again?

zhanqiuzhang commented on July 20, 2024

Yes, h \overline{R} = [h_0+h_1 i] [[r_0- r_1 i]] = h_0r_0 + h_1r_1 + (h_1r_0 - h_0r_1) i. It leads to an equivalent formulation of ComplEx, as we discussed before.

In our paper, the dot product between two complex vectors u and v is <u, v>=\overline{u} v^\top (see Equation 2). Thus, when taking the dot product between h\overline{R} and t, h\overline{R} is conjugated and effectively acts as h_0r_0 + h_1r_1 + (-h_1r_0 + h_0r_1) i.
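Written out for a single complex dimension, with h = h_0 + h_1 i, R = [[r_0 + r_1 i]] and t = t_0 + t_1 i as used earlier in this thread, the step above can be sketched as follows. This is a worked expansion of the quantities already quoted here, not an equation copied from the paper.

```latex
\begin{align*}
h\overline{R} &= (h_0 + h_1 i)(r_0 - r_1 i)
               = (h_0 r_0 + h_1 r_1) + (h_1 r_0 - h_0 r_1)\, i \\
\overline{h\overline{R}} &= (h_0 r_0 + h_1 r_1) + (h_0 r_1 - h_1 r_0)\, i
               = (h_0 - h_1 i)(r_0 + r_1 i) = \overline{h} R \\
\langle h\overline{R},\, t \rangle &= \overline{h\overline{R}}\; t^\top
               = \overline{h} R\, t^\top \\
\operatorname{Re}\langle h\overline{R},\, t \rangle
              &= (h_0 r_0 + h_1 r_1)\, t_0 - (h_0 r_1 - h_1 r_0)\, t_1
\end{align*}
```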

As I mentioned, you can think of it as parameterizing the negative imaginary parts (-h_1) of the entity embeddings. In this way, to implement h_0r_0 + h_1r_1 + (-h_1r_0 + h_0r_1) i, the code computes (h_0r_0 - h_1r_1) + (h_0r_1 + h_1r_0) i.
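The reparameterization argument can be checked with a small sketch in plain Python. It assumes the two score functions discussed in this thread, Re(\overline{h}Rt^\top) with diagonal R and Re(<h, r, \overline{t}>); the function names and values are hypothetical and this is not the repository's implementation.

```python
# Sketch: storing the conjugate (negated imaginary part) of every entity
# embedding turns Re(sum h r conj(t)) into Re(sum conj(h) r t).
# Function names and numbers are illustrative assumptions.

def score_paper(h, r, t):
    """Paper notation with diagonal R: Re(sum_i conj(h_i) * r_i * t_i)."""
    return sum(hi.conjugate() * ri * ti for hi, ri, ti in zip(h, r, t)).real

def score_code_style(h, r, t):
    """Form with the conjugate on the tail: Re(sum_i h_i * r_i * conj(t_i))."""
    return sum(hi * ri * ti.conjugate() for hi, ri, ti in zip(h, r, t)).real

# Embeddings in the paper's notation.
h = [1 + 2j, -0.5 + 1j]
r = [0.5 - 1j, 2 + 0.5j]
t = [2 - 1j, 1 + 1j]

# Stored parameters hold the negated imaginary parts, i.e. the conjugates,
# for head and tail alike (both come from the same entity table).
h_stored = [x.conjugate() for x in h]
t_stored = [x.conjugate() for x in t]

paper = score_paper(h, r, t)
reparam = score_code_style(h_stored, r, t_stored)
same_params = score_code_style(h, r, t)

print(paper, reparam)  # agree up to floating-point error
print(same_params)     # differs in general when the same parameters are reused
```

Because the imaginary parts are free parameters, learning e or its conjugate for every entity spans the same family of models, which is the sense in which the two formulations are equivalent.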

Wentao-Xu commented on July 20, 2024

OK, I understand completely now.
The notation h_1 in your paper does not correspond to the vector h_1 in your code, but to the vector (-h_1) in your code.
The other question is: if you parameterize the negative imaginary parts (-h_1) of the entity embeddings, and the tail entity t shares the same embedding as the head entity h, do you also parameterize the negative imaginary parts (-t_1) of the entity embeddings?

Wentao-Xu commented on July 20, 2024

That is, suppose a 4000-dimensional complex vector [e_0, e_1] is the embedding of h or t (h and t are the same entity, e.g., lion, but in different positions). The actual embedding for the head entity h is [e_0, -e_1], and the tail entity's embedding t should also be [e_0, -e_1], since h and t are the same entity.

zhanqiuzhang commented on July 20, 2024

Yes. That's why there is no conjugation on t, unlike in the original ComplEx paper.

Wentao-Xu commented on July 20, 2024

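But why do you apply this transformation? Why not make the notation in your paper correspond to the code?
This transformation makes the paper harder to understand, and I could not have understood it without such a detailed explanation.
Haha, you really had me going in circles. The paper never mentions that the imaginary-part parameters are negated, so I spent a long time working through it on paper and still could not derive the result of the formula in your paper.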

zhanqiuzhang commented on July 20, 2024

As I also mentioned, we use the formulation Re(\overline{h}Rt^\top) for notational convenience throughout the paper : ). Moreover, the notation in our paper is self-consistent and equivalent to the implementation in our code.

Haha, if this kind of implementation detail were written in the paper, even more people would probably find it hard to follow.

Wentao-Xu commented on July 20, 2024

All right, but my first reaction to this paper was to wonder why Equation 2 differs from the scoring function of ComplEx in ICML 2016 or ICML 2018; the reason is that you parameterize the negative imaginary parts (-h_1) or (-t_1) of the entity embeddings.

In short, I think more clarification is definitely necessary.
