
owe's People

Contributors

haseebs


owe's Issues

Data for head prediction

Hi there,

I noticed that the dataset you provided (FB15k-237-OWE) represents the open-world tail-prediction data described in your paper. However, is there also a way to construct the head-prediction dataset?

Thanks

Process killed during dataload

During data loading for a large KG (900k+ triples) in data.py:

def init_labels(self) -> None:
    # ... other code ...
    self.head_labels = {k: to_one_hot(t, self.train.num_entities) for k, t in all_heads.items()}
    self.tail_labels = {k: to_one_hot(t, self.train.num_entities) for k, t in all_tails.items()}

I have around 1M head and tail items combined. My guess is that the process is being targeted by the OOM killer. Any ideas why?
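One way to cut memory here (a sketch under my assumptions, not the repository's code; `all_heads`/`all_tails` and `num_entities` follow the snippet above) is to keep only the label indices and materialize each one-hot vector on demand instead of holding ~1M dense rows at once:

```python
def init_label_indices(all_heads, all_tails):
    """Store label indices instead of dense one-hot vectors.

    A dense one-hot dict over ~1M keys needs O(keys * num_entities)
    floats, which can easily trigger the OOM killer; keeping the raw
    index lists is only O(total labels).
    """
    head_label_indices = {k: list(t) for k, t in all_heads.items()}
    tail_label_indices = {k: list(t) for k, t in all_tails.items()}
    return head_label_indices, tail_label_indices


def to_one_hot(indices, num_entities):
    """Materialize a single multi-hot vector only when a batch needs it."""
    vec = [0.0] * num_entities
    for i in indices:
        vec[i] = 1.0
    return vec
```

The per-batch `to_one_hot` call trades a little compute for a large, predictable drop in resident memory.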

training KGC models using Open-KE

Hi,

I wonder whether I can train the KGC model using only the code in the "owe" folder you provided, without using the OpenKE framework. Also, what is the relationship between the two folders you provided? I'm sorry, I'm new to this, so I may have some naive questions.

Thanks

Regarding training OpenKE with bigger graphs

This is technically not part of your contribution, but I was wondering if you could let me know how to train embeddings on bigger graphs. I am using a GPU with 32 GB of memory, and training exceeds its capacity.

WARNING: Config setting: 'cuda' not found

WARNING: Config setting: 'cuda' not found. Returning None!

If I understand correctly, this is thrown because a 'cuda' setting was not found in the config.ini file. Would this affect the speed of training?
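If that reading is right, adding the missing entry to config.ini should silence the warning. A minimal sketch (the section name and accepted values are assumptions; check the example config shipped with the repository):

```ini
[DEFAULT]
; hypothetical entry: whether to run on GPU; name and value are guesses
cuda = True
```

Whether the missing setting slows training depends on what the code falls back to when the lookup returns None.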

FB20k dataset

Hi, thanks for sharing this wonderful work!

Can you share the processed FB20K dataset?

AttributeError: 'NoneType' object has no attribute 'endswith'

This problem occurs when I run 'run_open_world.py'. The first thing that goes wrong is the following line in 'data.py':
return KeyedVectors.load_word2vec_format(embedding_file, binary=not embedding_file.endswith(".txt"))
I would like to know what this line of code does and what the endswith function means. Thank you!
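To unpack that line (a sketch of its logic, not a change to the repository): it loads pretrained word vectors with gensim, and `str.endswith(".txt")` checks the filename suffix to decide whether to read the file as word2vec text or binary format. The AttributeError in the issue title ('NoneType' object has no attribute 'endswith') suggests `embedding_file` itself is None, i.e. the `PretrainedEmbeddingFile` config value was likely never set:

```python
def load_vectors_binary_flag(embedding_file):
    """Decide whether vectors should be loaded in binary or text mode.

    str.endswith(".txt") is True for plain-text files, so `binary`
    becomes False for them and True for everything else. If
    embedding_file is None (e.g. the config entry is missing),
    .endswith raises AttributeError, so we fail early with a clearer
    message instead.
    """
    if embedding_file is None:
        raise ValueError("PretrainedEmbeddingFile is not set in config.ini")
    return not embedding_file.endswith(".txt")
```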

about the dataset

Hi, can you share the test set and validation set for head prediction on FB15k-237-OWE, as reported in Table 2 of your paper? In addition, can you provide the OpenKE-format dataset for FB20k? I can't find it. Thanks so much!

About <PATH_TO_KGC_EMBEDDINGS>

Hello,
In python owe/run_open_world.py -t -c <PATH_TO_DATASET> -d <PATH_TO_CONFIG_AND_OUTPUT_DIR> --<KGC_MODEL_NAME> <PATH_TO_KGC_EMBEDDINGS>, what does <PATH_TO_KGC_EMBEDDINGS> refer to?
What kind of file is it, and how should it be obtained?

Some help reproducing results of FB20k & DBPedia50k

Hi,
I'm able to reproduce the FB15k-237-OWE results reported in the paper. I'm unable to reproduce FB20k and DBpedia50k results.

  1. Can you tell me which splits you used? I'm using the splits of DKRL and ConMask respectively. For FB20k, do you use the same descriptions as DKRL or the shorter Wikidata descriptions?
  2. Do I need to change any of the hyperparameters? In the paper, only dropout is mentioned.
  3. The descriptions in FB20k and DBpedia50k are quite long. Do you still just use an average encoder? Is there any filtering before averaging?

unsupported operand type

Hi there,
I found that when I load pretrained embeddings from a ComplEx model, there is an error: "unsupported operand type(s) for /: 'str' and 'str'".
The corresponding code is entity_r_file = emb_dir / "entities_r.p".
Hope you can help me.
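The `/` operator only works when the left operand is a `pathlib.Path`; if `emb_dir` arrives as a plain string (for example, straight from the config file), `str / str` fails exactly like this. A minimal sketch of the likely fix:

```python
from pathlib import Path


def entity_r_path(emb_dir):
    """Join the embeddings directory with the pickle filename.

    Wrapping emb_dir in Path() makes the `/` operator valid even when
    the caller passes a plain string instead of a Path object.
    """
    return Path(emb_dir) / "entities_r.p"
```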

Question about sampling strategy for FB15k-237-OWE

Hi Haseeb,

I have a question regarding your sampling strategy to produce FB15k-237-OWE.

Each picked head x is removed from the training graph by moving all triples of the form (x, ?, t) to the test set and dropping all triples of the form (?, ?, x) if t still remains in the training set after these operations.

Regarding "dropping all triples of the form (?, ?, x) if t still remains": what does t refer to here? It seems that (?, ?, x) does not contain t.
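One plausible reading of the quoted description (a sketch of my interpretation, not the authors' script): triples with x as head move to the test set, triples with x as tail are dropped, and a moved triple is kept only if its tail t still occurs in the remaining training graph, so that tails stay closed-world:

```python
def split_open_world(triples, picked_heads):
    """Sketch of one reading of the FB15k-237-OWE sampling step.

    For each picked head x: (x, r, t) triples move to the test set and
    (h, r, x) triples are dropped; a moved triple survives only if its
    tail t still appears in the remaining training graph.
    """
    picked = set(picked_heads)
    # Remove every triple that mentions a picked entity from training.
    train = [(h, r, t) for (h, r, t) in triples
             if h not in picked and t not in picked]
    train_entities = {e for (h, _, t) in train for e in (h, t)}
    # Keep only test triples whose tail is still closed-world.
    test = [(h, r, t) for (h, r, t) in triples
            if h in picked and t in train_entities]
    return train, test
```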

How to deal with IndexError: list index out of range

This is what I am getting as error:

Traceback (most recent call last):
File "run_closed_world.py", line 115, in
main()
File "run_closed_world.py", line 100, in main
model.init_embeddings(dataset, model_dir)
File "/......./OWE-master/owe/models/closed_world/transe.py", line 56, in init_embeddings
for l in entity2id_file.open("rt")}
File "/....../OWE-master/owe/models/closed_world/transe.py", line 56, in
for l in entity2id_file.open("rt")}
IndexError: list index out of range
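The failing line is a dict comprehension that splits each line of the entity2id file and indexes the resulting fields. Assuming an OpenKE-style entity2id file (which typically begins with a count line followed by "<entity>\t<id>" rows), a count line, blank line, or malformed row would make a naive `line.split()[1]` raise exactly this IndexError. A defensive parsing sketch:

```python
def parse_entity2id(lines):
    """Parse OpenKE-style entity2id lines defensively.

    Rows are expected as "<entity>\t<id>"; the leading count line,
    blank lines, and malformed rows are skipped instead of raising
    IndexError on a missing field.
    """
    mapping = {}
    for line in lines:
        parts = line.strip().split()
        if len(parts) != 2:
            continue  # skip the count line, blanks, malformed rows
        entity, idx = parts
        mapping[entity] = int(idx)
    return mapping
```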

entity2wikidata.json file queries

Hi,
I was wondering about the entity2wikidata.json file. If I were to use Wikidata as the knowledge graph, what would this file look like? Also, are there any scripts that can be used to generate it?
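For illustration only, here is a guess at the file's shape. The field names (`label`, `description`, `wikidata_id`) and keying are assumptions inferred from the file name and should be checked against the file shipped with FB15k-237-OWE; for a Wikidata graph, the QID itself could plausibly serve as the key:

```python
import json


def make_entity2wikidata(entities):
    """Build a mapping in the (assumed) entity2wikidata.json shape.

    `entities` is an iterable of (entity_id, label, description)
    tuples; all field names here are hypothetical.
    """
    return {
        eid: {"label": label, "description": desc, "wikidata_id": eid}
        for eid, label, desc in entities
    }


# Serialize the mapping the way a dataset file would be written:
payload = json.dumps(
    make_entity2wikidata([("Q937", "Albert Einstein",
                           "German-born theoretical physicist")]),
    indent=2,
)
```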

Dataset link broken

It seems that the links to download the datasets and the pre-trained embeddings are broken; can you fix them? Thank you.

Predicting tails

The returned matrix for tail prediction has shape [1, E, D], where E is the number of distinct entities in the training set and D is the number of dimensions. Does that mean the model expects known entities as tails in the test set?
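As I read it, yes: scoring against a fixed E x D table of known-entity embeddings means candidate tails are ranked only over those E training entities. A pure-Python sketch of what such scoring implies (illustrative dot-product ranking, not the repository's exact scoring function):

```python
def rank_known_tails(predicted, entity_matrix):
    """Rank known entities as tail candidates for one test triple.

    `entity_matrix` is the E x D table of known-entity embeddings (the
    [1, E, D] output with the batch axis dropped). Scoring a predicted
    tail embedding against it means every candidate must be one of
    those E known entities; returns entity indices, best first.
    """
    scores = [sum(p * e for p, e in zip(predicted, row))
              for row in entity_matrix]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```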

can you give me an example?

Thank you for your reply.

I am running your code, but it is not going well.
I downloaded the dataset; could you give me an example?
For example:
Training:
python3 owe/run_open_world.py -t -c ./data/FB15k-237-OWE -d ./ --complex ./data/FB15k_embedding

evaluate:
python3 owe/run_open_world.py -e -lb -c ./data/preprocessed -d ./ --complex ./data/FB15k_embedding

Miao

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

Hi haseebs,
I downloaded enwiki_20180420_300d.pkl.bz2 from Wikipedia2Vec. Should PretrainedEmbeddingFile = ./data/enwiki_20180420_300d.pkl.bz2 or = ./data/enwiki_20180420_300d.pkl? When running the code, I encountered the following problem. How can I solve it?

(venv) gzdx@2080:/home/OLD/rhl/OWE$ python owe/run_open_world.py -t -c ./data/FB15k-237-zeroshot -d /home/OLD/rhl/OWE/ --complex ./embeddings/
19:20:05: INFO: Git Hash: 7735eb5

19:20:05: INFO: Reading config from: /home/OLD/rhl/OWE/config.ini
19:20:05: INFO: Using entity2wikidata.json as wikidata file
data/FB15k-237-zeroshot/train.txt
data/FB15k-237-zeroshot/valid_zero.txt
data/FB15k-237-zeroshot/train.txt
data/FB15k-237-zeroshot/valid_zero.txt
19:20:05: INFO: 12324 distinct entities in train having 235 relations (242489 triples).
19:20:05: INFO: 6038 distinct entities in validation having 220 relations (9424 triples).
19:20:05: INFO: 8897 distinct entities in test having 224 relations (22393 triples).
19:20:05: INFO: Working with: 14405 distinct entities having 235 relations.
19:20:05: INFO: Converting entities...
19:20:06: INFO: Building Vocab...
19:20:06: INFO: Building triples...
19:20:17: INFO: Loading word vectors from: ./data/enwiki_20180420_300d.pkl.bz2...
Traceback (most recent call last):
File "owe/run_open_world.py", line 164, in
main()
File "owe/run_open_world.py", line 99, in main
word_vectors = data.load_embedding_file(Config.get('PretrainedEmbeddingFile'))
File "/home/OLD/rhl/OWE/owe/data.py", line 67, in load_embedding_file
return KeyedVectors.load_word2vec_format(embedding_file, binary=not embedding_file.endswith(".txt"))
File "/home/OLD/rhl/venv/lib/python3.8/site-packages/gensim/models/keyedvectors.py", line 1547, in load_word2vec_format
return _load_word2vec_format(
File "/home/OLD/rhl/venv/lib/python3.8/site-packages/gensim/models/utils_any2vec.py", line 276, in _load_word2vec_format
header = utils.to_unicode(fin.readline(), encoding=encoding)
File "/home/OLD/rhl/venv/lib/python3.8/site-packages/gensim/utils.py", line 368, in any2unicode
return unicode(text, encoding, errors=errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
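The traceback shows gensim's `load_word2vec_format` trying to decode the file's header line as UTF-8 text. The Wikipedia2Vec `.pkl`/`.pkl.bz2` dumps are pickles, not word2vec-format files, so the first bytes are not valid UTF-8 and decoding fails; a word2vec-format text dump (Wikipedia2Vec also publishes `.txt.bz2` files) seems more likely to be what this loader expects, though that is my assumption. A small guard sketch, with the accepted suffixes also an assumption:

```python
def looks_like_word2vec_file(path):
    """Heuristic check before handing a file to load_word2vec_format.

    gensim's load_word2vec_format expects word2vec text (.txt) or
    binary (.bin) files; a Wikipedia2Vec pickle (.pkl / .pkl.bz2) is a
    different serialization, and feeding it in triggers exactly this
    UnicodeDecodeError on the header line. Accepted suffixes here are
    an assumption, not the repository's rule.
    """
    return path.endswith((".txt", ".bin"))
```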
