pgcool / joint-bootstrapping-machines
Joint Bootstrapping Machines for High Confidence Relation Extraction: NAACL-HLT 2018 Long Paper.
python2 large_scale_evaluation_freebase.py 0.5 ../../data/output/BREE/REL_ACQUIRED_ORG_ORG/relationships_baseline.txt acquired ../../resources/freebase-easy-14-04-14/freebase_facts.txt ../../data ../../data/input/sentences.txt ./index_full
Relationships score threshold : 0.5
System output relationships : 2687
loading the saved Freebase databases.....
Relationship Type: acquired
Arg1 Type: ORG
Arg2 Type: ORG
Calculating set B: intersection between system output and database
<Process(Process-3, started)> In Queue 2187
<Process(Process-3, started)> In Queue 1687
<Process(Process-3, started)> In Queue 1187
<Process(Process-3, started)> In Queue 687
<Process(Process-3, started)> In Queue 187
Time taken: 0.49 seconds
System output : 2687
Found in database : 0
Not found : 2687
Calculating set A: correct facts from system output not in the database (proximity PMI)
Time taken: 1.05 seconds
System output : 2687
Found in database : 0
Correct in corpus : 1146
Not found : 1541
Calculating set C: database facts in the corpus but not extracted by the system
Building G', a superset of G
Loading superset G' superset_ORG_ORG.pkl
Estimating G intersection with D
G': 0
Database: 0
Extra filtering: from the intersection of G' with D, select only those based on keywords
0
0 relationships in the corpus which are in the KB
Extra filtering: from the G' not in D, select only those based on keywords
0 relationships in the corpus not in the KB
Traceback (most recent call last):
  File "large_scale_evaluation_freebase.py", line 1284, in <module>
    main()
  File "large_scale_evaluation_freebase.py", line 1202, in main
    rel_words_bigrams, system_output_dir=system_output_dir)
  File "large_scale_evaluation_freebase.py", line 159, in wrapper
    result = f(*args, **kw)
  File "large_scale_evaluation_freebase.py", line 629, in calculate_c
    assert len(g_minus_d) > 0
AssertionError
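The bare `assert len(g_minus_d) > 0` fails without saying why. In the run above every database-related count is 0 (`Found in database : 0`, `G': 0`, `Database: 0`), so the empty set may simply reflect that no Freebase facts were matched for this relation. A minimal sketch of a more descriptive guard follows; `require_non_empty` is a hypothetical helper, not part of the repository, and the hint text is only an assumption about what to check.

```python
def require_non_empty(items, name, hint):
    """Raise a descriptive error instead of a bare AssertionError.

    Hypothetical helper: `name` labels the collection and `hint`
    suggests what to check when the collection is empty.
    """
    if len(items) == 0:
        raise ValueError("%s is empty; %s" % (name, hint))
    return items


# Example: an empty set triggers the descriptive error.
try:
    require_non_empty(set(), "g_minus_d",
                      "check that the Freebase facts file was loaded")
except ValueError as err:
    message = str(err)
    print(message)
```

With a guard like this, the failure message names the empty collection and points at a likely cause instead of ending in an unexplained `AssertionError`.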
/Joint-Bootstrapping-Machines/code/automatic_evaluation$ python2 index_whoosh.py ../../data/input/sentences.txt index_dir
Traceback (most recent call last):
  File "index_whoosh.py", line 96, in <module>
    main()
  File "index_whoosh.py", line 91, in main
    index_sentences(writer)
  File "index_whoosh.py", line 32, in wrapper
    result = f(*args, **kw)
  File "index_whoosh.py", line 60, in index_sentences
    s = Sentence(l, '', '', max_tokens, min_tokens, context_window)
  File "/home/oak/BASIC/bootstrapping/Joint-Bootstrapping-Machines/code/automatic_evaluation/Sentence.py", line 144, in __init__
    text_tokens = word_tokenize(sentence_no_tags.decode('utf-8'))
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe6' in position 107: ordinal not in range(128)
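A `UnicodeEncodeError` raised from inside `.decode('utf-8')` is the usual Python 2 symptom of decoding a value that is already a `unicode` object: Python 2 first encodes it implicitly to ASCII, which fails on `u'\xe6'`. The sketch below shows one way to avoid that by decoding only byte strings. `to_text` is a hypothetical helper, and the sketch assumes `sentence_no_tags` is sometimes already unicode, which the traceback suggests but does not prove.

```python
def to_text(value, encoding="utf-8"):
    """Return text, decoding only when `value` is a byte string.

    Hypothetical helper. `bytes` is an alias of `str` on Python 2,
    so the same check works on Python 2 and Python 3.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Bytes are decoded; text that is already unicode passes through unchanged.
from_bytes = to_text(b"\xc3\xa6sthetic")
from_text = to_text(u"\xe6sthetic")
print(from_bytes == from_text)
```

In `Sentence.py`, the call would then read `word_tokenize(to_text(sentence_no_tags))` under the same assumption.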
Traceback (most recent call last):
  File "large_scale_evaluation_freebase.py", line 1284, in <module>
    main()
  File "large_scale_evaluation_freebase.py", line 1071, in main
    threshold = float(sys.argv[1])
ValueError: could not convert string to float: threshold
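The `ValueError` above comes from `float(sys.argv[1])` receiving the literal word `threshold` where a number such as `0.5` is expected as the first argument. A minimal sketch of a friendlier check is shown here; `parse_threshold` is a hypothetical helper and is not part of the script.

```python
def parse_threshold(arg):
    """Convert the first command-line argument to a float.

    Hypothetical helper: exits with a readable message instead of a
    raw ValueError traceback when the argument is not numeric.
    """
    try:
        return float(arg)
    except ValueError:
        raise SystemExit(
            "threshold must be a number such as 0.5, got %r" % (arg,))


# A numeric string is accepted; a placeholder word is rejected.
accepted = parse_threshold("0.5")
try:
    parse_threshold("threshold")
    rejected = False
except SystemExit:
    rejected = True
print(accepted, rejected)
```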
ub16hp@UB16HP:~/ub16_prj/Joint-Bootstrapping-Machines/code/automatic_evaluation$ python large_scale_evaluation_freebase.py 0.5 ../../data/output/BREE/REL_ACQUIRED_ORG_ORG/relationships_baseline.txt acquired ../../resources/freebase-easy-14-04-14/freebase_facts.txt ../../data ../../data/input/sentences.txt ./index_dir
Relationships score threshold : 0.5
System output relationships : 2687
loading the saved Freebase databases.....
Relationship Type: acquired
Arg1 Type: ORG
Arg2 Type: ORG
Calculating set B: intersection between system output and database
<Process(Process-3, started)> In Queue 2187
<Process(Process-3, started)> In Queue 1687
<Process(Process-3, started)> In Queue 1187
<Process(Process-3, started)> In Queue 687
<Process(Process-3, started)> In Queue 187
Time taken: 0.43 seconds
System output : 2687
Found in database : 0
Not found : 2687
Calculating set A: correct facts from system output not in the database (proximity PMI)
Process Process-5:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "large_scale_evaluation_freebase.py", line 911, in proximity_pmi_a
    idx = open_dir(index)
  File "/usr/local/lib/python2.7/dist-packages/whoosh/index.py", line 123, in open_dir
    return FileIndex(storage, schema=schema, indexname=indexname)
  File "/usr/local/lib/python2.7/dist-packages/whoosh/index.py", line 421, in __init__
    TOC.read(self.storage, self.indexname, schema=self._schema)
  File "/usr/local/lib/python2.7/dist-packages/whoosh/index.py", line 616, in read
    gen = cls._latest_generation(storage, indexname)
  File "/usr/local/lib/python2.7/dist-packages/whoosh/index.py", line 593, in _latest_generation
    for filename in storage:
  File "/usr/local/lib/python2.7/dist-packages/whoosh/filedb/filestore.py", line 81, in __iter__
    return iter(self.list())
  File "/usr/local/lib/python2.7/dist-packages/whoosh/filedb/filestore.py", line 525, in list
    files = os.listdir(self.folder)
OSError: [Errno 2] No such file or directory: './index_full'
ub16hp@UB16HP:~/ub16_prj/Joint-Bootstrapping-Machines/code/automatic_evaluation$ mkdir index_full
ub16hp@UB16HP:~/ub16_prj/Joint-Bootstrapping-Machines/code/automatic_evaluation$ python large_scale_evaluation_freebase.py 0.5 ../../data/output/BREE/REL_ACQUIRED_ORG_ORG/relationships_baseline.txt acquired ../../resources/freebase-easy-14-04-14/freebase_facts.txt ../../data ../../data/input/sentences.txt ./index_dir
Relationships score threshold : 0.5
System output relationships : 2687
loading the saved Freebase databases.....
Relationship Type: acquired
Arg1 Type: ORG
Arg2 Type: ORG
Calculating set B: intersection between system output and database
<Process(Process-3, started)> In Queue 2187
<Process(Process-3, started)> In Queue 1687
<Process(Process-3, started)> In Queue 1187
<Process(Process-3, started)> In Queue 687
<Process(Process-3, started)> In Queue 187
Time taken: 0.43 seconds
System output : 2687
Found in database : 0
Not found : 2687
Calculating set A: correct facts from system output not in the database (proximity PMI)
Process Process-5:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "large_scale_evaluation_freebase.py", line 911, in proximity_pmi_a
    idx = open_dir(index)
  File "/usr/local/lib/python2.7/dist-packages/whoosh/index.py", line 123, in open_dir
    return FileIndex(storage, schema=schema, indexname=indexname)
  File "/usr/local/lib/python2.7/dist-packages/whoosh/index.py", line 421, in __init__
    TOC.read(self.storage, self.indexname, schema=self._schema)
  File "/usr/local/lib/python2.7/dist-packages/whoosh/index.py", line 619, in read
    % (indexname, storage))
EmptyIndexError: Index 'MAIN' does not exist in FileStorage('./index_full')
Time taken: 0.10 seconds
System output : 2687
Found in database : 0
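Both failures above occur when Whoosh opens `./index_full`, even though the command passes `./index_dir` as its last argument: first the directory is missing (`OSError`), then, after `mkdir index_full`, it exists but holds no index (`EmptyIndexError`). The sketch below uses only the standard library to tell these cases apart before an index is opened. `describe_index_dir` is a hypothetical helper, and a "populated" result only means the directory has files, not that they form a valid index. Whoosh also provides `whoosh.index.exists_in` for a library-level check.

```python
import os
import tempfile


def describe_index_dir(path):
    """Classify an index directory as 'missing', 'empty', or 'populated'.

    Hypothetical pre-flight check: it does not validate the index itself.
    """
    if not os.path.isdir(path):
        return "missing"
    if not os.listdir(path):
        return "empty"
    return "populated"


# Reproduce the two states from the logs in a scratch directory.
scratch = tempfile.mkdtemp()
index_path = os.path.join(scratch, "index_full")
before_mkdir = describe_index_dir(index_path)  # directory not created yet
os.mkdir(index_path)
after_mkdir = describe_index_dir(index_path)   # created but never indexed
print(before_mkdir, after_mkdir)
```

An "empty" result corresponds to the second log: creating the directory by hand does not build the index, so the indexing step (`index_whoosh.py` in the earlier log) presumably still has to be run against that directory.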