Comments (7)
Hi there,
I call the RESTful MediaWiki API at https://en.wikipedia.org/w/api.php with the following options:
- format = xml
- action = query
- list = random
- rnnamespace = 0 (one of the built-in namespace numbers; 0 is the main article namespace. You can check the full list here.)
- rnlimit = 10 (this number is specified by the variable limit in my code.)
So, the URL looks like:
http://en.wikipedia.org/w/api.php?format=xml&action=query&list=random&rnnamespace=0&rnlimit=10
This query returns 10 random Wikipedia article IDs (normal articles only, no category pages or anything else) in XML format.
You can test the API options in the sandbox.
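For reference, the query above could be assembled and its response parsed roughly as follows. This is a minimal Python 3 sketch, not the code from the repository: the helper names build_random_query_url and parse_page_ids are made up for illustration, and SAMPLE_RESPONSE is a hand-written stand-in that assumes the response contains page elements carrying an id attribute.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

API_URL = "https://en.wikipedia.org/w/api.php"


def build_random_query_url(limit=10):
    """Build the list=random query URL with the options listed above."""
    params = [
        ("format", "xml"),
        ("action", "query"),
        ("list", "random"),
        ("rnnamespace", 0),  # 0 = main article namespace
        ("rnlimit", limit),
    ]
    return API_URL + "?" + urlencode(params)


def parse_page_ids(xml_text):
    """Collect the id attribute of every <page> element in the response."""
    root = ET.fromstring(xml_text)
    return [page.get("id") for page in root.iter("page")]


# Hand-written sample (an assumption about the response shape, not real data).
SAMPLE_RESPONSE = (
    '<api><query><random>'
    '<page id="12" ns="0" title="A" />'
    '<page id="34" ns="0" title="B" />'
    '</random></query></api>'
)

if __name__ == "__main__":
    print(build_random_query_url(10))
    print(parse_page_ids(SAMPLE_RESPONSE))
```

The URL is only built here, not fetched, so the sketch runs without network access; downloading it is left to whatever HTTP client you already use.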
Did I answer your question?
from cnn-re-tf.
Ah, this is a known problem. I'll fix it in the future, but for now, could you please try this? Sorry for the inconvenience.
If you still get any error, please let me know :)
from cnn-re-tf.
Yes, thank you very much! You answered my question perfectly. But I have another question: I cannot run the program successfully. It fails and shows the following:
Traceback (most recent call last):
File "/home/wuyujuan/Desktop/cnn-re-tf-master1/distant_supervision.py", line 693, in <module>
main()
File "/home/wuyujuan/Desktop/cnn-re-tf-master1/distant_supervision.py", line 682, in main
extract_positive()
File "/home/wuyujuan/Desktop/cnn-re-tf-master1/distant_supervision.py", line 570, in extract_positive
entities = util.load_from_dump(os.path.join(data_dir, "entities.cPickle"))
File "/home/wuyujuan/Desktop/cnn-re-tf-master1/util.py", line 285, in load_from_dump
with open(filename, 'rb') as infile:
IOError: [Errno 2] No such file or directory: '/home/wuyujuan/Desktop/cnn-re-tf-master1/data/entities.cPickle'
Can you help me solve this problem? Thank you very much!
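The traceback shows the script opening data/entities.cPickle before that file exists. One way to get a clearer message is to check for the file before loading it. The following is only a sketch in Python 3 under that assumption; load_from_dump_checked is a hypothetical variant, not the repository's util.load_from_dump, and it does not create the missing file.

```python
import os
import pickle


def load_from_dump_checked(filename):
    """Load a pickled object, failing early with a descriptive message."""
    if not os.path.isfile(filename):
        # Hypothetical message; how the dump is created depends on the project.
        raise IOError(
            "Dump file not found: %s. It has to be created before "
            "this step can run." % filename
        )
    with open(filename, "rb") as infile:
        return pickle.load(infile)
```

This only changes how the failure is reported. The file itself still has to be produced first.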
from cnn-re-tf.
Dear,
I'm sorry, but I have tried this before and it doesn't work.
Here are my results:
/home/wuyujuan/Envs/cnn_re/bin/python /home/wuyujuan/Desktop/cnn-re-tf-another_try/distant_supervision.py
===== step 1 =====
[1/4] Downloading wiki articles ...
Traceback (most recent call last):
File "/home/wuyujuan/Desktop/cnn-re-tf-another_try/distant_supervision.py", line 694, in <module>
main()
File "/home/wuyujuan/Desktop/cnn-re-tf-another_try/distant_supervision.py", line 682, in main
positive_examples()
File "/home/wuyujuan/Desktop/cnn-re-tf-another_try/distant_supervision.py", line 453, in positive_examples
ret = loop(step, doc_id, limit, entities, relations, counter)
File "/home/wuyujuan/Desktop/cnn-re-tf-another_try/distant_supervision.py", line 344, in loop
docs = download_wiki_articles(doc_id, limit)
File "/home/wuyujuan/Desktop/cnn-re-tf-another_try/distant_supervision.py", line 73, in download_wiki_articles
pages = bs(r, "html.parser").findAll('page')
File "/home/wuyujuan/Envs/cnn_re/lib/python2.7/site-packages/bs4/__init__.py", line 192, in __init__
elif len(markup) <= 256 and (
TypeError: object of type 'NoneType' has no len()
Process finished with exit code 1
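The TypeError above means that the value r passed to the parser was None, so the download itself returned nothing. A small guard before parsing makes that visible. This is a sketch only: require_body is a hypothetical helper, not part of the repository, and the hint in the error message is a guess at likely causes.

```python
def require_body(body, url):
    """Return the downloaded body unchanged, or raise if nothing came back."""
    if body is None:
        raise RuntimeError(
            "No response received from %s. Check the network connection "
            "or proxy settings before parsing." % url
        )
    return body
```

With such a guard, the parsing line would fail with a message naming the URL rather than with a TypeError raised inside bs4.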
Thank you for your time, may!
from cnn-re-tf.
dear,
I got the following after several days' work:
[1/4] Downloading wiki articles ...
['16648-.txt' '17514-.txt' '24591-.txt' '28182-.txt' '28356-.txt'
'28852-.txt' '36847-.txt' '36896-.txt' '7252-.txt']
[2/4] Performing named entity recognition ...
Invoked on Fri Jul 21 21:28:36 CST 2017 with arguments: -loadClassifier C:/Users/Jason Lee/Desktop/cnn-re-tf-master/stanford-ner-jason\classifiers\english.all.3class.distsim.crf.ser.gz -outputFormat tabbedEntities -textFile C:/Users/Jason Lee/Desktop/cnn-re-tf-master\data\orig\7252-.txt
loadClassifier=C:/Users/Jason Lee/Desktop/cnn-re-tf-master/stanford-ner-jason\classifiers\english.all.3class.distsim.crf.ser.gz
textFile=C:/Users/Jason Lee/Desktop/cnn-re-tf-master\data\orig\7252-.txt
outputFormat=tabbedEntities
Loading classifier from C:/Users/Jason Lee/Desktop/cnn-re-tf-master/stanford-ner-jason\classifiers\english.all.3class.distsim.crf.ser.gz ... done [3.9 sec].
CRFClassifier tagged 4376 words in 177 documents at 4635.59 words per second.
[3/4] Linking entities ...
[4/4] Linking predicates ...
Traceback (most recent call last):
File "C:/Users/Jason Lee/Desktop/cnn-re-tf-master/distant_supervision.py", line 718, in <module>
main()
File "C:/Users/Jason Lee/Desktop/cnn-re-tf-master/distant_supervision.py", line 706, in main
positive_examples()
File "C:/Users/Jason Lee/Desktop/cnn-re-tf-master/distant_supervision.py", line 477, in positive_examples
ret = loop(step, doc_id, limit, entities, relations, counter)
File "C:/Users/Jason Lee/Desktop/cnn-re-tf-master/distant_supervision.py", line 406, in loop
relations[(subj, obj)] = search_property(arg1, arg2)
File "C:/Users/Jason Lee/Desktop/cnn-re-tf-master/distant_supervision.py", line 299, in search_property
response = r.json()
AttributeError: 'NoneType' object has no attribute 'json'
It seems that I have a problem in the fourth step. It would be helpful if you could give me some suggestions.
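In the traceback, r is None when r.json() is called, so the request in search_property did not return a response. If the cause is a transient network failure, retrying with a growing delay may help. The sketch below is in Python 3 and is not taken from the repository: call_with_retry and its parameters are hypothetical, and fetch stands for any zero-argument function that performs the request and returns None on failure.

```python
import time


def call_with_retry(fetch, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-None value, at most `retries` times.

    The wait doubles after each failed attempt (delay, 2*delay, 4*delay, ...).
    """
    last_error = None
    for attempt in range(retries):
        try:
            result = fetch()
        except Exception as exc:  # treat an exception like a failed attempt
            last_error = exc
            result = None
        if result is not None:
            return result
        if attempt < retries - 1:
            sleep(delay * (2 ** attempt))
    raise RuntimeError("request failed after %d attempts" % retries) from last_error
```

The sleep argument exists only so the waiting can be replaced in tests. In normal use it can be left at its default.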
Thank you for your time!
from cnn-re-tf.
Dear,
Everything is OK now, thank you for your attention! I have got my own articles.
Best wishes!
from cnn-re-tf.
Sorry, I had no time to look at your problem. But you found a solution...
from cnn-re-tf.