athalhammer / danker
Compute PageRank on >3 billion Wikipedia links on off-the-shelf hardware.
License: GNU General Public License v3.0
I ran an experiment comparing 2023-11-02.allwiki.links.rank (computed with 40 iterations) against 2023-10-05.allwiki.links.rank (computed with 20 iterations) using Spearman's rank correlation coefficient. Even though the data comes from two different months, the coefficient is 0.9972269.
This gives us enough reason to lower the default number of iterations from 40 to 20.
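A comparison like this could be reproduced with a short script. The following is a minimal sketch, not the script used for the experiment: it assumes each rank file holds one tab-separated `identifier<TAB>score` pair per line (an assumption about the file layout), and the helper names `read_scores`, `average_ranks`, and `spearman` are hypothetical. Only identifiers present in both files are compared.

```python
import math


def read_scores(path):
    """Read 'identifier<TAB>score' lines into a dict (assumed file layout)."""
    scores = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            key, value = line.split("\t")[:2]
            scores[key] = float(value)
    return scores


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over all values tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman(scores_a, scores_b):
    """Spearman coefficient over the identifiers present in both inputs."""
    shared = sorted(set(scores_a) & set(scores_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared identifiers")
    ranks_a = average_ranks([scores_a[key] for key in shared])
    ranks_b = average_ranks([scores_b[key] for key in shared])
    mean_a = sum(ranks_a) / len(shared)
    mean_b = sum(ranks_b) / len(shared)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(ranks_a, ranks_b))
    variance_a = sum((x - mean_a) ** 2 for x in ranks_a)
    variance_b = sum((y - mean_b) ** 2 for y in ranks_b)
    if variance_a == 0 or variance_b == 0:
        raise ValueError("coefficient is undefined when all values are tied")
    # Pearson correlation of the ranks equals the Spearman coefficient.
    return covariance / math.sqrt(variance_a * variance_b)
```

With the two files from the experiment, the call would look like `spearman(read_scores("2023-11-02.allwiki.links.rank"), read_scores("2023-10-05.allwiki.links.rank"))`, provided both files fit in memory.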
Hello, is it possible to add a web crawler/spider for other websites, so that danker can generate the link graph and compute PageRank for them?
I have never found software that generates a graph from a website.
Is it possible to add Wikivoyage as an option among the supported wikis?
Hello,
Thank you for the nice tool.
I have a question: how can I run danker on a links file that danker has already downloaded and processed?
I ran "./danker.sh ALL --bigmem" and after a few hours it crashed with a memory issue, but the bzipped links file had already been created. How can I reuse this file to compute only PageRank?
Thank you!
Sergei
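For illustration only, here is a generic PageRank sketch that reads an existing edge list and runs power iteration. It is not danker's implementation or command-line interface. It assumes the links file holds one whitespace-separated `source target` pair per line (plain or bzip2-compressed) and that the whole graph fits in memory. The function names, the damping factor of 0.85, and the default of 20 iterations are assumptions chosen for the example.

```python
import bz2
from collections import defaultdict


def read_edges(path):
    """Yield (source, target) pairs from a plain or .bz2 edge list (assumed layout)."""
    opener = bz2.open if path.endswith(".bz2") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            parts = line.split()
            if len(parts) >= 2:
                yield parts[0], parts[1]


def pagerank(edges, damping=0.85, iterations=20):
    """Plain power-iteration PageRank; ranks sum to 1 across all nodes."""
    out_links = defaultdict(list)
    nodes = set()
    for source, target in edges:
        out_links[source].append(target)
        nodes.add(source)
        nodes.add(target)
    count = len(nodes)
    if count == 0:
        return {}
    rank = dict.fromkeys(nodes, 1.0 / count)
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread over all nodes.
        dangling = sum(rank[node] for node in nodes if node not in out_links)
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = dict.fromkeys(nodes, base)
        for source, targets in out_links.items():
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank
```

A call such as `pagerank(read_edges("links.bz2"))` (hypothetical file name) would then skip the download and link-extraction steps entirely. For the full Wikipedia link set, an in-memory approach like this one may hit the same memory limits described above.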
Hello, is it possible to keep the article name in the rank file? That would avoid having to look up the name on Wikidata.
This repo:
Separate repo:
It appears that the manifest is missing at least one file necessary to build
from the sdist for version 0.5.0. You're in good company, about 5% of other
projects updated in the last year are also missing files.
+ /tmp/venv/bin/pip3 wheel --no-binary danker -w /tmp/ext danker==0.5.0
Looking in indexes: http://10.10.0.139:9191/root/pypi/+simple/
Collecting danker==0.5.0
Downloading http://10.10.0.139:9191/root/pypi/%2Bf/157/6b4c94225e428/danker-0.5.0.tar.gz (11 kB)
ERROR: Command errored out with exit status 1:
command: /tmp/venv/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-wheel-pi634gr2/danker/setup.py'"'"'; __file__='"'"'/tmp/pip-wheel-pi634gr2/danker/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-wheel-pi634gr2/danker/pip-egg-info
cwd: /tmp/pip-wheel-pi634gr2/danker/
Complete output (5 lines):
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-wheel-pi634gr2/danker/setup.py", line 3, in <module>
    with open("README_PR.md", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'README_PR.md'
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
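The traceback shows setup.py opening README_PR.md, which is absent from the sdist. A common remedy is to list such files in the package manifest. To confirm whether a given archive contains them, a small check along these lines could help. This is a sketch that assumes a gzip-compressed tar sdist, and the helper name `missing_from_sdist` is hypothetical.

```python
import tarfile


def missing_from_sdist(sdist_path, required_names):
    """Return the required file names that appear nowhere in the sdist archive."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        # Compare base names only, so the top-level "name-version/" folder is ignored.
        present = {name.rsplit("/", 1)[-1] for name in archive.getnames()}
    return [name for name in required_names if name not in present]
```

For example, `missing_from_sdist("danker-0.5.0.tar.gz", ["setup.py", "README_PR.md"])` would list README_PR.md if the archive matches the log above.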
Currently, only the Wikipedia namespace 0 is considered. It could make sense to add further namespaces like Category and File. More information on Wikipedia namespaces: https://en.wikipedia.org/wiki/Wikipedia:Namespace
Hello, I am interested in using danker on my machine (MacBook Pro 2018, 6-core i7, 16 GB RAM) with the provided bash script, but I run into an issue after the following steps:
./danker.sh en
After completing the steps above, I encounter the following error:
Am I required to provide the date argument for danker to work?
Hello, is it possible to add support for wiki sites other than Wikipedia, such as Wikisource, Wikibooks, Wiktionary, etc.?
Thanks for adding this.